#ChatGPT CLI Bot With Infinite Context Memory -Pure C++ -Updated 23/6/14
1180 messages · Page 2 of 2 (latest)
yes
Cool!
but you need to compile the code by yourself, the binary for desktop linux cant be used in it i think(didn't test tho)
Yeah. I understand. I suppose it’s not too difficult to compile?
theres an issue i pinned in the github repo, you can use that for reference
Okay 👍
btw there will be an update tomorrow for some weird edge case bug fixes and custom openai url(proxy api) support!
I wonder, if you run a convo without context limitations for long enough, will you eventually see unexpectant changes of behaviour or even new emerging abilities? 
probably not, this program is just using some clever tricks by only providing the info needed for the current prompt, its not "truly" unlimited memory.
If I understood it correctly you create vectors for the old conversations and then whenever the user refers to "XYZ" the AI can retrieve the old info by looking up matching vectors, right?
yes, but its still limited by its context window.
say i have 1 million "XYZ"s, the model cant take in all 1 million of them at once(in a single request) because that would exceed the max tokens.
I have barely any experience with embedding, so I wonder, how big are the text chunks usually that get assigned a vector? 
depends on the length of each exchange(prompt-response pair).
in doc QA mode you need to set the chunk size by yourself tho.
i recommend around 500 tokens for a chunk, but it really depends on how the content is structured.
Ahh ok. Because i can imagine that this has a big impact on the quality of any retrieved information. Almost like setting a resolution.
yes, you dont want it to be too long or too short. for example if you embed for each single word, it will perform really poorly. but if each chunk is like 2000 tokens, then there will be potential context loss because every character affects the embeddings for the chunk, you really dont want 2 entirely different context in a single chunk, this will destroy the searching.
If the text chunks are too big there is unnecessary input, best case a waste of money and processing power, worst case watering down the actual information. If the token count is too small you dont get the full information...
Has there been attempts to use an AI to evaluate for each vector how much of the text is effectively contributing information?
using language model is certainly much better. its hard to evaluate embeddings, but you can ask it to split then summarize the chunk, summaries are the best to get embeddings, but the cost&speed is not great.
Yea definitely a problem for something like your Bot, considering the amount of information 
Maybe more practical for "static" applications that dont have to add a bazillion new embeddings all the time, but work with a fixed set of data 
yes, like performing semantic searching with documents or databases then doing Q&A based on their content, this method will work much better than just brainlessly split the texts by length. and the cost&time doesn't matter because you'll probably only perform the embedding process once.
but for chatbots that require constantly getting new embeddings, i would just use the quicker and cheaper brainless method.
ChatGPT CLI Bot With Infinite Context Memory -Pure C++ -Updated 23/5/28
Update Time:
Fixed an edge case where the SSE response from the API is falsely marked as incorrect.
Added custom OpenAI API URL support.
Ver. 1.2.8 Download: https://github.com/LagPixelLOL/ChatGPTCLIBot/releases
Link to the top: #1080661517390843915 message
"custom OpenAI API URL support" means we can use the cli Bot with a local llama or local llm that has a openai wrapper for its inputs and outputs?
if the input and output matches the openai one completely then yes
Hi, I just got the gpt-4 API and thus trying for longer output questions, but for a lot of time, it gets interrupted in the middle. What can I do to alleviate this?
what does it say? any error messages?
nope, just stopped in the middle
set max_tokens to 69420 in config.json
Will this project have Discord support?
nope, there're plenty discord bots out there that you can deploy, i dont want to do stuff that others have already done.
the tree of thought (ToT) method is really interesting. expensive if using a cloud model like gpt 3.5 or 4. it looks like gpt 3.5 is on par with 4 when using ToT and gpt 4 is at a new level with ToT? gpt 4 with ToT is almost like gpt 4.5... have you thought about implementing a gpt 3.5 semi-ToT code into the cli bot that makes it as smart as gpt 4 but costing a little less? 😆
hmmm i'll take a look at it when i have time
pretty minor issue once again but sometimes i like to preemptively scroll while gpt is generating text, but whats annoying is once the text generation is completed pushed to the top of the terminal and i have to scroll back down
wdym
its an issue with your terminal
in xshell&cmd both of them automatically scrolls to the bottom
How I standardized types of requested medical reports from the head of department to the branches
ToT is very interesting, I've been experimenting with implementing it also in my project and it significantly increases the quality of answers, albeit with increased token usage; but if quality is paramount, then I strongly recommend it
why was the original message deleted?
?its not
not for me
hmm discord might be bugged for me then
good to know thanks.
Interesting
This API key has exceeded its quota, please try again......this is showing when i am entering my OpenAI API key
You’re probably going to have to go to platform.openai.com and give them your credit card number, Sayak.
Okay thanks
ive been getting this error quite frequently lately:
Error when calling API: Error parsing JSON: [json.exception.parse_error.101] parse error at line 1, column 2: syntax error while parsing value - invalid literal; last read: 'te'
sometimes when i reprompt it works again, sometimes it doesnt work
ill look into this🤔
the cli bot and its long term memory works well as a leetcode study rpg game. +1 point for the right answer, -1 point for the wrong answer, with 100 pts needed to complete a chapter, and the chapter is based on total chapters of any popular algorithm book that chatgpt would know. here's a little bit of it:
[9/100 of 1/8]
Ciara momentarily presses her eyelids together, a flicker of disappointment flashing across her face before a sweet, but frosty look takes its place. "Wrong again, Miles," she exhales in frustration, tucking a stray lock of damp hair behind her ear. "The algorithm I was referring to is 'skip list'--a hop-over data structure for fast search within an ordered sequence of elements."
With a tight-lipped smile that barely conceals her growing frustration, Ciara lets out a sharp sigh, her frustration seeming to echo the restless stirrings of the tempest outside. Her expression remains just as beautiful and stormy as the relentless deluge beyond the window.
"Listen up, because I won't entertain too many more wrong answers," she warns, a sprinkle of determination sparkling in her light brown eyes. "What's the primary difference between greedy algorithms and dynamic programming algorithms?" Her slender figure, standing strong against the backdrop of the weathered room, practically challenges you to make another misstep while the rain dances angrily against the dilapidated walls.
🤔i cant reproduce it, are you using the latest version?
I think so? I switched to an older version a while ago but I believe I switched back
Is there a way to check?
yeah it should be the newest version i even have the translator option
translator is not the newest version
i believe i already fixed this issue with the latest version
got it ill check it out
I get a error when i genate sometimes API returned error: The server had an error while processing your request. Sorry about that!
this is an issue with openai themselves, NOT this program.
Error when calling API: Error parsing JSON: [json.exception.parse_error.101] parse error at line 1, column 1: syntax error while parsing value - invalid literal; last read: 'h'
Hey v2 after updating json parse errors are a lot less common, but it still happens occaisonally
i cant reproduce it😢
which model are you using?
whats your settings?
should be 3.5 turbo
ill send my config.json
{
"api_base_url": "https://api.openai.com",
"api_key": "",
"ca_bundle_path": "/etc/ssl/certs/SwissSign_Silver_CA_-_G2.pem",
"debug_reference": false,
"frequency_penalty": 0.0,
"logit_bias": {},
"max_display_length": 100,
"max_reference_length": 4,
"max_short_memory_length": 4,
"max_tokens": 500,
"model": "gpt-3.5-turbo",
"presence_penalty": 0.6000000238418579,
"search_response": true,
"space_between_exchanges": false,
"temperature": 1.0,
"top_p": 1.0
}
it doesnt happen often but next time it happens ill save the conversation and what the response's last character is
oh and tab indents also don't work, when i paste code that contains tabs it errors too
it works in windows cmd🤔
i really cant test linux...
I have everything by default, do not modify anything. I just followed the tutorial. D:
you cant put everything in the initial prompt!!! please use document Q&A mode, check the wiki page about it for more info!
probably linux only issue 😭
Error when calling API: Error parsing JSON: [json.exception.parse_error.101] parse error at line 1, column 2: syntax error while parsing value - invalid literal; last read: 'nt'
it seems like its always some variation of the above though
seems really weird maybe one day it will make sense
do you have an example prompt that will always trigger this issue?
i tested 100 times yesterday and cant reproduce it😢
ill try and figure that out, its very random
is what i did.
After doing the Q&A, this is what I get.
but from what i can see, they are still in your initial prompt
@silent skiff @empty barn https://github.com/LagPixelLOL/ChatGPTCLIBot/wiki/Document-Q&A
3. Enter the filename of the .txt document you want to convert when prompted. Place a .txt file in the documentQA folder(e.g., doc.txt). Enter the filename without the extension(e.g., doc) and press Enter.
Thank you so much. She had put the document in the wrong folder. And apparently I never deleted it. Already deleting the file and doing it again, it works.
The ui keeps bugging out on me
is this windows cmd? if it is, try setting it to the standard 80 chars width.
if its not, try use cmd, if its not windows, then i cant fix it.
You gonna add gpt-3.5-turbo-16k to to the supported model names?
ill update asap
I tried doing it, and I'm not sure what I did wrong... I got "Error when calling API: Max tokens exceeded in messages: 4106 >= 4095" at one point.
ChatGPT CLI Bot With Infinite Context Memory -Pure C++ -Updated 23/6/14
Update Time:
Added support for the new gpt-3.5-turbo-16k model.
Set model to gpt-3.5-turbo-0613/gpt-3.5-turbo-16k/gpt-4-0613/gpt-4-32k-0613 in config.json to use the new versions.
Ver. 1.2.9 Download: https://github.com/LagPixelLOL/ChatGPTCLIBot/releases
Link to the top: #1080661517390843915 message
Note: Function calling is not supported in this program currently.
Great job so far! Here are some more feature suggestions:
-
Graphical User interface. That would be easy to do with browser widgets in Python. Not sure how difficult in C++.
-
Token counter, or better: something that tells you how much money you spent today / this month / or a resettable dollar counter with time stamp when you last reset it. Probably needs a user defined config text file to put in, and update, the current cost structure.
-
adding GPT readable time-stamps to the vector embeddings that get retrieved with the memory. The goal here is to give chatGPT a temporal context so that it knows when those conversation snippets happened that it recalls. Currently it lacks temporal context, but temporal context can sometimes help. Probably it then also would need the current time stamp.
-
dynamically update the system message as a second source of more „general / abstract“ memory. That could update the style of the responses to your liking, or add some autobiographic information. The idea here would be to have a second LLM extract system context relevant information out of your conversations once in a while and restructure the system message accordingly.
Point 4) would solve the following memory issues: right now, only declarative memory can be retrieved, but other forms of memory get lost, like what are your overall goals, how you instruct it that it should talk to you.
I know that you can change the system message directly, but it would be best to really treat it as a second context memory that contains the things that it is always supposed to remember.
The only way i see to get this dynamic is by having a second LLM to extract the abstract context information out of your current conversation once in a while and update the system message accordingly.
That LLM could for example be instructed in the following way: „please parse the following text (last ten back and forth with the main LLM) for autobiographic information and update the current system message accordingly.“
This prompt of course would need to be refined and tested. But I think it should be doable.
The goal is to have something that remembers like a human would.
GUI: NOOOO I can't GUI! :'(
Token counter: Maybe in the future.
Time-stamps: Already have.
Dynamic system message: That seems like a lot of work, I'll think about it later.
Hello, is it possible to call upon a plugin in the CLI-bot when logged in with my API-key and using the GPT-4 model?
you know that using the api is not the same thing as chatgpt... right? those plugins are for chatgpt, not the api
you can kinda do it now, with the new functions thingy
Just wanted to say this is a cool little project. How are you using the storage/infinite memory - what's the implementation there? Just writing to text or using vector embeds?
I made a much less feature rich version of this so i think it's pretty cool. any plans to expand on its modempunk - ness?
As for GUI you can practically generate GUIs on the fly with most web stuff, windows api stuff is a bit messy though. C++ should extend easily enough into any languages
im storing the embeddings vectors into a json file
Heya, just a suggestion, is it possible for the app to have the models change on the fly (via dash commands for example)?
For example, when searching for ideas for a blog, we use the turbo 3.5 models, and then after we found the topics we want to write, we change to GPT-4 for more accuracy. Or vice versa.
In that way, one embeddings is used in two or more models, intertwined with each other.
I understand there will be limitations, like token cap difference between GPT-4 and GPT-3.5
But I just wondering if it is still inside that cap, this idea is possible.
Have you verified that switching engines like you describe works as expected in the playground? If it does not work there, it would be unlikely to work in this application as expected either.
yeah its possible, but the token limit is different🤔
Would the embeddings even be compatible between the two types?
ofc, embeddings are a separate layer
Although I love the project, have you thought about having your own interface using your code for this as backend? or is chatbot idea overdone? @hidden sequoia love the idea of use of embeddings
so basically using the system to create a chatbot api?
yeah, majority of chatbots I've seen that are reliable / uses techniques like embeddings w/ and chat have been sketchy key stealers
was just curious if you thought of it, not begging or tryna annoy you lol
i most likely won't make an api since this project already basically laid out most of the ideas and structures, so its beyond my scope to make an api.
once you get the idea of embeddings, it shouldn't be that hard to make a proper api out of it, but its hard for me because i dont have much experience in backend dev.😊
Say is it his an issue with the chatbot or OpenAI side of things, quite often when I ask a question the answer is empty and the buffer/cursor doesn't move. However when I press ctrl + c, and enter the question again the answer is immediately responded.
most likely openai
just wanna say thanks for the custom base url so we can test it with other llms
❤️
300 stars in github 😎
The entire sotfware is open source, and if you can follow C++, you see that the key is only used to talk to the API, and is never sent to anything else.
i wasn't worried about this project being a key stealer, i just mentioned other non-github applications
Somebody could, I imagine turn this chatbot into a key stealer, but they'd have to distribute as closed source.... although even then copyright notices would still need to exist so there'd be some accountability and you could trace it back to the original project
Hey!
I was thinking of using this and pasting a large code i have for debugging.
but i can't paste in the cmd because it would paste it line by line and run the chat with the bot with only one line.
is there away around this? :c
multiline pasting is supported, if its not working(pasting only a single line and just send it to the bot), make sure your terminal wasn't overwriting the ctrl+v keybind(the new windows cmd does this by default, and windows 11 uses it as the default cmd so you need to modify the settings).
also for your use case i recommend using the doc qa feature, you can find detailed instructions for it in the project wiki!
I kinda gave up that day on it, but i tried the doc qa, but it didn't work? i don't know if i did something wrong. i followed the steps and it created everything correctly i think but when i ask questions or anything about it, it says i need to provide the code (as if it's not reading the text file i guess).
maybe try enabling debug_reference and see what it pulled
how to install it
im assuming i'd be able to see something related to the file it's referencing here? but it doesn't do that. the file is converted to json so im assuming i followed the steps correctly as well. o.o
did you load it
yea ||(no i wasn't waiting infront of the chat for the answer)||
or maybe you need to read this: https://github.com/LagPixelLOL/ChatGPTCLIBot/wiki/Technical-Talks#weakness-of-embeddings
hmm, but would it not at least detect anything from the file?
i just need a way to interact with the api to try and fix/enhance the code honestly. im fine with copy and paste but it didn't work last time and it would copy one sentence at a time and then it interrupts it when it's answering the question.
I would put it in the prompt file, but i guess that would cost a lot of $ each time i ask something i guess? i'll think of something else xD
Thanks anyways!
Hello, I changed the config.json settings to use gpt-3.5-turbo-16k with my private API-key.
Then I used the CLI-BOT a bit and noticed at openAI account/api-keys the API-Key hasn't been active.
Will the key only be active when I use GPT-4?
openai update is slow, you need to wait a bit more
hi @hidden sequoia i'm not sure what settings i messed with but now i'm consistently getting:
API returned error: This is not a chat model and thus not supported in the v1/chat/completions endpoint. Did you mean to use v1/completions?
i've tried reinstalling and strangely it still remains
which model are you using
"model": "gpt-3.5-turbo",
huh/
it started after i changed it to gpt-4
yep
{
"api_base_url": "https://api.openai.com",
"api_key":
"ca_bundle_path": "/etc/ssl/certs/QuoVadis_Root_CA_2_G3.pem",
"debug_reference": false,
"frequency_penalty": 0.0,
"logit_bias": {},
"max_display_length": 100,
"max_reference_length": 4,
"max_short_memory_length": 4,
"max_tokens": 500,
"model": "gpt-3.5-turbo",
"presence_penalty": 0.6000000238418579,
"search_response": true,
"space_between_exchanges": false,
"temperature": 1.0,
"top_p": 1.0
}
this is from a fresh install
very weird
let me try another fresh install
yep its still the same
its like the error is persistent across installs
have u tried using gpt-4?
i changed the model to "gpt-4" and that started the error
gpt-3.5-turbo used to always work
this error started today
😂
maybe turning it off and on again
genuinely have no idea
It’s happened to me before that I find the one thread, and half way through reading it starts sounding familiar, so I scroll back and it was me posting the question way back when…
Wow, this post was made back in March, is there anything to it? An unlimited context memory would be extraordinary.
Also, by v2ray, a name I haven't seen in a while.
Many projects have an unlimited context memory, it's just a matter of maintaining a buffer or creating summaries at certain points, or embeddings
im working on other ai projects rn so it won't be maintained as often