Hello I am trying to figure out how to program in limited memory to a LLM model is previous data/token optimized and fed back into the new model how is this done in a technical sense? I am extremely close to making a Virtual model myself the last two challenges are further optimizing my STT and making the memory for the API I have everything else set up even a cute new vtuber model 😄
#How to Make Limited Memory For LLM
1 messages · Page 1 of 1 (latest)
This place is probably the poorest venue to get any answers, a lot of people theory-crafting only and script users, and no other actual devs with experience will answer you. The easiest is to use a library such as langchain/langflow:
https://python.langchain.com/docs/use_cases/chatbots
https://github.com/logspace-ai/langflow
Try joining actual dev groups if you want to learn proper stuff.
extract input embeddings, store into vector db, llm to summarize and append to system prompt
I know I’m not a dev but I saw this video on YouTube and didn’t know if it applies https://youtu.be/QQ2QOPWZKVc?si=68Q5e-0zJDQKkmyn
In this video, we look at MemGPT, a new way to give AI unlimited memory/context windows, breaking the limitation of highly restrictive context sizes. We first review the research paper, then I show you how to install MemGPT, and then we have special guests!
Enjoy :)
Become a Patron 🔥 - https://patreon.com/MatthewBerman
Join the Discord 💬 - ht...