#What is RAG
1 messages · Page 1 of 1 (latest)
Hi @stoic shuttle. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
RAG stands for Retrieval Augmented Generation. It's a technique used in AI models to examine the latest user input and the context of the conversation, and then use embeddings or other search techniques to fill the AI model context with specialized knowledge relevant to the topic. This is usually used for making an AI that can answer about closed-domain problems, such as a company knowledgebase.
The phrase Retrieval Augmented Generation (RAG) comes from a recent paper (https://arxiv.org/abs/2005.11401) by Lewis et al. from Facebook AI. The idea is to use a pre-trained language model (LM) to generate text, but to use a separate retrieval system to find relevant documents to condition the LM on.
In practice, RAG can be combined with tools like Elasticsearch or Qdrant to enhance the performance of AI models. It can also be fine-tuned for specific use-cases, and can be used in conjunction with Few-Shot Learning to boost the model's performance and reduce hallucinations.
Sources:
- Forum Post: Difference between langchain and rag > Official Answer (https://community.openai.com/t/528625)
- Examples > Fine-tuned_qa > Fine-Tuning for Retrieval Augmented Generation (RAG) with Qdrant (https://cookbook.openai.com/examples/fine-tuned_qa/ft_retrieval_augmented_generation_qdrant)
- Docs > Guides > Prompt engineering > Use external tools (https://platform.openai.com/docs/guides/prompt-engineering)
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@stoic shuttle kapa.ai is still learning and improving, please let us know how it did by reacting below
@raw compass What context window does the GPT-4 that is accessible at chat.openai.com use?
Let me check the knowledge base for you. One moment! :books:
The GPT-4 model that is accessible at chat.openai.com is used by ChatGPT Plus subscribers. This model is OpenAI's latest language model and has enhanced capabilities in advanced reasoning, complex instructions, and creativity. The usage of GPT-4 is dynamically adjusted based on demand to give every Plus subscriber a chance to try the model.
Source: Help Article > En > Articles > Announcing GPT-4 in ChatGPT (March 14, 2023) (https://help.openai.com/en/articles/6825453-chatgpt-release-notes#h_0eb2784e11)
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@cloud coral kapa.ai is still learning and improving, please let us know how it did by reacting below
@raw compass What context window does the GPT-4 that is accessible at chat.openai.com use?
Let me check the knowledge base for you. One moment! :books:
The GPT-4 models that are accessible via the OpenAI API, including the ones used at chat.openai.com, have a context window of 128,000 tokens. This means that the model can consider up to 128,000 tokens of conversation history when generating a response. Once the size of the messages in a conversation thread exceeds this context window, the oldest messages are dropped to fit within the limit.
However, it's important to note that the actual number of tokens you can use in a single API call may be limited by your usage tier. For example, a Usage Tier 3 API account is limited to 40,000 tokens per minute. Once you hit Usage Tier 4 by spending a lifetime total of $250 with OpenAI, your tokens per minute limit will be raised to 150,000 tokens per minute.
Sources:
- Docs > Models > GPT-4 and GPT-4 Turbo Part 1 (https://platform.openai.com/docs/models)
- Forum Post: Test new 128k window on gpt-4-1106-preview > Official Answer (https://community.openai.com/t/478424)
- Docs > Assistants > How Assistants work Beta > Managing Threads and Messages > Context window management (https://platform.openai.com/docs/assistants/how-it-works)