#How to get RAG working?


river sky
#

Hello. I’m trying to get RAG to work. I use llama3 405b as the model, with LibreChat as the inference/chat GUI.

It doesn’t seem that the OpenRouter API supports embeddings (which I think may be required for RAG to work?)

So if anyone has got it working, or knows how to get it to work, please tell me. I want to upload PDFs and have them summarized.
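
For context on why embeddings matter here: in a typical RAG setup, each document chunk and the user's question are turned into vectors, and the chunks closest to the question are pasted into the chat model's prompt. The sketch below illustrates only that retrieval step. The `embed` function is a toy word-count stand-in (an assumption for illustration, not a real embeddings API), and `retrieve` and the sample chunks are hypothetical names and data, not part of LibreChat or OpenRouter.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embeddings endpoint: a word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks, as if extracted from an uploaded PDF.
chunks = [
    "Invoices are due within thirty days.",
    "Embeddings map text to vectors for similarity search.",
    "The warranty covers parts and labour for two years.",
]

top = retrieve("which parts does the warranty cover", chunks)

# Only the retrieved chunks reach the chat model, not the whole PDF.
prompt = "Answer using this context:\n" + "\n".join(top)
```

A real deployment would replace `embed` with calls to an embeddings model, which is why a chat-only API is not enough on its own.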

rocky dragon
#

Can you attach a screenshot of the RAG UI? Can you use openai for embeddings?

river sky
#

RAG UI? I just try to upload a PDF and it fails. I have added the RAG configuration to my .env file in LibreChat, set to use the OpenRouter API and API key.

#

It seems there is an “integrations” tab under settings on the OpenRouter page which might help me get a non-OR API key. Is that something you can activate for me?

rocky dragon
#

Integrations are primarily for adding other LLM API keys to your OpenRouter account

#

so they would not help you get a non-OR api key. what's the exact error message? screenshots help a lot

vital socket
#

He could use their OpenAI key for the RAG embeddings; it is supported.

proven geode
#

(you need to use different models for image similarity/embeddings)

vital socket
#

I'll read it. By the way I don't know if the current RAG model is here to stay. It's kind of inefficient and frustrating for a lot of requests.

https://x.com/emollick/status/1818416020161544368?t=P4Fjq5ZDN4liihNgYpBQWg&s=19

A common cause of error when people use LLMs for serious work is that they don’t know what is in the context window. They assume when they upload a PDF the AI can read it. Instead, either it fails to parse or is only partially read, causing hallucinations

The UX should be better

vital socket
#

Yes. If I were to deploy RAG for clients, I would think hard about how to build some kind of long-term, resilient architecture, and about how not to lose their data if I decide to change some part of the RAG architecture.
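
One way to read the point about not losing data is to keep the source chunks separate from the vectors, and to tag each vector with the model that produced it, so that swapping the embedding model only means recomputing vectors. The following is a minimal sketch under that assumption, using SQLite from the Python standard library. The table layout, the function names (`open_store`, `add_chunk`, `reembed`), and the model labels are all hypothetical and not taken from any particular RAG product.

```python
import json
import sqlite3
from typing import Callable


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the two tables: source text, and vectors keyed by model."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS chunks "
        "(id INTEGER PRIMARY KEY, doc TEXT, text TEXT)"
    )
    db.execute(
        "CREATE TABLE IF NOT EXISTS vectors "
        "(chunk_id INTEGER, model TEXT, vector TEXT, "
        "PRIMARY KEY (chunk_id, model))"
    )
    return db


def add_chunk(db: sqlite3.Connection, doc: str, text: str) -> int:
    """Store the raw chunk text; this is the data that must never be lost."""
    cur = db.execute("INSERT INTO chunks (doc, text) VALUES (?, ?)", (doc, text))
    db.commit()
    return cur.lastrowid


def reembed(
    db: sqlite3.Connection, model: str, embed_fn: Callable[[str], list[float]]
) -> int:
    """(Re)compute vectors for every chunk under the given model label."""
    rows = db.execute("SELECT id, text FROM chunks").fetchall()
    for chunk_id, text in rows:
        db.execute(
            "INSERT OR REPLACE INTO vectors VALUES (?, ?, ?)",
            (chunk_id, model, json.dumps(embed_fn(text))),
        )
    db.commit()
    return len(rows)


db = open_store()
add_chunk(db, "report.pdf", "First chunk of the report.")
add_chunk(db, "report.pdf", "Second chunk of the report.")

# Placeholder embedding functions standing in for two different models.
reembed(db, "model-a", lambda t: [float(len(t))])
reembed(db, "model-b", lambda t: [float(len(t)), 1.0])

chunk_count = db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
vector_count = db.execute("SELECT COUNT(*) FROM vectors").fetchone()[0]
```

Because the raw text stays in its own table, changing the embedding model, the chunking of future uploads, or the vector store leaves the clients' documents intact.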