Hey guys, I built RAGus - a tool that solves the "garbage in, garbage out" problem for RAG.
The problem: Most RAG implementations fail not because of the LLM, but because of bad chunking and poor retrieval. You scrape a website, split the text every 500 tokens, and suddenly your chunks contain half a phone number or a question separated from its answer.
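To make that failure mode concrete, here's a toy sketch (not RAGus code). It uses whitespace-separated words as a stand-in for tokens, and the FAQ text, `fixed_size_chunks`, and `paragraph_chunks` are made up for illustration. Splitting on blank lines is only a simple structural baseline here, not the LLM-based chunking described below.

```python
import re

# Made-up FAQ snippet; the phone number is fictional.
FAQ = (
    "Q: What are your opening hours?\n"
    "A: We are open 9am to 5pm, Monday to Friday.\n"
    "\n"
    "Q: How can I reach support?\n"
    "A: Call us on +1 555 010 0199 or use the contact form."
)

def fixed_size_chunks(text, size):
    """Split every `size` words, ignoring the structure of the text."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def paragraph_chunks(text):
    """Split on blank lines so each Q/A pair stays in one chunk."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

if __name__ == "__main__":
    for chunk in fixed_size_chunks(FAQ, 14):
        print("FIXED:", repr(chunk))
    for chunk in paragraph_chunks(FAQ):
        print("PARA: ", repr(chunk))
```

With a window of 14 words, the fixed-size splitter cuts the phone number across two chunks and strands the second question next to the tail of the first answer. The paragraph splitter keeps each question with its answer.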
What RAGus does:
🌐 Sitemap Scraping - Scrapes entire websites via sitemap, cleans out nav/footer junk
🧠 Agentic Chunking - Uses an LLM to decide topic boundaries (not arbitrary token limits)
✨ Metadata Enrichment - Generates a summary and likely user questions for each chunk (to address vocabulary mismatch between user queries and source text)
📊 AI Analytics - Analyzes chatbot conversations: sentiment, drop-offs, knowledge gaps, frustration detection
⏰ Auto-Sync - Scheduler keeps KB fresh when source content changes
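For anyone curious what the sitemap and sync steps can look like in general, here's a minimal sketch using only the Python standard library. It is an illustration under my own assumptions, not how RAGus is implemented: `parse_sitemap`, `pages_to_refresh`, the sample sitemap, and the idea of comparing `<lastmod>` values against a stored record are all hypothetical. The XML namespace is the standard one from the Sitemaps protocol.

```python
import xml.etree.ElementTree as ET

# Standard namespace defined by the Sitemaps protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap for illustration; a real one would be fetched over HTTP.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-10</lastmod></url>
  <url><loc>https://example.com/pricing</loc><lastmod>2024-03-02</lastmod></url>
  <url><loc>https://example.com/faq</loc></url>
</urlset>"""

def parse_sitemap(xml_text):
    """Return {url: lastmod or None} for every <url> entry in the sitemap."""
    root = ET.fromstring(xml_text)
    entries = {}
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:
            entries[loc] = lastmod
    return entries

def pages_to_refresh(entries, synced):
    """URLs that are new, have no lastmod, or whose lastmod changed since the last sync."""
    return sorted(
        url for url, lastmod in entries.items()
        if lastmod is None or synced.get(url) != lastmod
    )

if __name__ == "__main__":
    # Hypothetical record of what was indexed on the previous run.
    synced = {
        "https://example.com/": "2024-01-10",
        "https://example.com/pricing": "2024-02-01",
    }
    print(pages_to_refresh(parse_sitemap(SAMPLE_SITEMAP), synced))
```

Pages without a `<lastmod>` are always re-fetched in this sketch, since there is nothing to compare against. A content hash of the cleaned page would be a sturdier change signal, because many sites do not maintain `<lastmod>` reliably.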
Integrations:
🟢 OpenAI Vector Stores (via Assistants API)
🔷 Voiceflow KB
🟣 Qdrant
Built with FastAPI + React. We've seen retrieval accuracy go from ~40% to 90%+ just by adding the metadata layer.
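Here's a toy sketch of why a metadata layer can help with vocabulary mismatch. It is not RAGus code and says nothing about the numbers above: it scores chunks by plain word overlap instead of embeddings, and the chunks, summaries, questions, and function names are all invented for illustration.

```python
import re

def tokens(text):
    """Lowercased set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

# Made-up chunks with hypothetical LLM-generated metadata attached.
CHUNKS = [
    {
        "text": "Refunds are issued to the original payment method within 5 business days.",
        "summary": "Refund policy and timing.",
        "questions": ["How do I get my money back?", "How long does a refund take?"],
    },
    {
        "text": "You can get in touch with our team by email, and we reply within a day.",
        "summary": "How to contact the support team.",
        "questions": ["How do I contact support?"],
    },
]

def score(query, chunk, use_metadata):
    """Count query words that appear in the indexed text of the chunk."""
    indexed = chunk["text"]
    if use_metadata:
        indexed += " " + chunk["summary"] + " " + " ".join(chunk["questions"])
    return len(tokens(query) & tokens(indexed))

def best_chunk(query, chunks, use_metadata):
    """Index of the highest-scoring chunk."""
    return max(range(len(chunks)), key=lambda i: score(query, chunks[i], use_metadata))

if __name__ == "__main__":
    query = "how can I get my money back"
    print("raw text only:", best_chunk(query, CHUNKS, use_metadata=False))
    print("with metadata:", best_chunk(query, CHUNKS, use_metadata=True))
```

The query never uses the word "refund", so matching on raw text alone picks the contact chunk. Once the pre-generated questions are indexed alongside the text, the refund chunk wins because the user's phrasing now overlaps with it.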
Happy to give access if anyone wants to try it. Also open to adding more integrations (Pinecone, Weaviate, etc.).
https://ragus.ai