An OSS clone of perplexity.ai / phind.com with convex! | Convex Community | Page 1

lunar spade Jan 30, 2024, 1:13 AM

Hello everyone 👋,

I built this project with Convex as the backend during a hackathon held at AGI House last Saturday! It uses the following stack:

Built with Next.js 14.
Hosted on Vercel.
The backend and the vector-embedding-powered cache run on Convex.
Google Search is powered by serper.dev.

The demo can be found at https://bobtail.dev, and the code is available at https://github.com/wsxiaoys/bobtail.dev.

It should provide a comprehensive end-to-end overview of how an LLM + RAG system can be built using Convex. Enjoy!

crisp stratus Jan 30, 2024, 1:24 AM

Hi @lunar spade, welcome to the community, and thanks for sharing!

fringe ether Jan 30, 2024, 8:35 AM

Wow, this is really impressive. Nice work, @lunar spade. Would you want it considered for a template on convex.dev/templates? We have a few AI RAG chat examples but not a web-powered RAG search engine, and you did a fantastic job with this one. Very clean code, especially given it was a hackathon.

One question: the cron that calculates embeddings for searches. Why not do that in rag after doing updateContent? At that point it won't interfere with user-perceived latency, and the embedding would be written right away instead of later on. Btw, getSearchesWithoutQueryEmbeddings uses a filter, which scans rows, so it will fail once the searches table holds ~10k documents. You could use an index, but if you just write the embedding immediately, I think this function can go away.
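
To illustrate the filter-versus-index point, here is a minimal in-memory sketch. It is not Convex code and not taken from the bobtail.dev repo: `SearchRow`, `SearchTable`, `LookupResult`, and the method names are all hypothetical, and the "index" is just a list of ids kept up to date on every write. It only shows why a filter has to read every row while an index reads just the matches.

```typescript
// Toy model only: these names are assumptions, not the Convex API.
type SearchRow = { id: number; query: string; embedding?: number[] };
type LookupResult = { rows: SearchRow[]; rowsRead: number };

class SearchTable {
  private rows: SearchRow[] = [];
  // Secondary index: ids of rows that still lack an embedding.
  private pendingIds: number[] = [];

  insert(query: string): number {
    const id = this.rows.length; // ids double as array positions in this toy model
    this.rows.push({ id, query });
    this.pendingIds.push(id);
    return id;
  }

  setEmbedding(id: number, embedding: number[]): void {
    const row = this.rows[id];
    if (row === undefined) throw new Error("no search with id " + id);
    row.embedding = embedding;
    // Keep the index in sync with the row.
    this.pendingIds = this.pendingIds.filter((pending) => pending !== id);
  }

  // Filter-style lookup: reads every row, so the cost grows with table size.
  scanWithoutEmbedding(): LookupResult {
    let rowsRead = 0;
    const rows = this.rows.filter((row) => {
      rowsRead++;
      return row.embedding === undefined;
    });
    return { rows, rowsRead };
  }

  // Index-style lookup: reads only the rows that match.
  indexedWithoutEmbedding(): LookupResult {
    const rows = this.pendingIds.map((id) => this.rows[id]);
    return { rows, rowsRead: rows.length };
  }
}
```

With a large table where only a handful of rows are missing embeddings, `scanWithoutEmbedding` reports a `rowsRead` equal to the table size, while `indexedWithoutEmbedding` reports only the handful. A per-query read limit would be hit by the first long before the second.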

lunar spade Jan 30, 2024, 10:03 AM

> Would you want it considered for a template on convex.dev/templates

To make sure it's suitable as a scaffold, it would probably be beneficial to refactor the front end to utilize shadcn/ui for its simplicity and minimalistic aesthetic, along with employing a copy-paste approach for quick modifications.
Although I might not be able to take this on in the near future, I will keep it in mind and treat it as an opportunity to learn more about shadcn/ui + tailwind.

> One question: the cron that calculates embeddings for searches. Why not do that in rag after doing updateContent? At that point it won't interfere with user-perceived latency, and the embedding would be written right away instead of later on. Btw, getSearchesWithoutQueryEmbeddings uses a filter, which scans rows, so it will fail once the searches table holds ~10k documents. You could use an index, but if you just write the embedding immediately, I think this function can go away.

It was added as back-filling logic during the hackathon. Right, it would be cleaner to simply schedule the embedding when creating the query.
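
A rough sketch of that "schedule it at creation time" idea, again as a self-contained toy rather than the actual bobtail.dev or Convex code. `JobQueue`, `fakeEmbed`, `createSearch`, and the `searches` array are hypothetical stand-ins: the queue plays the role of a backend scheduler, and `fakeEmbed` replaces a real embedding model call.

```typescript
// Toy model only: these names are assumptions, not the Convex API.
type SearchRow = { id: number; query: string; embedding?: number[] };
type Job = () => void;

class JobQueue {
  private queue: Job[] = [];

  schedule(job: Job): void {
    this.queue.push(job);
  }

  // Runs everything queued so far and returns how many jobs ran.
  // Stands in for the backend running jobs after the request has returned.
  runPending(): number {
    const jobs = this.queue;
    this.queue = [];
    jobs.forEach((job) => job());
    return jobs.length;
  }
}

// Placeholder for a real embedding call; the output is not meaningful.
function fakeEmbed(text: string): number[] {
  const vector = [0, 0, 0, 0];
  for (let i = 0; i < text.length; i++) {
    vector[i % vector.length] += text.charCodeAt(i);
  }
  return vector;
}

const searches: SearchRow[] = [];
const scheduler = new JobQueue();

function createSearch(query: string): number {
  const id = searches.length;
  searches.push({ id, query });
  // Queue the embedding in the same step that creates the row.
  // The caller gets the id back without waiting for the embedding.
  scheduler.schedule(() => {
    searches[id].embedding = fakeEmbed(query);
  });
  return id;
}
```

Because every row gets its embedding job queued at creation, there is never a backlog to discover, so no cron and no "find rows without embeddings" query is needed in this model.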
