#An OSS clone of perplexity.ai / phind.com with Convex!

lunar spade
#

Hello everyone 👋,

I built this project with Convex as the backend during a hackathon held at AGI House last Saturday! It uses the following stack:

  • Frontend built with Next.js 14.
  • Hosted on Vercel.
  • Backend and vector embedding-powered caching run on Convex.
  • Google Search is powered by serper.dev.

The demo can be found at https://bobtail.dev, and the code is available at https://github.com/wsxiaoys/bobtail.dev.

It should provide a comprehensive end-to-end overview of how an LLM + RAG system can be built using Convex. Enjoy!

crisp stratus
#

Hi @lunar spade, welcome to the community, and thanks for sharing!

fringe ether
#

Wow, this is really impressive. Nice work, @lunar spade. Would you want it considered for a template on convex.dev/templates? We have a few AI RAG chat examples but not a web-powered RAG search engine, and you did a fantastic job with this one. Very clean code, especially given it was a hackathon.

One question - the cron to calculate embeddings for searches - why not do that in rag after doing updateContent? At that point it won't interfere with user-perceived latency, and the embedding would be written right away instead of later on. Btw, getSearchesWithoutQueryEmbeddings uses a filter, which scans rows, so it will fail once the searches table holds more than ~10k documents. You could use an index, but if you just write the embedding immediately, I think this function can go away.
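To illustrate the filter-versus-index point above, here is a minimal in-memory sketch. It is not the Convex API and not code from the repository: `SearchTable`, `scanWithoutEmbedding`, and `indexedWithoutEmbedding` are hypothetical names made up for this example. It only shows why a filter has to examine every row while an index lookup touches just the matching ones.

```typescript
// Hypothetical in-memory model (an assumption for illustration, not Convex code).
type Search = { id: number; query: string; hasEmbedding: boolean };

class SearchTable {
  private rows: Search[] = [];
  // Secondary index: ids of rows that still lack an embedding.
  private missingEmbedding = new Set<number>();

  insert(query: string): number {
    const id = this.rows.length;
    this.rows.push({ id, query, hasEmbedding: false });
    this.missingEmbedding.add(id);
    return id;
  }

  markEmbedded(id: number): void {
    this.rows[id].hasEmbedding = true;
    this.missingEmbedding.delete(id);
  }

  // Filter-style lookup: every row in the table is examined.
  scanWithoutEmbedding(): { found: Search[]; examined: number } {
    let examined = 0;
    const found: Search[] = [];
    for (const row of this.rows) {
      examined++;
      if (!row.hasEmbedding) found.push(row);
    }
    return { found, examined };
  }

  // Index-style lookup: only the matching rows are examined.
  indexedWithoutEmbedding(): { found: Search[]; examined: number } {
    const found = Array.from(this.missingEmbedding).map((id) => this.rows[id]);
    return { found, examined: found.length };
  }
}
```

Both lookups return the same rows; the difference is how many rows are read to get there, which is what matters once a per-query read limit applies.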

lunar spade
#

> Would you want it considered for a template on convex.dev/templates

To make sure it's suitable as a scaffold, it would probably be beneficial to refactor the front end to utilize shadcn/ui for its simplicity and minimalistic aesthetic, along with employing a copy-paste approach for quick modifications.
Although I might not be able to take this on in the near future, I will keep it in mind and treat it as an opportunity to learn more about shadcn/ui + Tailwind.

> One question - the cron to calculate embeddings for searches - why not do that in rag after doing updateContent? At that point it won't interfere with user-perceived latency, and the embedding would be written right away instead of later on. Btw, getSearchesWithoutQueryEmbeddings uses a filter, which scans rows, so it will fail once the searches table holds more than ~10k documents. You could use an index, but if you just write the embedding immediately, I think this function can go away.

It was added as back-filling logic during the hackathon. Right, it would be cleaner to simply schedule the embedding when creating the query.
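A rough sketch of that "schedule the embedding at creation time" idea, as plain TypeScript with no Convex dependency. Everything here is an assumption for illustration: `SearchStore`, `createSearch`, `runScheduledJobs`, and `toyEmbed` are hypothetical names, the job queue stands in for whatever scheduler the platform provides (in Convex that would presumably be `ctx.scheduler.runAfter`), and the toy embedding stands in for a real model call.

```typescript
// Hypothetical sketch (names and the toy embedding are assumptions, not repository code).
type SearchRow = { query: string; embedding?: number[] };

// Stand-in for a real embedding model call: folds character codes into 4 buckets.
function toyEmbed(text: string): number[] {
  const v = [0, 0, 0, 0];
  for (let i = 0; i < text.length; i++) v[i % 4] += text.charCodeAt(i);
  return v;
}

class SearchStore {
  readonly rows: SearchRow[] = [];
  private jobs: Array<() => void> = [];

  // Insert the row and queue its embedding job in the same step,
  // so the caller gets the id back without waiting for the embedding.
  createSearch(query: string): number {
    const id = this.rows.length;
    this.rows.push({ query });
    this.jobs.push(() => {
      this.rows[id].embedding = toyEmbed(query);
    });
    return id;
  }

  // Stand-in for the scheduler: runs all queued jobs, returns how many ran.
  runScheduledJobs(): number {
    const jobs = this.jobs;
    this.jobs = [];
    jobs.forEach((job) => job());
    return jobs.length;
  }

  // What a periodic back-fill pass would otherwise have to look for.
  countWithoutEmbedding(): number {
    return this.rows.filter((r) => r.embedding === undefined).length;
  }
}
```

Because each row carries its own queued job from the moment it is created, there is nothing left for a periodic back-fill to find once the scheduler has run.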