Hey. Please let me know if I understand embeddings correctly. I managed to go through the OpenAI tutorial, crawling my own website, and it seems to work. Now I want to turn that Jupyter notebook into a web app (Flask). Will the entire script have to run on each use? Or do I process the first part only once, so users can just ask questions against the already processed data? If so, with everything living on my server, can I just serve the answer_question function to my app? Any lead on how to go about it would make this total noob immensely grateful.
#Website Q&A with embeddings tutorial question
Stack Overflow
I'm using customized text with 'Prompt' and 'Completion' to train a new model.
Here's the tutorial I used to create a customized model from my data:
beta.openai.com/docs/guides/fine-tuning/advanced-usage
You can use this to start
This is great, very insightful. Do I have to use cosine similarity if I'm using the Pinecone database? I was testing their new beta Node.js release earlier.
Pinecone is indeed more powerful for semantic search.
But I haven’t tried it yet. You can ask Kaveen, he knows that very well
thanks
@uncut minnow hoping for your expertise on this one 🙏
but a database on your local computer is free lol
There's a free tier on Pinecone, but you are limited to 1 index only. I tried it last week and upserted 6k vectors. Took quite some time to complete though lol
I think the default similarity metric used by Pinecone is cosine similarity, if I'm not mistaken. I was just scrolling through their documentation and saw this.
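In case it helps to see what that metric actually computes, here is a minimal plain-Python sketch of cosine similarity plus a brute-force top-k search. This is the kind of thing you would do yourself with a local store; a vector database configured with a cosine metric does the equivalent scoring on its side. The `top_k` helper and the toy vectors are made up for illustration and have nothing to do with Pinecone's client API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=2):
    """Return the k (id, score) pairs most similar to the query.

    `vectors` maps an id to its embedding. This scans every vector,
    which is fine for small collections only.
    """
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-dimensional "embeddings" (real ones have many more dimensions).
toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.0], toy_vectors, k=2)
```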
got it, thanks
Thank you so much for the very helpful tutorial. I followed it without issue. However, it felt kind of limited, and not much more than a fancy VLOOKUP in Excel (unless I'm wrong). So I have two follow-up questions: 1. I tried to apply the knowledge from your walkthrough to OpenAI's tutorial (https://platform.openai.com/docs/tutorials/web-qa-embeddings). They use a function "create_context" that seems a bit different from yours. Are those two different approaches? 2. In either case, is it possible to embed all columns in a CSV and not just one? That way, along with "Tell me something about company ABC", I could also ask something like "Which company has that fact?" Thanks again and in advance.
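On question 2, one common way to do this (an assumption here, not something taken from either tutorial) is to join every column of a row into a single labelled string and embed that string, so the company name and the fact end up in the same vector. A minimal sketch with hypothetical column names `company` and `fact`; the embedding call itself is left out:

```python
import csv
import io


def row_to_text(row):
    """Join all non-empty columns of one CSV row into a single string."""
    return "; ".join(f"{col}: {val}" for col, val in row.items() if val)


def build_texts(csv_text):
    """Return one combined string per CSV row, ready to be embedded."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_text(row) for row in reader]


# Made-up sample data standing in for the real CSV file.
SAMPLE_CSV = "company,fact\nABC,Founded in 1999\nXYZ,Makes solar panels\n"

texts = build_texts(SAMPLE_CSV)
# Each entry in `texts` would then be sent to the embedding model
# in place of the single column used before.
```

Because the company name is part of each embedded string, a question about a fact can retrieve the row that names the company, and the other way around.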