#Large Unstructured Dataset
Making a thread so this conversation doesn't clog up the chat 🙂
Firstly, I am not an expert in any of this by any measure. Just another person looking to leverage this to make certain tasks easier for myself and others or enable use cases that weren't possible before.
But if you are going to use GPT-3, it sounds like you need a fine-tuned model.
With a fine-tuned model you can provide it lots of data and teach it what good responses are to specific prompts. The data you feed a fine-tune is a JSONL file, with one JSON object per line in the format:
{"prompt": "[insert prompt here]", "completion": "[insert completion here]"}
I'm not sure what the best method would be to provide it all of the information it needs to adequately address your need.
With that said, based on what you've shared, there is a YouTube video that may be instructional for you. I personally haven't watched it so I cannot vouch for it, but it has come up on my page a lot:
3. OpenAI API Python - Earnings Call Summarization by "Part Time Larry"
Seems like for a large data set you need to use embedding
Fine tuning is mostly used to adjust the way questions are answered
As I understand it, fine-tuning is there to take the vast unstructured language understanding that an LLM has and focus it in on a specific set of data.
I'm not sure what you mean by "you need to use embedding". I'm still new to this technology and don't understand what you've said, will you please explain what you mean?
I’m still learning about it myself, but it's a feature called embedding
Trying to figure it out myself rn
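In case it helps anyone reading later, here is a toy sketch of the general idea as I understand it: turn each chunk of the dataset into a vector, turn the question into a vector, and pull back the chunks whose vectors are closest. Big assumption: `embed` below is a made-up stand-in that just counts words. A real setup would get its vectors from an embedding model instead. The `top_chunks` helper and the sample text are also hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Hypothetical chunks of a larger unstructured dataset.
chunks = [
    "Quarterly revenue grew ten percent on strong cloud sales.",
    "The company opened a new office in Berlin.",
    "Management expects hiring to slow next year.",
]

best = top_chunks("How much did revenue grow?", chunks)
print(best)
```

The retrieved chunks would then be pasted into the prompt alongside the question, so the model only has to read the relevant slice of the dataset rather than all of it.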