#fine tuning your own model can be a
I suppose it can be a hassle, and I honestly don't even know what a "training framework" is. I created a JSONL file from the schema, breaking it down to JSON schema components. I currently have only ~100 items there, which I understand is very little. I just couldn't really get any sensible responses for anything but 1:1 matching prompts.
the train.py that you're using to train it?
(or whatever it is you have as your Python file to train it)
100 items is very little, you probably have overfitting
I would recommend adding a noise parameter
--noise 0.1
100 items will definitely overfit very quickly
I think I need to read quite a bit more as I barely understand what you're saying. I tried with different settings, but got pretty horrible results.
this is how one of my models is training
I should probably put that graph plotter into GitHub, that's my own code
but you can see the hyperparameters on the top of the left graph, third row.
I'm not sure if this answers your question, but I created this file with my own scripts, and then I used openai tools fine_tunes.prepare_data..., and other openai commands.
the horrible results are overfitting
So you have 843M entries in your training data, am I getting that right?
you said you got 100 samples, I'd say that's very little sample data, you can get overfitting very quickly
Ok, thanks, I will read more about overfitting
no, that set is 843MB of raw text, it's the set size in MB
Oh
my main set is 4.5GB of raw text
:------------D
and wasn't always a horse....
I gradually turned into one..... 🐴
yeah but what you could do is add noise, if there's a parameter to add noise, that will seriously prevent overfitting
also, with 100 samples, you shouldn't run it for very long
I mean we could be talking about tens of iterations before it starts to overfit
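A minimal sketch of that "stop before it overfits" idea, assuming a local training loop that reports a validation loss after every iteration. `best_stopping_point` and `patience` are hypothetical names, not part of any framework mentioned in this thread, and the loss values are made up for illustration.

```python
def best_stopping_point(val_losses, patience=3):
    """Return the index of the iteration with the best validation loss.

    Scanning stops once the loss has failed to improve for `patience`
    consecutive iterations (simple early stopping).
    """
    best_idx = 0
    best_loss = float("inf")
    bad_streak = 0
    for i, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best: remember it and reset the streak of bad iterations.
            best_idx, best_loss, bad_streak = i, loss, 0
        else:
            bad_streak += 1
            if bad_streak >= patience:
                break
    return best_idx


# Made-up losses: improving at first, then rising as the model overfits.
losses = [2.1, 1.6, 1.3, 1.2, 1.25, 1.4, 1.7, 2.0]
print(best_stopping_point(losses))
```

With a tiny dataset, the point where validation loss turns upward can arrive very early, so checking it every iteration is cheap insurance.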
there was that "one epoch is all you need" paper but idk about that
hmm, so it's a pip package from openai?
AAH! I see, you're using the API
Yeah
my bad, I thought you were talking about fine-tuning a local model
I'm following the openai docs. Sorry for not being clear about it
oh no worries, I thought you were fine-tuning a local model
https://pypi.org/project/openai/ <= you probably went through the checklist from there?
Not that exact document, but something similar, so I have the openai package running locally and the API connection works.
I was able to create the training file, prepare it, and create the fine-tuned model, but when I use the model with the API or the openai playground, the results are horrible.
this can occur for a multitude of reasons
what kind of data are you working with?
is it like a Q&A set or ...? and what do you mean by "horrible"?
My training entries look like this: {"prompt":"What is the schema definition for ComponentName component in XXX?","completion":" The definition of `ComponentName` component in XXX is\n\n\n{\n \"type\": \"string\",\n \"enum\": [\n \"foo\"\n ]\n}\n END"}
And then some more detailed questions about the schema
This is obviously one part I'm struggling with, as I'm not even sure what kind of data I should give it
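For what it's worth, one way to produce entries in that shape is to let `json.dumps` handle the escaping of quotes and newlines instead of assembling the line by hand. This is only a sketch: `make_record` is a hypothetical helper, and the prompt wording, the `XXX` placeholder, and the ` END` marker are copied from the entry above rather than taken from any official format.

```python
import json


def make_record(component, schema, product="XXX"):
    """Build one JSONL line (prompt/completion pair) for a schema component."""
    prompt = f"What is the schema definition for {component} component in {product}?"
    completion = (
        f" The definition of `{component}` component in {product} is\n\n\n"
        f"{json.dumps(schema, indent=2)}\n END"
    )
    # json.dumps escapes the embedded quotes and newlines, so the record
    # stays on a single line, as JSONL requires.
    return json.dumps({"prompt": prompt, "completion": completion})


line = make_record("ComponentName", {"type": "string", "enum": ["foo"]})
print(line)
```

Writing one such line per component to a file gives a JSONL file that parses cleanly before it goes through any preparation step.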
this is a bit of a workaround, but if you only need 100 Q&A pairs, you could use e.g. pyautogui to ask ChatGPT those hundred questions
whenever I need Q&A pairs, I just run a pyautogui script to copy-paste the questions in on a timer and then go grab the data dump in an hour, lol
it's not the same though
I suppose I'm going to need quite a bit more. My assumption was that I would get something out of this data, but obviously I was kinda wrong.
Do you have any ideas on how you would fine-tune GPT to know about a JSON schema?
but if 100 Q&A's is all you need ...
there's the step-by-step guide step 1
and here, since you already got the data; https://github.com/openai/openai-cookbook/blob/main/examples/fine-tuned_qa/olympics-2-create-qa.ipynb
Note: We have used temperature=0, but it may be beneficial to experiment with a higher temperature to get a higher diversity of questions.
100 Q&A answers can answer the highest level questions. Then each component would need additional Q&As to get into specifics of how something should be actually used.
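A rough sketch of how several differently phrased questions per component could be generated, so the set is not limited to one exact wording. The templates and the `expand_questions` helper are hypothetical illustrations, not something from the OpenAI docs or the cookbook; the answers would still have to be written or generated separately.

```python
# Hypothetical question templates; real ones should reflect how users actually ask.
QUESTION_TEMPLATES = [
    "What is the schema definition for {name} component in {product}?",
    "Which values does the {name} component accept in {product}?",
    "Show me the JSON schema of {name} in {product}.",
]


def expand_questions(component_names, product="XXX"):
    """Return (component, question) pairs, one per component and template."""
    return [
        (name, template.format(name=name, product=product))
        for name in component_names
        for template in QUESTION_TEMPLATES
    ]


for name, question in expand_questions(["ComponentName", "OtherComponent"]):
    print(name, "->", question)
```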
"WARNING: This step will last a long time, and consume a lot of tokens, as it calls davinci-instruct for every section to generate a number of questions."
lol that's why I use pyautogui
Thanks, I will take a look at that too. I need to very soon join a meeting.
try temperature=0
I will, thanks!
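As a sketch of what that could look like on the request side, assuming the legacy Completions endpoint that goes with prompt/completion fine-tunes: `completion_params` is a hypothetical helper that only builds the parameter dictionary and does not call the API, and the model name below is a placeholder. The ` END` stop sequence is an assumption carried over from the training entries shown earlier.

```python
def completion_params(model_id, question, max_tokens=256):
    """Build request parameters for a deterministic completion."""
    return {
        "model": model_id,         # placeholder: the fine-tuned model's name
        "prompt": question,        # should match the training prompt format exactly
        "temperature": 0,          # always pick the most likely token
        "max_tokens": max_tokens,
        "stop": [" END"],          # assumed end marker from the training data
    }


params = completion_params(
    "my-fine-tuned-model",
    "What is the schema definition for ComponentName component in XXX?",
)
print(params)
```

Without a stop sequence the model tends to keep generating past the answer, which can look like a "horrible" result even when the first part is correct.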
in the case of GPT-3 via the API, it's gotta be a temperature setting or some other hyperparameter; re-check the formatting just in case, etc.
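A small sketch of such a formatting re-check for a prompt/completion JSONL file. `check_jsonl_lines` is a hypothetical helper; the rules it checks (exactly the two keys, a completion that starts with a space and ends with ` END`) are assumptions based on the entry format described in this thread, so adjust them to whatever the file really uses.

```python
import json


def check_jsonl_lines(lines, stop=" END"):
    """Return a list of human-readable problems found in JSONL training lines."""
    problems = []
    for n, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # ignore blank lines
        try:
            rec = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"line {n}: invalid JSON ({exc.msg})")
            continue
        if not isinstance(rec, dict) or set(rec) != {"prompt", "completion"}:
            problems.append(f"line {n}: expected exactly 'prompt' and 'completion' keys")
            continue
        completion = rec["completion"]
        if not isinstance(completion, str):
            problems.append(f"line {n}: completion is not a string")
            continue
        if not completion.startswith(" "):
            problems.append(f"line {n}: completion should start with a space")
        if not completion.endswith(stop):
            problems.append(f"line {n}: completion should end with {stop!r}")
    return problems


good = json.dumps({"prompt": "Q?", "completion": " A END"})
bad = '{"prompt": "Q?", "completion": " {"type": "string"} END"}'  # unescaped quotes
for problem in check_jsonl_lines([good, bad]):
    print(problem)
```

In practice the lines would come from `open(path, encoding="utf-8")` rather than from an in-memory list.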
if all else fails, revert to a pyautogui script
🫡
I think I will. Thank you very much
I'll let you know about my progress, if you're interested
of course! I've been doing this stuff for years now, and if I can help anyone out, I'll do my best. I don't use the OpenAI API much since I have server workstations training my own models locally. What you saw in that entropy loss graph I posted was the real deal in model training, not API-based fine-tuning
(the comparison being that you saw the dials on the outside of the machinery, while I work in the engine room, kachunk kachunk... day an' nite....)
I have been doing this for like 3 days...