#fine tuning your own model can be a


rancid snow
#

I suppose it can be a hassle, and I honestly don't even know what a "training framework" is. I created a JSONL file from the schema, breaking it down to JSON schema components. I currently have only ~100 items there, which I understand is very little. I just couldn't really get any sensible responses for anything but 1:1 matching prompts.

tranquil star
#

the train.py that you're using to train it?

#

(or whatever it is you have as your Python file to train it)

tranquil star
#

I would recommend adding a noise parameter

#

--noise 0.1

#

100 items will definitely overfit very quickly
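
(Editor's sketch) The `--noise 0.1` flag belongs to tranquil star's own training script, which is not shown in the thread. As a rough illustration only, assuming the flag means "standard deviation of Gaussian noise added to the input vectors during training" (an assumption, not confirmed here), a hypothetical script might wire it up like this. The names `parse_args` and `add_noise` are made up for the example.

```python
import argparse
import random

def parse_args(argv=None):
    # Hypothetical CLI; the real train.py is not shown in the thread.
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    parser.add_argument("--noise", type=float, default=0.0,
                        help="std dev of Gaussian noise added to inputs (assumed meaning)")
    return parser.parse_args(argv)

def add_noise(vector, std, rng):
    # Perturb every component independently; std <= 0 disables the noise.
    if std <= 0:
        return list(vector)
    return [x + rng.gauss(0.0, std) for x in vector]

args = parse_args(["--noise", "0.1"])
rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = add_noise([1.0, 2.0, 3.0], args.noise, rng)
```

The idea is that the model never sees exactly the same input twice, which makes plain memorization of a tiny dataset harder.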

rancid snow
#

I think I need to read quite a bit more as I barely understand what you're saying. I tried with different settings, but got pretty horrible results.

tranquil star
#

this is how one of my models is training

#

I should probably put that graph plotter into GitHub, that's my own code

#

but you can see the hyperparameters on the top of the left graph, third row.

rancid snow
#

I'm not sure if this answers your question, but I created this file with my own scripts, and then I used openai tools fine_tunes.prepare_data..., and other openai commands.

tranquil star
#

the horrible results are overfitting

rancid snow
#

So you have 843M entries in your training data, am I getting that right?

tranquil star
#

you said you got 100 samples, I'd say that's very little sample data, you can get overfitting very quickly
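
(Editor's sketch) One way to actually see overfitting with a set this small is to hold some examples back and compare results on them against the training examples. The snippet below is a minimal sketch of such a split. The function name `split_examples`, the 20% fraction, and the placeholder data are assumptions for illustration, not something from the thread.

```python
import random

def split_examples(examples, val_fraction=0.2, seed=0):
    # Shuffle a copy so the caller's list is left untouched.
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    # The first n_val items become validation data, the rest training data.
    return shuffled[n_val:], shuffled[:n_val]

# Placeholder data standing in for the ~100 prompt/completion pairs.
examples = [{"prompt": f"question {i}", "completion": f" answer {i}"}
            for i in range(100)]
train, val = split_examples(examples)
```

If the model does well on `train` prompts but badly on `val` prompts, that matches the "only works for 1:1 matching prompts" symptom described earlier.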

rancid snow
#

Ok, thanks, I will read more about overfitting

tranquil star
rancid snow
#

Oh

tranquil star
#

my main set is 4.5GB of raw text

#

:------------D

#

and I wasn't always a horse....

#

I gradually turned into one..... 🐴

#

yeah but what you could do is add noise, if there's a parameter to add noise, that will seriously prevent overfitting

#

also, with 100 samples, you shouldn't run it for very long

#

I mean we could be talking about tens of iterations before it starts to overfit
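
(Editor's sketch) The "don't run it for very long" advice is often implemented as early stopping: watch the loss on held-out data and stop once it stops improving. A minimal sketch for local training follows. The loss values are invented for illustration, and `best_stopping_point` and `patience` are hypothetical names.

```python
def best_stopping_point(val_losses, patience=3):
    # Track the lowest validation loss seen so far and stop once it
    # has failed to improve for `patience` consecutive checks.
    best_loss = float("inf")
    best_step = 0
    bad_checks = 0
    for step, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_step, bad_checks = loss, step, 0
        else:
            bad_checks += 1
            if bad_checks >= patience:
                break
    return best_step, best_loss

# Made-up numbers: the loss improves, then climbs again as the model overfits.
val_losses = [2.1, 1.6, 1.3, 1.2, 1.25, 1.4, 1.7, 2.0]
step, loss = best_stopping_point(val_losses)
```

With these invented numbers the checkpoint from step 3 would be the one to keep.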

#

there was that "one epoch is all you need" paper but idk about that

tranquil star
#

AAH! I see, you're using the API

rancid snow
#

Yeah

tranquil star
#

my bad, I thought you were talking about fine-tuning a local model

rancid snow
#

I'm following the openai docs. Sorry for not being clear about it

tranquil star
#

oh no worries, I thought you were fine-tuning a local model

rancid snow
#

Not that exact document, but something similar, so I have the openai tooling running on my local machine and the API connection works.

#

I was able to create the training file, prepare it, and create the fine-tuned model, but when I use the model with the API or the openai playground, the results are horrible.

tranquil star
#

this can occur for a multitude of reasons

#

what kind of data are you working with?

#

is it like a Q&A set or ...? and what do you mean by "horrible"?

rancid snow
#

My training entries look like this {"prompt":"What is the schema definition for ComponentName component in XXX?","completion":" The definition of `ComponentName` component in XXX is\n\n\n{\n \"type\": \"string\",\n \"enum\": [\n \"foo\"\n ]\n}\n END"}

And then some more detailed questions about the schema
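
(Editor's sketch) Entries in that shape are easy to break by hand, because the schema text contains quotes and newlines that must be escaped inside the JSON string. Below is a minimal sketch of generating such a line with `json.dumps`, which handles the escaping. The wording of the prompt and completion follows the entry above; `make_entry` and the `components` dict are hypothetical names, and "XXX" is the same placeholder used in the message.

```python
import json

def make_entry(name, schema, product="XXX"):
    # Build one prompt/completion pair for a single schema component.
    prompt = f"What is the schema definition for {name} component in {product}?"
    body = json.dumps(schema, indent=2)
    completion = (f" The definition of `{name}` component in {product} is"
                  f"\n\n\n{body}\n END")
    # json.dumps escapes inner quotes and newlines, so each entry stays on one line.
    return json.dumps({"prompt": prompt, "completion": completion})

components = {"ComponentName": {"type": "string", "enum": ["foo"]}}
lines = [make_entry(name, schema) for name, schema in components.items()]
jsonl_text = "\n".join(lines)
```

Writing `jsonl_text` to a file gives one JSON object per line, which is what a JSONL training file needs.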

#

This is obviously one part that I'm struggling with, as I'm not even sure what kind of data I should give it

tranquil star
#

this is a bit of a workaround, but if you only need 100 Q&A pairs, you could use e.g. pyautogui to ask ChatGPT those hundred questions

#

whenever I need Q&A pairs, I just run a pyautogui script to copypaste questions in with a timer and then go grab the data dump in an hour, lol 😂

#

it's not the same though

rancid snow
#

I suppose I'm going to need quite a bit more. My assumption was that I would get something out of this data, but obviously I was kinda wrong.

Do you have any ideas on how you would fine-tune GPT to know about a JSON schema?

tranquil star
#

but if 100 Q&A's is all you need ...

#

there's the step-by-step guide step 1

#

Note: We have used temperature=0, but it may be beneficial to experiment with a higher temperature to get a higher diversity of questions.
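
(Editor's sketch) The diversity remark in that note can be illustrated with temperature-scaled softmax sampling: dividing the model's scores (logits) by the temperature before normalizing makes the distribution sharper at low values and flatter at high ones. This is a generic sketch with made-up logits. Treating temperature 0 as "always pick the top token" is the usual convention and is assumed here, not taken from the thread.

```python
import math

def softmax_with_temperature(logits, temperature):
    if temperature <= 0:
        # Assumed convention: temperature 0 means greedy, all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
greedy = softmax_with_temperature(logits, 0)
warm = softmax_with_temperature(logits, 1.0)
hot = softmax_with_temperature(logits, 2.0)
```

Comparing `warm` and `hot` shows the top token losing probability to the others as the temperature rises, which is where the extra diversity comes from.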

rancid snow
#

100 Q&A pairs can cover the highest-level questions. Then each component would need additional Q&As to get into the specifics of how something should actually be used.

tranquil star
#

"WARNING: This step will last a long time, and consume a lot of tokens, as it calls davinci-instruct for every section to generate a number of questions."

lol that's why I use pyautogui 😉

rancid snow
#

Thanks, I will take a look at that too. I need to join a meeting very soon.

tranquil star
#

try temperature=0

rancid snow
#

I will, thanks!

tranquil star
#

in the case of the GPT-3 API, it's gotta be the temperature setting or some other hyperparameter; also re-check the formatting just in case, etc.
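
(Editor's sketch) For the "re-check formatting" part, a small script can catch the most common problems before uploading. The checks below are assumptions modeled on the entry rancid snow posted: every line is valid JSON with exactly `prompt` and `completion` keys, and each completion starts with a space and ends with the ` END` marker. `check_jsonl` is a hypothetical helper, not part of the openai tooling.

```python
import json

def check_jsonl(text, stop=" END"):
    # Return a list of human-readable problems; an empty list means no issues found.
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append(f"line {number}: invalid JSON ({err.msg})")
            continue
        if not isinstance(record, dict) or set(record) != {"prompt", "completion"}:
            problems.append(f"line {number}: expected exactly 'prompt' and 'completion'")
            continue
        completion = record["completion"]
        if not completion.startswith(" "):
            problems.append(f"line {number}: completion does not start with a space")
        if not completion.endswith(stop):
            problems.append(f"line {number}: completion does not end with {stop!r}")
    return problems

sample = "\n".join([
    '{"prompt": "Q1?", "completion": " A1 END"}',
    '{"prompt": "Q2?", "completion": "A2"}',
    '{"prompt": "Q3?" "completion": " A3 END"}',  # missing comma on purpose
])
problems = check_jsonl(sample)
```

Here the first line passes, the second is flagged twice, and the third is reported as invalid JSON.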

#

if all else fails, revert to a pyautogui script 😄

#

🫡

rancid snow
#

I think I will. Thank you very much 🤗

#

I'll let you know about my progress, if you're interested 😊

tranquil star
#

of course! I've been doing this stuff for years now, and if I can help anyone out, I'll do my best. I don't use the OpenAI API much since I have server workstations training my own models locally. What you saw in that entropy loss graph I posted was the real deal in model training, not API-based fine-tuning

#

(the comparison being that you saw the dials on the outside of the machinery, while I work in the engine room 😂 kachunk kachunk... day an' nite....)

rancid snow
#

I have been doing this for like 3 days... 😄