#Curie finetune keeps failing?
You're going to have to explain a lot more if you want an answer. Show an example of your data and what kind of result you're expecting.
It's a simple generation model where the prompt is always along the lines of "write me X in the style of Y" and the completion is X written in the style of Y. For example: "write me a poem in the style of Edgar Allan Poe", where each completion is a different poem of his.
I created 3 models earlier this week, 1 Curie and 2 Davinci, without any issues for the same use case. The only thing I can think of that has changed is the file size (from about 58k tokens to 250k tokens). I've stripped any completion that goes over the Curie token limit. I've used completion stop tokens, a whitespace at the start of each completion, and a prompt end token.
I doubt the issue is with the file because the preprocessing was quite thorough; I had to fix encoding issues and such. The file gets uploaded successfully, but when I check its status after a while using print(openai.FineTune.list()), it always shows "Status: failed".
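For readers following along, the preprocessing described above could look roughly like the sketch below. It is not the poster's actual script: the separator `PROMPT_END`, the stop sequence `STOP`, the 2048-token budget, and the 4-characters-per-token estimate in `approx_tokens` are all assumptions made for illustration. A real tokenizer would give exact counts.

```python
import json

PROMPT_END = "\n\n###\n\n"  # assumed prompt end token
STOP = " ###"               # assumed completion stop token
MAX_TOKENS = 2048           # assumed Curie limit for prompt + completion


def approx_tokens(text):
    # Rough assumption: about 4 characters per token for English text.
    return max(1, len(text) // 4)


def build_records(pairs, max_tokens=MAX_TOKENS, count_tokens=approx_tokens):
    """Turn (prompt, completion) pairs into fine-tuning records,
    dropping any pair whose estimated size exceeds max_tokens."""
    records = []
    for prompt, completion in pairs:
        record = {
            "prompt": prompt.strip() + PROMPT_END,
            # Leading space before the completion, stop token at the end.
            "completion": " " + completion.strip() + STOP,
        }
        total = count_tokens(record["prompt"]) + count_tokens(record["completion"])
        if total <= max_tokens:
            records.append(record)
    return records


def to_jsonl(records):
    # One JSON object per line; newlines inside strings are escaped by json.dumps.
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records) + "\n"
```

Writing `to_jsonl(build_records(pairs))` to a file opened with `encoding="utf-8"` would give one record per line.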
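A bare "failed" status says little on its own, so it may help to pull out the failed job IDs and then look at each job's event log. The helper below is a hypothetical sketch that only assumes the listing is a dict-like object with a "data" list of jobs carrying "id" and "status" fields. The commented usage assumes the legacy openai Python library (before 1.0) and is not executed here.

```python
def failed_job_ids(listing):
    """Return the IDs of jobs whose status is 'failed' in a
    FineTune.list()-style response (a dict with a 'data' list)."""
    return [
        job["id"]
        for job in listing.get("data", [])
        if job.get("status") == "failed"
    ]


# Possible usage with the legacy openai library (assumption, needs an API key):
#   import openai
#   for job_id in failed_job_ids(openai.FineTune.list()):
#       for event in openai.FineTune.list_events(id=job_id)["data"]:
#           print(job_id, event["message"])
```

The event messages are usually where the reason for a failure shows up, if one is reported at all.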
Out of curiosity, or just to confirm, these writing styles are unique for each fine-tuned model, right?
No, the initial models were based on one "writing style". But this new one has up to 4. Say, EAP, Shakespeare, Victor Hugo and Hemingway. The first models were really just for me to learn how to upload and finetune a dataset.
Why not separate each model to its unique writing style?
Can you post a sample training prompt, expected result, and then the actual?
Without some serious speculation there's nothing more to say
Sure
Prompt: "Write me a chorus in the style of Eminem" Completion: "My tea's gone cold I'm wondering why, got out of bed at all.
The morning rain clouds up my window, and I can't see at all.
Even if I could it would all be there etc etc... ### (end token)"
Actual results from the first models were egregious, which is the reason for this updated model with more data to train on.
I also didn't follow best practices when preprocessing the data for those first 3 models.
Training data, prompt-completion pair
I don't have any results, as the fine-tune won't process.
I'm not too familiar with the fine-tuning error handling. Strange that the CLI apparently confirmed it, but then it fails. Have you tried splitting the data into parts?
It may also be nice to do that and benchmark each part that way.
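Splitting the training file into parts, as suggested here, can be sketched as below. `split_lines` is a hypothetical helper, not part of any OpenAI tooling. It takes the JSONL lines and returns at most `parts` chunks of roughly equal size, so each chunk can be uploaded separately to see which one makes the job fail.

```python
def split_lines(lines, parts):
    """Split non-blank JSONL lines into at most `parts` chunks of similar size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    lines = [ln for ln in lines if ln.strip()]
    if not lines:
        return []
    size = -(-len(lines) // parts)  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]
```

If one chunk fails and the others succeed, halving that chunk again narrows down the offending records.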
Also, what file format are you using initially?
I'm using a JSONL file with UTF-8 encoding. If I fine-tune in parts, can I re-fine-tune an already fine-tuned model? Would it stack or reset?
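Since the file format came up, a quick local check of the JSONL file can rule out encoding or structure problems before uploading. The sketch below is a hypothetical validator that assumes each line should be a JSON object with exactly the keys "prompt" and "completion". It is not what the API itself runs.

```python
import json


def validate_jsonl_bytes(raw):
    """Return a list of (line_number, problem) tuples for a JSONL payload in bytes.
    Line number 0 means the whole payload failed to decode as UTF-8."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [(0, f"not valid UTF-8: {exc}")]
    problems = []
    # Split on "\n" only, so unusual Unicode line separators inside strings are left alone.
    for number, line in enumerate(text.rstrip("\n").split("\n"), start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict) or set(record) != {"prompt", "completion"}:
            problems.append((number, "expected exactly 'prompt' and 'completion' keys"))
    return problems
```

Reading the file with `open(path, "rb").read()` and passing the bytes in keeps the encoding check honest, because nothing is decoded implicitly.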
Technically it creates a new model, but one that builds on what the previous fine-tune already learned.
You're only charged for the new data, I believe.
If you go to the fine-tuning advanced usage docs, you'll find more info.
Ah, I'll try that then, thanks.