#Testing out fine-tune first time, first step. (python)


edgy basin
#

Hello, I wanted to test how fine-tuning works and whether it works better than general models for text completion.
I have a .jsonl file that looks like the one in the attachment, just as a first test. Then I tried the example commands from the docs to upload a file:
https://docs.mistral.ai/capabilities/finetuning/#fine-tuning-basics

import os
from mistralai.client import MistralClient

api_key = "..."
client = MistralClient(api_key=api_key)

with open("trainingData1.jsonl", "rb") as f:
    training_data = client.files.create(file=("trainingData1.jsonl", f))    

but I got the following error:

Traceback (most recent call last):
  File "d:\Real Desktop\work things\Python\AI\trainingTest.py", line 8, in <module>
    training_data = client.files.create(file=("trainingData1.jsonl", f))           
                    ^^^^^^^^^^^^
AttributeError: 'MistralClient' object has no attribute 'files'

Are the docs outdated, or am I just being dumb? Thanks

pine bobcat
#

you may want to update the client
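
A quick way to confirm which client version is installed before and after upgrading is sketched below. It uses only the standard library; the helper name `installed_version` is made up for illustration, and this thread does not say which release first added `client.files`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Upgrade with: pip install --upgrade mistralai
    print("mistralai version:", installed_version("mistralai"))
```

If this prints `None`, the script is probably running in a different environment from the one `pip` installed into.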

edgy basin
#

oh, yeah uhm, ur right
now i have another problem though, probably one of many to come

mistralai.exceptions.MistralAPIException: Status: 422. Message: {"detail": "Invalid file format.", "description": "Found 2 errors in this file. You can view supported formats here: https://docs.mistral.ai/capabilities/finetuning. ", "errors": [{"message": "line contains invalid json: unexpected end of data: line 1 column 2", "line_number": 1}, {"message": "line contains invalid json: unexpected character: line 1 column 1", "line_number": 16}]}

I did check https://docs.mistral.ai/capabilities/finetuning
and I am indeed using the https://docs.mistral.ai/capabilities/finetuning/#1-default-instruct
format. I got something wrong for sure, but I don't know which part.

#
{
    "messages": [
        {
            "role": "system",
            "content": "You are an AI assistant responding everything in pirate accent. Also you should add triple dot '...' at the end of a lot of sentences for dramatic effect."
        },
        {
            "role": "user",
            "content": "Hello, how are you?"
        },
        {
            "role": "assistant",
            "content": "Ahoy, matey! I be doin' fine, thank ye... How be ye today?"
        }
    ]
}  
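
The 422 error reports invalid JSON on line 1 and line 16, which is consistent with a pretty-printed object: a .jsonl file is read one line at a time, so each sample has to sit on a single line. A minimal sketch, assuming that is the cause here; `to_jsonl_line` and `check_jsonl` are hypothetical helper names, not part of the Mistral client.

```python
import json


def to_jsonl_line(sample: dict) -> str:
    """Serialize one training sample as a single line with no embedded newlines."""
    return json.dumps(sample, ensure_ascii=False)


def check_jsonl(text: str) -> list:
    """Return (line_number, error) for every non-empty line that is not valid JSON."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
    return problems


sample = {
    "messages": [
        {"role": "user", "content": "Hello, how are you?"},
        {"role": "assistant", "content": "Ahoy, matey! I be doin' fine, thank ye..."},
    ]
}

pretty = json.dumps(sample, indent=4)  # spans many lines, so it is not valid JSONL
compact = to_jsonl_line(sample)        # one sample on one line

print(len(check_jsonl(pretty)), "bad lines in the pretty-printed version")
print(len(check_jsonl(compact)), "bad lines in the one-line version")
```

Running `check_jsonl` on the file contents before uploading catches this class of error locally.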
edgy basin
#

OK, I haven't created the model yet, because my test job failed for having only 3 JSON lines:
Failed validation. The given training file contain 3 samples which is not enough. The number of samples should be larger or equal to 8.

Model information:
id='45472d2f-8121-4ccb-a18a-2f4fb7b1e7be' hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001) fine_tuned_model=None model='open-mistral-7b' status='QUEUED' job_type='FT' created_at=1722424994 modified_at=1722424994 training_files=['c232f936-5968-46cb-a587-cf2463b7ca01'] validation_files=['c232f936-5968-46cb-a587-cf2463b7ca01'] object='job' integrations=[]

Jobs:
data=[Job(id='a5d1ff6d-46e9-4a40-849a-65a3fc49b050', hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001), fine_tuned_model=None, model='open-mistral-7b', status='FAILED', job_type='FT', created_at=1722423970, modified_at=1722423972, training_files=['157d8911-cfb3-4bd7-a822-25a221fce716'], validation_files=['157d8911-cfb3-4bd7-a822-25a221fce716'], object='job', integrations=[])] object='list'


Retrieved jobs:
id='a5d1ff6d-46e9-4a40-849a-65a3fc49b050' hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001) fine_tuned_model=None model='open-mistral-7b' status='FAILED' job_type='FT' created_at=1722423970 modified_at=1722423972 training_files=['157d8911-cfb3-4bd7-a822-25a221fce716'] validation_files=['157d8911-cfb3-4bd7-a822-25a221fce716'] object='job' integrations=[] events=[Event(name='status-updated', data={'error': 'Failed validation. The given training file contain 3 samples which is not enough. The number of samples should be larger or equal to 8.', 'status': 'FAILED_VALIDATION'}, created_at=1722423972), Event(name='status-updated', data={'status': 'RUNNING'}, created_at=1722423972), Event(name='status-updated', data={'status': 'QUEUED'}, created_at=1722423970)] checkpoints=[] estimated_start_time=None
Retrieved model information:
None
#

Before I actually spend money, I'd like to see an example .jsonl file, so my first try is a half-decent model and not one built from just 8 lines. Do you know of any examples that give a good picture of what it is supposed to look like?

quaint knoll
#

you can do a dry run without spending any money!

#

dry runs compute the expected usage and tokens for your training etc before you do any training

edgy basin
#

Yeah, I tried dry_run at first
object='job.metadata' training_steps=10 train_tokens_per_step=131072 data_tokens=200 train_tokens=1310720 epochs=6553.6 expected_duration_seconds=None cost=2.63 cost_currency='USD'
Like you said in another discussion

#

but seeing it seemed ok, so i wanted to see if it actually works

#

but then i got the "more than 3 lines" error, so if it is actually going to work, i might as well make it nicer, which would cost more than 2.63

quaint knoll
#

usually with a decent dataset we want 3 epochs

#

but can vary of course

#

depending on use cases you might want to overfit a bit or not

edgy basin
#

oh, so there is a problem, my code looks like this right now:

import os
from mistralai.client import MistralClient
from mistralai.models.jobs import TrainingParameters
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=api_key)


with open("trainingData1.jsonl", "rb") as f:
    training_data = client.files.create(file=("trainingData1.jsonl", f))

testFineTunedModel = client.jobs.create(
    model="open-mistral-7b",
    training_files=[training_data.id],
    validation_files=[training_data.id],
    hyperparameters=TrainingParameters(
        training_steps=10,
        learning_rate=0.0001,
        ),
    dry_run=True
)
testFineTunedModel

print("Model information:")
print(testFineTunedModel)
print("\n")

# List jobs
jobs = client.jobs.list()
print("Jobs:")
print(jobs)
print("\n")

# Retrieve a job
print("Retrieved jobs:")
retrieved_jobs = client.jobs.retrieve("a5d1ff6d-46e9-4a40-849a-65a3fc49b050")
print(retrieved_jobs)

print("Retrieved model information:")
print(retrieved_jobs.fine_tuned_model)
print("\n")

chat_response = client.chat(
    model=retrieved_jobs.fine_tuned_model,
    messages=[ChatMessage(role='user', content='Hello, how  is the weather looking?')]
)

model_response = chat_response.choices[0].message.content
print("Response from fine tuned model:")
print(model_response)
print("\n")
#

maybe it is due to having only 3 lines of json? it might be ok if i had the minimum limit of 8

quaint knoll
#

yes 3 is not enough, you need considerably more data

edgy basin
#

would the number of epochs affect the cost? first time trying fine tuning or anything json

quaint knoll
#

the cost is mostly related to the tokens

quaint knoll
#

the issue at hand here is that you have 10 steps but only 3 data points, which is not enough data in the slightest. First you need more data, and then you might want to increase the steps depending on it; the goal would be around 1-3 epochs if you are just starting
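
The relationship between steps and epochs can be read off the dry-run output: epochs = training_steps * train_tokens_per_step / data_tokens. A small sketch of that arithmetic; the function names are made up for illustration, 131072 is the tokens-per-step value from the dry run (it may differ for other models or settings), and the 500,000-token dataset is a hypothetical example.

```python
import math


def epochs(training_steps: int, train_tokens_per_step: int, data_tokens: int) -> float:
    """Passes over the dataset: tokens trained on divided by tokens available."""
    return training_steps * train_tokens_per_step / data_tokens


def steps_for(target_epochs: float, train_tokens_per_step: int, data_tokens: int) -> int:
    """Smallest step count that reaches at least `target_epochs` (never below 1)."""
    return max(1, math.ceil(target_epochs * data_tokens / train_tokens_per_step))


# Numbers from the dry run: 10 steps, 131072 tokens per step, 200 data tokens.
print(epochs(10, 131072, 200))  # 6553.6, matching the dry-run output

# Hypothetical dataset of 500,000 tokens, aiming for about 3 epochs.
print(steps_for(3, 131072, 500_000))
```

With only 200 tokens of data, even a single step is hundreds of epochs, which is why more data comes first.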

edgy basin
#

alrighty, i will write down a lot more data later and test, will see what the dry_run tells me, then check if it is still a lot of epochs

pine bobcat
#

is the finetuning actually hooked up to a w&b ?

#

as that would actually help to see how many epochs it needs

#

@quaint knoll

#

or a tensor board

quaint knoll
#

its possible to hook it up to w&b yes

edgy basin
#

in the docs, it says the min cost is $4 and then $2 each month to keep the model:

Every fine-tuning job comes with a minimum fee of $4, and there's a monthly storage fee of $2 for each model. For more detailed pricing information, please visit our pricing page.

If I forget and let it have less than $2, does my model get deleted so that I have to retrain?

pine bobcat
#

ah sexy ya that helps

pine bobcat
#

the 4 or 2 bucks wont really rip a hole in the process tho

#

as you'll be spending quite a bit more if you have a half-decent dataset

edgy basin
#

yeah, i mean, i tend to work on stuff for a bit, then shelve it for a couple of months before going back to it

#

was just curious if i should keep it in mind or not

pine bobcat
#

realistically that wont make a big difference in the grand scheme

#

the dataset is what's gonna pester you the most

#

and then figuring out if you have enough / if it's balanced / and whatnot

#

its not just lets drop all the stuff in a jsonl and the rest will magically work

pine bobcat
#

depends on the model it is trained on / whether the knowledge is in there already / whether it's just style transfer

#

if the data quality is high

#

if it's balanced and clean

#

there are too many variables to give really any idea

#

can take from 1-5 epochs

#

till it converges

#

if its style it may just need half an epoch

pine bobcat
#

datagen can cost thousands, depending on how complex your pipeline is

edgy basin
#

I'm just starting out on this, so I just wanted to see how the average fine-tuning enjoyer was doing. This is a good perspective already

#

thanks a bunch @pine bobcat @quaint knoll

pine bobcat
#

i don't think there is an average

#

as its all usecase specific

edgy basin
#

will see with the dry_run 's

#

ok ok

#

will keep it in mind

edgy basin
# quaint knoll I can give you some training data that was generated and used for this cookbook:...

Hi, it's me again, taking notes on how to prepare a dataset. I have been reading through the .jsonl files you sent and two things caught my attention. Just so I can get a better idea of how fine-tuning works:

  • First, there isn't a single "system" prompt. From what I read, the system prompt sets up general information about the AI. But maybe the AI can deduce that from the hundreds of "assistant"/"user" prompts anyway, not sure. Is there any harm in adding one standalone "system" prompt at the start of the file, with nothing else attached to the message, something like this:
{"messages": [{"role": "system", "content": "You are bad at grammar and always forget using proper quotation marks."}]}
  • Second, there is a pattern of "user" -> "assistant" -> "user" -> "assistant" in the messages variable. Is there a specific reason for that? Why not a single "user" -> "assistant", or three rounds back and forth? It's always two back-and-forth exchanges.
quaint knoll
# edgy basin Hi, its me again, taking notes on how to prepare a dataset. So I have been readi...

Indeed, my example doesn't have system prompts, but I'm a believer that decent instructions are the equivalent of system prompts! However, if you add a system prompt with nothing else, you are not teaching the model anything. The messages should contain an entire conversation as an example; otherwise you are just showing the model a sentence without teaching it how to complete it.

It can be as many back-and-forths as you want! My dataset should actually have more or fewer sometimes, but indeed most samples are 2 pairs; that's a coincidence, because the way it was generated favors 2 pairs.
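
The point about a lone system prompt can be turned into a quick local check on each sample. This is only a sketch: the rules below (known roles only, at least one assistant reply, conversation ends on an assistant reply) are assumptions for the plain instruct format and are not the platform's actual validator; `sample_problems` is a hypothetical helper.

```python
ALLOWED_ROLES = {"system", "user", "assistant"}  # assumption: plain instruct format only


def sample_problems(sample: dict) -> list:
    """List reasons why one {"messages": [...]} sample would teach the model little."""
    messages = sample.get("messages") or []
    if not messages:
        return ["sample has no messages"]
    roles = [message.get("role") for message in messages]
    problems = []
    unknown = [role for role in roles if role not in ALLOWED_ROLES]
    if unknown:
        problems.append(f"unexpected roles: {unknown}")
    if "assistant" not in roles:
        problems.append("no assistant reply to learn from")
    elif roles[-1] != "assistant":
        problems.append("conversation does not end with an assistant reply")
    return problems


system_only = {"messages": [{"role": "system", "content": "You are bad at grammar."}]}
full = {
    "messages": [
        {"role": "system", "content": "You are an AI assistant responding in pirate accent."},
        {"role": "user", "content": "Hello, how are you?"},
        {"role": "assistant", "content": "Ahoy, matey! I be doin' fine, thank ye..."},
    ]
}

print(sample_problems(system_only))
print(sample_problems(full))
```

The system-only sample is flagged because it contains nothing for the model to imitate, while the full conversation passes.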

edgy basin
#

Alrighty, makes sense, thanks again

edgy basin
#

I have this code:

from mistralai.client import MistralClient
from mistralai.models.jobs import TrainingParameters
from mistralai.models.chat_completion import ChatMessage

client = MistralClient(api_key=api_key)

# List jobs
jobs = client.jobs.list()


client.jobs.cancel(job_id = "675e372d-ee6d-45dc-804b-8319012b513d")
client.jobs.cancel(job_id = "45472d2f-8121-4ccb-a18a-2f4fb7b1e7be")
client.jobs.cancel(job_id = "a5d1ff6d-46e9-4a40-849a-65a3fc49b050")
print("\n")
print("Jobs:")
print(jobs)
print("\n")
#

and the result is this:

Jobs:
data=[Job(id='9bf8da03-37e3-40ba-93f2-f4cb0cb63fd4', hyperparameters=TrainingParameters(training_steps=8, learning_rate=0.0001), fine_tuned_model='ft:mistral-small-latest:5b697849:20240812:9bf8da03', model='mistral-small-latest', status='SUCCESS', job_type='FT', created_at=1723472509, modified_at=1723472716, training_files=['a596ea2e-44f4-40f6-97f1-99545f7c3887'], validation_files=['18cc2ac8-49cd-42de-bb67-b6743cf6bad3'], object='job', integrations=[WandbIntegration(type='wandb', project='Test', name=None, run_name=None)]), Job(id='58bd8f64-ea20-4b02-8ca9-d2f45dfd0601', hyperparameters=TrainingParameters(training_steps=8, learning_rate=0.0001), fine_tuned_model='ft:mistral-small-latest:5b697849:20240810:58bd8f64', model='mistral-small-latest', status='SUCCESS', job_type='FT', created_at=1723257562, modified_at=1723257749, training_files=['b4cf01a6-ad5e-41c3-8367-644194c9cdbd'], validation_files=['bfa4873f-e374-4556-949c-8507223c64eb'], object='job', integrations=[WandbIntegration(type='wandb', project='Test', name=None, run_name=None)]), Job(id='675e372d-ee6d-45dc-804b-8319012b513d', hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001), fine_tuned_model=None, model='open-mistral-7b', status='FAILED', job_type='FT', created_at=1723244848, modified_at=1723244850, training_files=['f7e6eeba-1327-47db-9831-d0fd39c97641'], validation_files=['f7e6eeba-1327-47db-9831-d0fd39c97641'], object='job', integrations=[]), Job(id='45472d2f-8121-4ccb-a18a-2f4fb7b1e7be', 
#
hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001), fine_tuned_model=None, model='open-mistral-7b', status='FAILED', job_type='FT', created_at=1722424994, modified_at=1722424995, training_files=['c232f936-5968-46cb-a587-cf2463b7ca01'], validation_files=['c232f936-5968-46cb-a587-cf2463b7ca01'], object='job', integrations=[]), Job(id='a5d1ff6d-46e9-4a40-849a-65a3fc49b050', hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001), fine_tuned_model=None, model='open-mistral-7b', status='FAILED', job_type='FT', created_at=1722423970, modified_at=1722423972, training_files=['157d8911-cfb3-4bd7-a822-25a221fce716'], validation_files=['157d8911-cfb3-4bd7-a822-25a221fce716'], object='job', integrations=[])] object='list'

I can see 5 jobs here:
9bf8da03-37e3-40ba-93f2-f4cb0cb63fd4
58bd8f64-ea20-4b02-8ca9-d2f45dfd0601
675e372d-ee6d-45dc-804b-8319012b513d
45472d2f-8121-4ccb-a18a-2f4fb7b1e7be
a5d1ff6d-46e9-4a40-849a-65a3fc49b050

I want to keep first 2 and remove the bottom 3 so when I make another fine-tuned model, I can see the id and status easier without going through a wall of text.

TL;DR: How do I check what jobs I have, and how do I cancel the jobs/models I don't want to use anymore?

quaint knoll
#

Btw, you might want to take a look at La Plateforme: we made a big update to the UI, and you can now start and monitor fine-tuning jobs and a lot more!

For your case, I'm still not sure I understand: the job statuses are "failed", meaning there is no need to cancel them. They are mostly logs so you know which jobs you ran; removing them seems more like a problem than something useful?

Cant you simply filter to retrieve only the jobs with the status you desire? Or do you really need to be able to remove the jobs completely?
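
Filtering on the client side can be as simple as the sketch below. `JobSummary` is a stand-in for the job objects printed earlier in the thread (only `id`, `status`, and `fine_tuned_model` are used), and `jobs_with_status` is a hypothetical helper; with the real client, the list to filter would presumably be the `data` field of the list response shown above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class JobSummary:
    """Stand-in for a fine-tuning job; mirrors the field names in the printed output."""
    id: str
    status: str
    fine_tuned_model: Optional[str] = None


def jobs_with_status(jobs: Iterable, wanted=("SUCCESS", "QUEUED", "RUNNING")) -> List:
    """Keep only the jobs whose status is in `wanted`, preserving order."""
    return [job for job in jobs if job.status in wanted]


jobs = [
    JobSummary("9bf8da03-37e3-40ba-93f2-f4cb0cb63fd4", "SUCCESS",
               "ft:mistral-small-latest:5b697849:20240812:9bf8da03"),
    JobSummary("58bd8f64-ea20-4b02-8ca9-d2f45dfd0601", "SUCCESS",
               "ft:mistral-small-latest:5b697849:20240810:58bd8f64"),
    JobSummary("675e372d-ee6d-45dc-804b-8319012b513d", "FAILED"),
    JobSummary("45472d2f-8121-4ccb-a18a-2f4fb7b1e7be", "FAILED"),
    JobSummary("a5d1ff6d-46e9-4a40-849a-65a3fc49b050", "FAILED"),
]

for job in jobs_with_status(jobs):
    print(job.id, job.status, job.fine_tuned_model)
```

The failed jobs stay in the account as a log, but they no longer clutter the printed list.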

edgy basin
#

I guess you are right, my problem was just QoL, so I can simply filter for SUCCESS or QUEUED and leave out the FAILED ones. Didn't think of that.

oh.... the new UI on La Plateforme, I have no idea how I missed this, I swear it wasn't there before lmao
I am blind, that works way better, thanks

#

wait i can just upload files and fine tune from the website???

edgy basin
#

ok one last thing while i have your attention

#

I've been messing around with fine tuning. I keep adding more data, but even though my data tokens increase (I keep the rest the same), the price stays the same. What actually affects the price of fine-tuning a model?
From testing I know the base model affects it and the epochs affect it, but the data itself doesn't seem to. What else?

quaint knoll
#

meaning it will stop after some steps and not use all the data

#

this wont be an issue with la plateforme

#

as it uses epochs and not steps

#

it will be easier to handle jobs there!
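
With a fixed step count, the number of tokens trained on does not depend on the dataset size, which would explain a price that stays flat while data is added. A sketch of that effect under stated assumptions: 131072 tokens per step is the value from the dry-run output in this thread, the dataset sizes are hypothetical, and the function names are made up for illustration.

```python
def train_tokens(training_steps: int, train_tokens_per_step: int) -> int:
    """Tokens actually trained on; independent of how much data was uploaded."""
    return training_steps * train_tokens_per_step


def fraction_of_data_seen(training_steps: int, train_tokens_per_step: int,
                          data_tokens: int) -> float:
    """Share of the dataset covered at least once, capped at 1.0."""
    return min(1.0, train_tokens(training_steps, train_tokens_per_step) / data_tokens)


TOKENS_PER_STEP = 131072  # from the dry-run output; may differ for other setups

for data_tokens in (200_000, 2_000_000, 4_000_000):  # hypothetical dataset sizes
    seen = fraction_of_data_seen(8, TOKENS_PER_STEP, data_tokens)
    print(data_tokens, train_tokens(8, TOKENS_PER_STEP), f"{seen:.0%}")
```

The middle column never changes, while the last one drops below 100% once the data outgrows what 8 steps can cover.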

edgy basin
#

oh yeah, thats interesting you right

quaint knoll
#

everything is still very recent so changes might occur!

quaint knoll
#

I hope I answered most of your questions and that La Plateforme will be handy for you!

#

You can also make agents with your fine tuned models and test them on Le Chat for example!

#

happy fine tuning!

edgy basin
#

Yeah you answered them all thanks a bunch, but i have one more now lol

#

it says no metrics here right now

#

this is the one i just did 10 mins? ago

#

i have metrics on wandb right now

#

is it bcz i didn't do it directly on the website?

#

it's a small job, work in progress right now, and it looks like this on wandb

quaint knoll
#

its possible, but thanks for the info will report it!

edgy basin
#

450 lines of training and 80 validation rn

quaint knoll
#

do the other success jobs also not have metrics?

edgy basin
#

lemme check

quaint knoll
#

its very user friendly!

edgy basin
#

nice, the next one I test, I will try on La Plateforme

edgy basin
#

but tbf first one was before the update hit

#

2nd one is now

#

thanks for the help, i will most likely be around asking more as i learn more haha

quaint knoll
#

no problem!

quaint knoll
#

I cant replicate your issue

edgy basin
#

Is this what you mean by 'fine tuned job page'?:

and the values are from this code:

print("Retrieved jobs:")
retrieved_jobs = client.jobs.retrieve("675e372d-ee6d-45dc-804b-8319012b513d")
print(retrieved_jobs)

which I used because this doesn't work and gives me an error:

retrieved_jobs = client.fine_tuning.jobs.get(job_id = job.id)
print(retrieved_jobs)

The output is this:

id='675e372d-ee6d-45dc-804b-8319012b513d' hyperparameters=TrainingParameters(training_steps=10, learning_rate=0.0001) fine_tuned_model=None model='open-mistral-7b' status='FAILED' job_type='FT' created_at=1723244848 modified_at=1723244850 training_files=['f7e6eeba-1327-47db-9831-d0fd39c97641'] validation_files=['f7e6eeba-1327-47db-9831-d0fd39c97641'] object='job' integrations=[] events=[Event(name='status-updated', data={'error': 'Failed validation. The given validation set is too big. It should be no more than 20% of the training data. The uploaded validation data uses 100.0% of the training data.', 'status': 'FAILED_VALIDATION'}, created_at=1723244850), Event(name='status-updated', data={'status': 'RUNNING'}, created_at=1723244850), Event(name='status-updated', data={'status': 'QUEUED'}, created_at=1723244848)] checkpoints=[] estimated_start_time=None
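
Both validation errors in this thread (fewer than 8 samples, and a validation set larger than 20% of the training data) can be avoided by splitting one list of samples before uploading. A minimal sketch, assuming the 20% limit is measured in number of samples; the platform may measure tokens or file size instead. `split_samples` is a hypothetical helper, and 530 is the 450 + 80 lines mentioned earlier.

```python
import random

MIN_TRAINING_SAMPLES = 8     # from the "larger or equal to 8" validation error
MAX_VALIDATION_RATIO = 0.20  # from the "no more than 20% of the training data" error


def split_samples(samples, validation_ratio=0.15, seed=0):
    """Shuffle and split so that len(validation) <= validation_ratio * len(training)."""
    if not 0 <= validation_ratio <= MAX_VALIDATION_RATIO:
        raise ValueError(f"validation_ratio must be between 0 and {MAX_VALIDATION_RATIO}")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # n_val / (n - n_val) <= ratio  is equivalent to  n_val <= n * ratio / (1 + ratio)
    n_val = int(len(shuffled) * validation_ratio / (1 + validation_ratio))
    while n_val and n_val > validation_ratio * (len(shuffled) - n_val):
        n_val -= 1  # guard against floating-point rounding
    validation, training = shuffled[:n_val], shuffled[n_val:]
    if len(training) < MIN_TRAINING_SAMPLES:
        raise ValueError(f"only {len(training)} training samples, need {MIN_TRAINING_SAMPLES}")
    return training, validation


training, validation = split_samples(range(530))
print(len(training), len(validation), len(validation) / len(training))
```

Uploading the same file as both training and validation data, as in the failed job above, gives a ratio of 100% and is rejected.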
#

My issue was cancelling and removing the job from the list, as the jobs were already failed, I didn't wanna see them on the list.

#

seems the code retrieved_jobs = client.fine_tuning.jobs.get(job_id = job.id) doesn't work for me, unless i have to update again or something

Traceback (most recent call last):
  File "d:\Real Desktop\work things\Python\AI\trainingTest.py", line 23, in <module>
    retrieved_jobs = client.fine_tuning.jobs.get(job_id = job.id)
                     ^^^^^^^^^^^^^^^^^^
AttributeError: 'MistralClient' object has no attribute 'fine_tuning'
#

i updated like 10 days ago

quaint knoll
# edgy basin

basically, can I have a full-page screenshot of this, and what you get when using the sdk/api? That way we can compare and understand where the issue is

edgy basin
#

i just updated the mistralai
pip install --upgrade mistralai
now i have like 10 errors, need to check the updated version of these, will take a bit

quaint knoll
#

yup take your time with the migration guide

#

but the new sdk should now always be up to date and consistent with the sdks we provide for other platforms and languages (typescript for example). I also personally like the new one way more 😄

edgy basin
#

from this:

print("Retrieved jobs: ")
retrieved_jobs = client.fine_tuning.jobs.get(job_id="675e372d-ee6d-45dc-804b-8319012b513d")
print(retrieved_jobs)

print("Retrieved model information:")
print(retrieved_jobs.fine_tuned_model)
print("\n")

i get this:

id='9bf8da03-37e3-40ba-93f2-f4cb0cb63fd4' auto_start=True hyperparameters=TrainingParameters(training_steps=8, learning_rate=0.0001, epochs=6.797679167612071, fim_ratio=None) model='mistral-small-latest' status='SUCCESS' job_type='FT' created_at=1723472509 modified_at=1723472716 training_files=['a596ea2e-44f4-40f6-97f1-99545f7c3887'] validation_files=['18cc2ac8-49cd-42de-bb67-b6743cf6bad3'] OBJECT='job' fine_tuned_model='ft:mistral-small-latest:5b697849:20240812:9bf8da03' suffix=None integrations=[WandbIntegrationOut(project='Test', TYPE='wandb', name=None, run_name=None)] trained_tokens=1048576 repositories=[] metadata=JobMetadataOut(expected_duration_seconds=272, cost=4.2, cost_currency='USD', train_tokens_per_step=131072, train_tokens=1048576, 
data_tokens=154255, estimated_start_time=None) events=[EventOut(name='status-updated', 
created_at=1723472716, data={'status': 'SUCCESS'}), EventOut(name='status-updated', created_at=1723472511, data={'status': 'RUNNING'}), EventOut(name='status-updated', created_at=1723472510, data={'status': 'QUEUED'}), EventOut(name='status-updated', created_at=1723472510, data={'status': 'VALIDATED'}), EventOut(name='status-updated', created_at=1723472510, data={'status': 'RUNNING'}), EventOut(name='status-updated', created_at=1723472509, data={'status': 'QUEUED'})] checkpoints=[]
Retrieved model information:
ft:mistral-small-latest:5b697849:20240812:9bf8da03
quaint knoll
#

interesting

#

oh its not the same job

#

could you provide the same job info?

edgy basin
#

oh ye sorry

#

i will edit

quaint knoll
#

thank you!

edgy basin
#

ok done updated

quaint knoll
#

hmm? there are no metrics, so it's good?

#

does w&b come up with different metrics somehow?

edgy basin
#

there are no metrics, but there is some information in w&b: graphs showing what is happening in each cycle, if I'm understanding it correctly, just like the one in your screenshot. I don't understand why I don't have the metrics.

#

w&b of this have these:

#

the log:

#

and bunch of files i haven't checked out

#

not sure what was helpful so i posted everything

edgy basin
#

yes, its the same

#

from 2 hours ago

#

newest one

#

i only have 2 successful fine tunings and 3 failed ones

quaint knoll
#

I see thanks for all the feedback!!

dry bear
#

I'm working on this example: https://docs.mistral.ai/capabilities/finetuning/.

I keep running into errors from this section:

chat_response = client.chat.complete(
    model=retrieved_jobs.fine_tuned_model,
    messages=[{"role": "user", "content": "What is the best French cheese?"}],
)

The error I get is:
SDKError: API error occurred: Status 400
{"object":"error","message":"Invalid model: None","type":"invalid_model","param":null,"code":"1500"}

I suspect there's a clue in the previous command.

Code:
retrieved_jobs = client.fine_tuning.jobs.get(job_id = created_jobs.id)
pprint(retrieved_jobs)

Part of the output:
"events": [
    {
        "name": "status-updated",
        "created_at": 1724116954,
        "data": {
            "error": "Failed validation. You have reached the maximum number of jobs you can run concurrently. Please try again later.",
            "status": "FAILED_VALIDATION"
        }
    },
Any idea why I can't use retrieved_jobs.fine_tuned_model? Why is it an invalid model?
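
The "Invalid model: None" message suggests that `fine_tuned_model` is still `None`, which is what the job output earlier in this thread shows for jobs that failed validation. A guard before the chat call makes that visible. This is a sketch only: the job below is a stand-in built with `SimpleNamespace` whose field names mirror the printed output, the job id is a placeholder, the overall status is assumed to be FAILED, and `usable_model` is a hypothetical helper.

```python
from types import SimpleNamespace


def usable_model(job) -> str:
    """Return the fine-tuned model name, or raise with the job's recorded error."""
    if job.status == "SUCCESS" and job.fine_tuned_model:
        return job.fine_tuned_model
    errors = [event.data["error"] for event in getattr(job, "events", [])
              if "error" in event.data]
    reason = errors[0] if errors else "no error recorded yet"
    raise RuntimeError(f"job {job.id} is {job.status}: {reason}")


# Stand-in for the retrieved job; "example-job-id" is a placeholder, not a real id.
failed_job = SimpleNamespace(
    id="example-job-id",
    status="FAILED",
    fine_tuned_model=None,
    events=[SimpleNamespace(
        name="status-updated",
        data={
            "error": "Failed validation. You have reached the maximum number of "
                     "jobs you can run concurrently. Please try again later.",
            "status": "FAILED_VALIDATION",
        },
    )],
)

try:
    model_name = usable_model(failed_job)
except RuntimeError as exc:
    print("not calling chat:", exc)
```

Only once `usable_model` returns a name is it worth passing it as `model=` to the chat call.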

dry bear
#

@quaint knoll
I didn't make any code changes, and it just started working. A fine-tuned model also showed up on la plateforme. How did this happen? Did you make any changes on your end?

client.models.list() returns the following: