#Finetuning API

fickle slate
#

When you finetune, does the system automatically shuffle the data you provided?

molten fern BOT
#

Hi @fickle slate. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Hmm, I don't know enough to give you a confident answer yet. The provided knowledge sources do not specify whether the fine-tuning process in OpenAI's API automatically shuffles the data. For more detailed information about the fine-tuning process, you may want to refer to the OpenAI fine-tuning guide (https://platform.openai.com/docs/guides/fine-tuning).

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:

fickle slate
#

@molten fern please check forum posts

molten fern BOT
#

Let me check the knowledge base for you. One moment! :books:

#

I'm sorry, but your request is not clear. Could you please provide more details? Are you looking for specific information from the forum posts or do you need help with a particular issue related to OpenAI's products or services? The more specific you can be with your question, the better I can assist you.

#

@fickle slate kapa.ai is still learning and improving, please let us know how it did by reacting below

fickle slate
#

@molten fern please check the forum posts on whether the system automatically shuffles the data you provide when you fine-tune

molten fern BOT
#

Let me check the knowledge base for you. One moment! :books:

#

Based on the provided knowledge sources, there is no explicit mention of the system automatically shuffling the data during the fine-tuning process. However, in one of the examples provided in the OpenAI Cookbook (https://cookbook.openai.com/examples/multiclass_classification_for_transactions#building-fine-tuned-classifier), the data is manually shuffled before being saved to a .jsonl file for fine-tuning. Here's the relevant code snippet:

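import random

# ft_df is the DataFrame of fine-tuning examples from the Cookbook example.
# Give each row a random integer sort key (keys may repeat).
ft_df['ordering'] = ft_df.apply(lambda x: random.randint(0,len(ft_df)), axis = 1)
# Sort by that key, which puts the rows in a random order.
ft_df.set_index('ordering',inplace=True)
ft_df_sorted = ft_df.sort_index(ascending=True)
# Write one JSON record per line (.jsonl).
ft_df_sorted.to_json("transactions_grouped.jsonl", orient='records', lines=True)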

In this example, a random ordering is applied to the DataFrame before it is saved for fine-tuning. This suggests that you may need to shuffle your data manually before fine-tuning, but it's always best to check the official documentation or reach out to OpenAI support for confirmation.
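If your training data is already in a .jsonl file, manual shuffling could look roughly like the sketch below. It is only an illustration using the Python standard library, not something taken from the OpenAI documentation: the function name `shuffle_jsonl`, the `seed` parameter, and the file names are all hypothetical, and it assumes the file is small enough to fit in memory.

```python
import json
import random
from pathlib import Path
from typing import Optional


def shuffle_jsonl(src: str, dst: str, seed: Optional[int] = None) -> int:
    """Copy the records in src to dst in random order; return the record count.

    Hypothetical helper for illustration only; not part of any OpenAI library.
    """
    # Read every non-empty line; each one is expected to hold one JSON record.
    text = Path(src).read_text(encoding="utf-8")
    lines = [line for line in text.splitlines() if line.strip()]

    # Parse each line once so a malformed record fails here, before upload.
    for line in lines:
        json.loads(line)

    # A seeded generator makes the shuffled order reproducible.
    random.Random(seed).shuffle(lines)

    Path(dst).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

For example, `shuffle_jsonl("transactions_grouped.jsonl", "transactions_shuffled.jsonl", seed=42)` would write a shuffled copy that you could then upload for fine-tuning. Shuffling yourself means the result does not depend on whether the service also shuffles.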
