# Getting different results with o1-preview API vs Playground and ChatGPT

native zealot

Hi team,

I am building an app that uses o1-preview. I have a fairly long prompt containing instructions, which the model follows pretty well during tests in the Playground and ChatGPT. However, I am getting awful results via the API for the exact same prompt. I am not sending any parameters with the API call except the model name and user message, as provided by the code generator in the Playground; see below:

    prompt = os.getenv("PROMPT")
    response = await self.client.chat.completions.create(
        model="o1-preview-2024-09-12",
        messages=[
            {
                "role": "user",
                "content": [{"type": "text", "text": prompt}],
            }
        ],
    )

Is there any reason why the results from the API would differ from the Playground and ChatGPT?

ocean pagoda BOT

Hi @native zealot. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

Yes, there could be several reasons why you're seeing different results between the Playground, ChatGPT, and the API. Here are a few things to consider:

  1. Temperature Parameter: If the temperature parameter is set above 0, the model will likely produce different results each time; this is expected behavior. If you're seeing unexpected differences in the quality of the completions you receive from the Playground vs. the API with temperature set to 0, there are a few potential causes to consider. source (https://help.openai.com/en/articles/6643200-why-am-i-getting-different-completions-on-playground-vs-the-api)

  2. Prompt Differences: Check that your prompt is exactly the same. Even slight differences, such as an extra space or newline character, can lead to different outputs. source (https://help.openai.com/en/articles/6643200-why-am-i-getting-different-completions-on-playground-vs-the-api)

  3. Parameter Differences: Ensure you're using the same parameters in both cases. For example, setting the model parameter to gpt-3.5-turbo versus gpt-4 will produce different completions even with the same prompt, because gpt-4 is a newer and more capable instruction-following model. source (https://help.openai.com/en/articles/6643200-why-am-i-getting-different-completions-on-playground-vs-the-api)

  4. Model Limitations: The OpenAI o1 models currently do not support function calling, structured outputs, streaming, system messages, and some other features. These limitations could potentially affect the results you get from the API. source (https://help.openai.com/en/articles/9855712-openai-o1-models-faq-chatgpt-enterprise-and-edu#h_dcb37e3f91)

  5. Prompt Caching: OpenAI routes API requests to servers that recently processed the same prompt, making it cheaper and faster than processing a prompt from scratch. This can reduce latency by up to 80% and cost by 50% for long prompts. Prompt Caching works automatically on all your API requests (no code changes required) and has no additional fees associated with it. source (https://platform.openai.com/docs/guides/prompt-caching/structuring-prompts)
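
For the prompt-differences check in the list above, one way to make whitespace or escaping differences visible is to diff the two strings and print each changed line with `repr()`. The sketch below is only an illustration: the helper names `prompt_fingerprint` and `describe_prompt_difference` are hypothetical, and the idea that a prompt loaded from an environment variable might contain literal `\n` sequences instead of real newlines is an assumption, not something confirmed in this thread.

```python
import difflib
import hashlib


def prompt_fingerprint(prompt: str) -> str:
    """Return a short SHA-256 fingerprint of the exact prompt bytes."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]


def describe_prompt_difference(playground_prompt: str, api_prompt: str) -> list[str]:
    """Return the differing lines, wrapped in repr() so whitespace is visible."""
    if playground_prompt == api_prompt:
        return []
    diff = difflib.ndiff(
        playground_prompt.splitlines(keepends=True),
        api_prompt.splitlines(keepends=True),
    )
    # Keep only removed ("- ") and added ("+ ") lines; skip unchanged and hint lines.
    return [repr(line) for line in diff if line.startswith(("- ", "+ "))]


if __name__ == "__main__":
    # Assumed scenario: the Playground text has a real newline, while the value
    # read from the environment has a literal backslash followed by "n".
    playground_prompt = "Line one\nLine two"
    api_prompt = "Line one\\nLine two"

    print(prompt_fingerprint(playground_prompt))
    print(prompt_fingerprint(api_prompt))
    for line in describe_prompt_difference(playground_prompt, api_prompt):
        print(line)
```

Printing `prompt_fingerprint(prompt)` right before the API call and comparing it with the fingerprint of the text pasted into the Playground would show quickly whether the two prompts are byte-for-byte identical.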

If you've double-checked all of these things and are still seeing discrepancies, you might want to ask for help on the OpenAI Community Forum (https://community.openai.com/), where users may have experienced similar issues or may be able to assist in troubleshooting your specific case.
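
On the model-limitations point above: since system messages are listed as unsupported for the o1 models, any instructions have to travel inside the user message. A minimal sketch of that, assuming a hypothetical helper named `build_o1_messages` (it is not part of the OpenAI SDK, and it only builds the `messages` list without making a request):

```python
def build_o1_messages(instructions: str, user_input: str) -> list[dict]:
    """Fold instructions and input into one user message (no system role)."""
    combined = f"{instructions.strip()}\n\n{user_input.strip()}"
    return [{"role": "user", "content": combined}]


if __name__ == "__main__":
    messages = build_o1_messages(
        "Answer in one short paragraph.",
        "Explain why two identical prompts can still produce different outputs.",
    )
    print(messages)
```

The resulting list could then be passed as the `messages` argument of the `chat.completions.create` call shown in the question.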