#Prediction

1 messages · Page 1 of 1 (latest)

vagrant wharf
#

How’s the probability distribution over the next sequence of tokens for an input sent through this interface? As opposed to a web application instance.

stray gladeBOT
#

Hi @vagrant wharf. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

The probability distribution over the next sequence of tokens for an input sent through the OpenAI API (such as the Chat Completions API) is determined by the model’s internal softmax over its vocabulary, conditioned on the input context. When you enable the logprobs parameter, the API returns the log probabilities (logprobs) for each output token, and optionally for the top N most likely tokens at each position (using top_logprobs). This allows you to inspect the model’s confidence and the distribution it considered for each token in the sequence.

  • The logprobs are the natural logarithm of the probability assigned to each token, so higher logprobs (closer to 0) indicate higher confidence. You can exponentiate these values to get the linear probability (e.g., logprob 0.0 = 100% probability, logprob -0.69 ≈ 50% probability) Using logprobs.
  • The API returns these distributions per token, and you can see the top alternatives and their probabilities at each step Using logprobs.
  • The sampling process (when not using temperature=0) is based on these probabilities, typically using softmax+temperature, so the distribution you see in logprobs reflects the model’s internal next-token probabilities Forum Post: How do logprobs work for chat completion API (for GPT-4.1).

There is no explicit difference in the underlying probability distribution between using the API and a web application instance (like ChatGPT in the browser), as both use the same model backend and sampling logic. The main difference is that the API allows you to directly access the logprobs and top_logprobs for each token, which is not typically exposed in the web UI.

#

In summary: the API exposes the model’s probability distribution over the next tokens via logprobs, and this distribution is the same as what the model uses internally for sampling, regardless of whether you access it via API or web application. The API simply gives you more direct access to these details.

Note: you can ask me a follow up question by @ mentioning me again :speech_balloon: