#session.updated
Hi @rich plaza. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:
Yes, when you send a session.update event, the server will respond with a session.updated event that reflects the new, effective session configuration. This is explicitly stated in the documentation:
When the server receives a session.update, it will respond with a session.updated event showing the full, effective configuration. Only the fields that are present are updated. To clear a field like instructions, pass an empty string.
session.update (API reference)
You should wait for the session.updated event before proceeding with any processing that depends on the updated session configuration. The session.updated event serves as an acknowledgment that your changes have been applied, and it provides the current state of the session. This is also highlighted in the OpenAI Cookbook:
session.updated: After you send a session.update call. Acknowledges that the server applied new session settings. Inspect the echoed settings and update any local cache.
Context Summarization with Realtime API
In summary:
- Yes, session.updated will be delivered after session.update.
- You should wait for session.updated before relying on the new configuration in your application logic.
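A minimal sketch of the acknowledgment pattern summarized above, assuming server events arrive as already-parsed dicts with the effective configuration under a "session" key. The SessionState class and its method names are hypothetical helpers for illustration, not part of any SDK.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SessionState:
    """Local cache of the session config echoed by the server (hypothetical helper)."""

    created: bool = False
    pending_updates: int = 0
    config: Dict[str, Any] = field(default_factory=dict)

    def note_update_sent(self) -> None:
        # Call this right after sending a session.update client event.
        self.pending_updates += 1

    def handle_event(self, event: Dict[str, Any]) -> None:
        # Assumption: both events carry the effective config under "session".
        etype = event.get("type")
        if etype == "session.created":
            self.created = True
            self.config = dict(event.get("session", {}))
        elif etype == "session.updated":
            self.pending_updates = max(0, self.pending_updates - 1)
            self.config = dict(event.get("session", {}))

    @property
    def config_is_current(self) -> bool:
        # True once every session.update we sent has been acknowledged.
        return self.created and self.pending_updates == 0
```

Application logic that depends on new settings can check config_is_current before proceeding, and read the echoed values from config rather than from what it originally sent.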
Note: you can ask me a follow up question by @ mentioning me again :speech_balloon:
@rich plaza Kapa.ai is still learning and improving, please let me know how I did by reacting below.
When is the session.created event sent? @limpid falcon
Let me check the knowledge base for you again. One moment! :books:
The session.created event is sent immediately after a new connection is established, as the first server event. It confirms that the session is open and provides the session ID and default session configuration. The server emits this event automatically to indicate the session is ready for use (sources: Realtime conversations: Session lifecycle events; session.created (API reference); OpenAI Cookbook: Handle Incoming Events).
@limpid falcon Which one is sent first, session.created or session.updated?
The session.created event is always sent first, immediately after a new connection is established. The session.updated event is only sent after you send a session.update request to change the session configuration. Therefore, session.created is sent before session.updated (sources: Realtime conversations: Session lifecycle events; OpenAI Cookbook: Handle Incoming Events).
@limpid falcon When is it okay to send input_audio_buffer.append?
You can send input_audio_buffer.append after the session has been established and you have received the session.created event from the server. This event confirms that the session is open and ready to receive audio data. Sending input_audio_buffer.append before the session is created is not supported, as the server is not yet ready to process audio input at that stage (sources: Context Summarization with Realtime API; Realtime conversations: Session lifecycle events).
In summary:
- Wait for session.created before sending input_audio_buffer.append.
- After session.created, you can begin streaming audio chunks to the server using input_audio_buffer.append.
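One way to follow this recommendation is to hold captured audio locally until session.created has been seen, then flush. The sketch below assumes events are parsed dicts and that outgoing messages go through a send callable taking one JSON string; AudioGate is a hypothetical helper, and nothing here establishes how the server treats appends that arrive early.

```python
import base64
import json
from typing import Callable, List


class AudioGate:
    """Holds audio chunks until session.created has been seen (hypothetical helper)."""

    def __init__(self, send: Callable[[str], None]) -> None:
        self._send = send  # e.g. a WebSocket send function
        self._ready = False
        self._held: List[bytes] = []

    def on_server_event(self, event: dict) -> None:
        if event.get("type") == "session.created":
            self._ready = True
            # Flush anything captured before the session was ready.
            for chunk in self._held:
                self._send_append(chunk)
            self._held.clear()

    def push(self, chunk: bytes) -> None:
        if self._ready:
            self._send_append(chunk)
        else:
            self._held.append(chunk)

    def _send_append(self, chunk: bytes) -> None:
        # Audio bytes travel base64-encoded in the "audio" field.
        payload = {
            "type": "input_audio_buffer.append",
            "audio": base64.b64encode(chunk).decode("ascii"),
        }
        self._send(json.dumps(payload))
```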
@limpid falcon Do I need to wait for session.created?
Yes, you need to wait for the session.created event before sending audio data or other client events that depend on an active session. The session.created event is sent immediately after the WebSocket connection is established and confirms that the session is open and ready for further interaction. This is explicitly described in the documentation and examples, which show waiting for session.created before proceeding with session configuration or streaming audio (sources: Context Summarization with Realtime API; Realtime conversations: Session lifecycle events).
What happens if I send session.update before session.created arrives? @limpid falcon
The knowledge sources do not explicitly state what happens if you send a session.update event before session.created arrives. The documentation and examples consistently show waiting for session.created before sending further events like session.update or streaming audio, but they do not describe the server's behavior if session.update is sent prematurely (sources: Context Summarization with Realtime API; Realtime conversations: Session lifecycle events).
Therefore, I cannot provide a definitive answer based on the available knowledge sources.
Do I have to wait for session.created before I can send session.update? @limpid falcon
Based on the available knowledge sources, it is not explicitly stated whether you must wait for session.created before sending session.update. The documentation and examples consistently show waiting for session.created before sending session.update or other events, but they do not clearly specify if this is a strict requirement or what would happen if you send session.update earlier (sources: Context Summarization with Realtime API; Realtime conversations: Session lifecycle events).
Therefore, I cannot confirm from the knowledge sources whether it is mandatory to wait for session.created before sending session.update.
Is it explicitly stated whether we need to wait for session.updated before sending any other events? @limpid falcon
No, it is not explicitly stated in the knowledge sources that you must wait for session.updated before sending any other events. The documentation and examples recommend waiting for session.updated to ensure your session configuration changes have been applied, but they do not specify this as a strict requirement or describe what would happen if you send other events before receiving session.updated (sources: Context Summarization with Realtime API; session.update (API reference)).
Has there been any report of errors occurring? @limpid falcon
Yes, there have been reports of errors occurring with the Realtime API. For example:
- Users have reported receiving errors such as "Invalid type for 'session.audio.input.format': expected an object, but got a string instead." when migrating to the latest updates for the session.update event. This was due to a mismatch between the expected data structure and the documentation, and was resolved by sending an object with both type and rate fields for the format property (Forum Post: Realtime API Beta -> Realtime API GA - Receiving type error with session.audio.input.format).
- There have also been reports of audio issues, such as distorted or missing audio (noise/static) with gpt-realtime and gpt-realtime-mini models in SIP Realtime calls. In these cases, the assistant generated responses, but the RTP audio stream was just noise or silence, despite the configuration being previously functional (Forum Post: [BUG?] SIP Realtime – distorted or missing audio).
- Another report described no inbound RTP from OpenAI media while TTS/transcript streamed normally, which was later confirmed as a backend issue and fixed by OpenAI support (Forum Post: Realtime SIP: No inbound RTP from OpenAI media while WS TTS runs).
These examples show that errors and issues have been encountered and discussed in the community.
What automatically comes back when I send input_audio_buffer.append? @limpid falcon
When you send an input_audio_buffer.append event, the server does not send a confirmation response to this event. This is explicitly stated in the documentation:
Unlike most other client events, the server will not send a confirmation response to this event.
input_audio_buffer.append (API reference)
The only time you will receive a related server event is when speech is detected and committed (if VAD is enabled), at which point you may receive events like input_audio_buffer.committed, but not as a direct response to each append.
What happens when I send input_audio_buffer.commit? @limpid falcon
When you send an input_audio_buffer.commit event, the server will create a new user message item in the conversation from the current audio buffer. This will trigger input audio transcription (if enabled in the session configuration), but it will not automatically create a response from the model. The server will respond with an input_audio_buffer.committed event, which includes the ID of the new user message item. If the input audio buffer is empty, the server will return an error (sources: input_audio_buffer.commit (API reference); input_audio_buffer.committed (API reference)).
When you call input_audio_buffer.commit, is all audio up to that point cleared? @limpid falcon
Yes, when you call input_audio_buffer.commit, all audio in the buffer up to that point is used to create a new user message item in the conversation, and the buffer is cleared. This is stated in the documentation:
A "commit" will create a new user message item in the conversation history from the buffer content and clear the buffer.
input_audio_buffer.append (API reference)
So, after committing, the buffer is empty and ready for new audio data.
What error do we get if there is no audio? @limpid falcon
If you send an input_audio_buffer.commit event when the input audio buffer is empty, the server will produce an error. However, the specific error message or code is not provided in the available knowledge sources (source: input_audio_buffer.commit (API reference)).
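Since committing an empty buffer produces an error, a client can avoid the round trip by tracking locally whether anything has been appended since the last commit. This is only a sketch: CommitGuard is a hypothetical helper, and it assumes manual commits (turn detection disabled), because with VAD enabled the server commits on its own and a local counter would drift.

```python
import json
from typing import Optional


class CommitGuard:
    """Counts bytes appended since the last commit (hypothetical client-side helper)."""

    def __init__(self) -> None:
        self.pending_bytes = 0

    def note_append(self, chunk: bytes) -> None:
        # Call once for every chunk sent via input_audio_buffer.append.
        self.pending_bytes += len(chunk)

    def build_commit(self) -> Optional[str]:
        # Return a commit event only if something was appended; otherwise None.
        if self.pending_bytes == 0:
            return None
        self.pending_bytes = 0  # a commit clears the server-side buffer
        return json.dumps({"type": "input_audio_buffer.commit"})
```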
If there is audio but it's only noise, will it cause an error? @limpid falcon
If there is audio in the buffer but it is only noise (not silence), the knowledge sources do not state that this will cause an error. Instead, the model may attempt to transcribe or interpret the noisy audio. In cases of unintelligible or unclear audio, the model might respond with a clarification request or similar behavior, but this is not treated as an error by the API itself (source: cookbook.openai.com: No Audio or Unclear Audio).
Errors are only explicitly mentioned when the audio buffer is empty at the time of commit, not when the buffer contains noise or unintelligible audio (source: input_audio_buffer.commit (API reference)).
In summary:
- Submitting only noise will not cause an API error, but may result in the model asking for clarification or producing an unclear transcription.
- An error is only produced if the buffer is empty when committing.
At minimum, how much audio data do I need? @limpid falcon
The knowledge sources do not explicitly state the minimum amount of audio data required. However, they do mention that sending input_audio_buffer.commit with an empty buffer will result in an error, which implies that at least some non-empty audio data is required (source: input_audio_buffer.commit (API reference)). There is no specific minimum duration or size documented, only that the buffer must not be empty.
Wasn't there something written somewhere saying we need around 100 ms? @limpid falcon
I'm sorry, but the knowledge sources do not explicitly state that you need around 100 ms of audio as a minimum. The documentation only specifies that the input audio buffer must not be empty when committing, but does not mention a required minimum duration such as 100 ms (source: input_audio_buffer.commit (API reference)).
What about the default settings for VAD? @limpid falcon
The default settings for Voice Activity Detection (VAD) in the Realtime API are as follows:
- VAD is enabled by default in Realtime sessions.
- The default mode is server_vad, which uses periods of silence to detect when the user has stopped speaking.
- Default parameters for server_vad are:
  - threshold: 0.5 (activation threshold; higher values require louder audio to activate)
  - prefix_padding_ms: 300 ms (amount of audio included before detected speech)
  - silence_duration_ms: 500 ms (duration of silence to detect speech stop)
You can adjust these parameters as needed, but these are the defaults if you do not specify otherwise (source: Voice activity detection (VAD)).
If you want to disable VAD, you can set the turn_detection property to null (source: Realtime transcription: Voice activity detection).
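As an illustration, the defaults listed above can be written out explicitly in a session.update, or VAD can be disabled by sending null. This sketch assumes turn_detection sits directly under session, as in the payload discussed later in this thread; other API versions may nest it differently, so check the reference for the version in use. The function name is hypothetical.

```python
import json


def build_turn_detection_update(enabled: bool = True) -> str:
    """Build a session.update that sets or disables VAD (illustrative only)."""
    if enabled:
        turn_detection = {
            "type": "server_vad",
            "threshold": 0.5,            # activation threshold
            "prefix_padding_ms": 300,    # audio kept before detected speech
            "silence_duration_ms": 500,  # silence that marks the end of speech
        }
    else:
        turn_detection = None  # json.dumps writes this as null
    return json.dumps(
        {"type": "session.update", "session": {"turn_detection": turn_detection}}
    )
```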
When I send input_audio_buffer.commit, do events arrive afterward? @limpid falcon
Yes, when you send an input_audio_buffer.commit event, the server will respond with an input_audio_buffer.committed event, which includes the ID of the new user message item that will be created. After this, you may also receive additional events related to the processing of the committed audio, such as transcription events (conversation.item.input_audio_transcription.delta and conversation.item.input_audio_transcription.completed) if transcription is enabled in your session configuration (sources: input_audio_buffer.commit (API reference); input_audio_buffer.committed (API reference); Handling transcriptions).
In summary:
- You will receive input_audio_buffer.committed after committing.
- If transcription is enabled, you will also receive transcription events for the audio you committed.
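A rough sketch of a handler for the events that follow a commit. The event type strings come from the answer above; the field names item_id, delta, and transcript are assumptions about the payload shape and should be checked against the API reference. TranscriptCollector is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List


class TranscriptCollector:
    """Gathers commit acknowledgments and transcription text per item (hypothetical)."""

    def __init__(self) -> None:
        self.committed_items: List[str] = []
        self.partial: Dict[str, str] = defaultdict(str)
        self.final: Dict[str, str] = {}

    def handle_event(self, event: dict) -> None:
        etype = event.get("type")
        item_id = event.get("item_id", "")  # assumed field name
        if etype == "input_audio_buffer.committed":
            self.committed_items.append(item_id)
        elif etype == "conversation.item.input_audio_transcription.delta":
            # Accumulate incremental text for this item.
            self.partial[item_id] += event.get("delta", "")
        elif etype == "conversation.item.input_audio_transcription.completed":
            # Store the final transcript and drop the partial text.
            self.final[item_id] = event.get("transcript", "")
            self.partial.pop(item_id, None)
```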
If I write "input_audio_transcription": {"model": "whisper-1"}, does that make enabled=true automatically? @limpid falcon
The knowledge sources do not explicitly mention an enabled or true flag for input_audio_transcription. However, according to the API documentation, including the transcription configuration (such as "input_audio_transcription": {"model": "whisper-1"}) in your session object enables transcription for input audio. There is no separate enabled flag; the presence of the transcription object with a model specified is what activates transcription (sources: input_audio_buffer.commit (API reference); Realtime transcription guide).
So, yes: specifying "input_audio_transcription": {"model": "whisper-1"} enables transcription automatically.
Is this the correct format? { "type": "session.update", "session": { "modalities": ["text"], "input_audio_transcription": {"model": "whisper-1"}, "turn_detection": None, "instructions": ( "something." ), }, } @limpid falcon
Your provided format is close, but there are a few issues based on the official documentation:
- Field Names and Structure:
  - The correct field for specifying modalities is typically modalities or output_modalities, depending on the API version and context.
  - The turn_detection field should be set to null (JSON) rather than None (Python).
  - The instructions field should be a string, not a tuple or parenthesized value.
  - All keys and string values should use double quotes ("), and the JSON should not contain trailing commas.
- Example of Correct JSON Format:
Here is a corrected version based on the knowledge sources:
{
  "type": "session.update",
  "session": {
    "modalities": ["text"],
    "input_audio_transcription": {
      "model": "whisper-1"
    },
    "turn_detection": null,
    "instructions": "something."
  }
}
Key points:
- Use null for fields you want to clear or disable.
- Ensure all JSON syntax is correct (no Python None, no parentheses for strings, no trailing commas).
- The field name modalities is used in some contexts, but in others, you may need output_modalities depending on the endpoint and API version. Double-check which is appropriate for your use case (session.update API reference).
If you follow this structure, your request will be in the correct format according to the documentation.
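If the payload is built as a Python dict rather than typed as raw JSON, the standard json module handles the conversion: None is written as null, trailing commas are legal in the dict literal, and a parenthesized string literal with no comma after it is still a plain str, not a tuple. A small check, using the dict from the question as-is:

```python
import json

# The dict from the question, unchanged. This is valid Python.
event = {
    "type": "session.update",
    "session": {
        "modalities": ["text"],
        "input_audio_transcription": {"model": "whisper-1"},
        "turn_detection": None,
        "instructions": (
            "something."
        ),
    },
}

# Serialize before sending; this is the text that goes over the wire.
wire_text = json.dumps(event)
print(wire_text)
```

Whether modalities or output_modalities is the right field name still depends on the API version, which this snippet does not address.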
what's rate_limits.updated? @limpid falcon
The rate_limits.updated event is a server event emitted at the beginning of a response to indicate the updated rate limits for your session. When a response is created, some tokens are "reserved" for the output, and the rate limits shown in this event reflect that reservation. The event includes details such as the type of rate limit (e.g., requests, tokens), the maximum allowed value, the remaining value before the limit is reached, and the time in seconds until the rate limit resets.
Example payload:
{
  "event_id": "event_5758",
  "type": "rate_limits.updated",
  "rate_limits": [
    {
      "name": "requests",
      "limit": 1000,
      "remaining": 999,
      "reset_seconds": 60
    },
    {
      "name": "tokens",
      "limit": 50000,
      "remaining": 49950,
      "reset_seconds": 60
    }
  ]
}
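A short sketch of reading this payload on the client, assuming the event has already been parsed into a dict with the shape shown above. The function names and the 10% threshold are illustrative choices, not anything the API prescribes.

```python
from typing import Dict


def remaining_by_name(event: dict) -> Dict[str, int]:
    """Map each rate limit name to its remaining value."""
    if event.get("type") != "rate_limits.updated":
        raise ValueError("not a rate_limits.updated event")
    return {rl["name"]: rl["remaining"] for rl in event.get("rate_limits", [])}


def is_near_limit(event: dict, name: str, min_fraction: float = 0.1) -> bool:
    """True if the named limit has less than min_fraction of its budget left."""
    for rl in event.get("rate_limits", []):
        if rl["name"] == name:
            return rl["remaining"] < rl["limit"] * min_fraction
    return False  # unknown limit name: nothing to report
```

With the example payload, remaining_by_name returns the requests and tokens values keyed by name, and is_near_limit reports False for both because each still has well over 10% remaining.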