#Getting no response event for input_audio_transcription in realtime ws

keen tide
#

Even though I have set

    "input_audio_transcription": {
      "model": "whisper-1"
    },

in session.update, I am not getting any transcription events back from OpenAI, such as conversation.item.input_audio_transcription.completed or conversation.item.input_audio_transcription.failed.
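
For reference, here is a minimal sketch in Python of the kind of session.update message described above. The helper name build_session_update is hypothetical and only for illustration; the "type": "session.update" envelope and the input_audio_transcription field are the parts taken from this thread. How the string is sent over the WebSocket is left out.

```python
import json


def build_session_update(transcription_model: str = "whisper-1") -> str:
    """Return a session.update client event as a JSON string.

    build_session_update is a hypothetical helper, not part of any SDK.
    """
    event = {
        "type": "session.update",
        "session": {
            # Asks the server to transcribe the user's input audio.
            "input_audio_transcription": {"model": transcription_model},
        },
    }
    return json.dumps(event)


message = build_session_update()
print(message)
```

The resulting string would be sent as a single text frame on the already open WebSocket connection.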

mortal dune BOT
#

Hi @keen tide. I'm kapa.ai, your friendly support bot. I will try to answer your question by searching through the knowledge base. :wave:

#

Based on the information provided in the knowledge sources, it's not clear why you're not receiving the conversation.item.input_audio_transcription.completed and conversation.item.input_audio_transcription.failed events. The sources list these events as part of the Realtime API, but they don't provide specific troubleshooting steps for your situation.

However, here are a few general suggestions:

  1. Ensure that your WebSocket connection is stable and open during the entire session.
  2. Check if other events like session.created, session.updated, conversation.created are being received to confirm that the issue is specifically with the transcription events.
  3. Make sure that the audio input being provided is valid and in the correct format.

If you're still facing issues, I would recommend reaching out to OpenAI support or checking the OpenAI documentation (https://platform.openai.com/docs/) for more detailed information on how to use the Realtime API and troubleshoot potential issues.


keen tide
#

The session is updated:

Received event: {
  "type": "session.updated",
  "event_id": "event_AJi4r6qZc0I4cfDB8RhJy",
  "session": {
    "id": "sess_AJi4q1sDoHLOxfALarHqM",
    "object": "realtime.session",
    "model": "gpt-4o-realtime-preview-2024-10-01",
    "expires_at": 1729261928,
    "modalities": [
      "audio",
      "text"
    ],
    "instructions": "You are a helpful, witty, and friendly AI. Act like a human, but remember that you aren't a human and that you can't do human things in the real world. Your voice and personality should be warm and engaging, with a lively and playful tone. Talk quickly.",
    "voice": "alloy",
    "turn_detection": {
      "type": "server_vad",
      "threshold": 0.5,
      "prefix_padding_ms": 300,
      "silence_duration_ms": 800
    },
    "input_audio_format": "pcm16",
    "output_audio_format": "pcm16",
    "input_audio_transcription": {
      "model": "whisper-1"
    },
    "tool_choice": "auto",
    "temperature": 0.8,
    "max_response_output_tokens": "inf",
    "tools": []
  }
}
#

Yet I don't get any input transcription.
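
The session.updated event above does echo input_audio_transcription with "model": "whisper-1", so the setting appears to have been accepted by the server. A small check like the following could confirm that programmatically before looking elsewhere. This is only a sketch: transcription_model_from is a hypothetical helper name, and the JSON is a trimmed copy of the event posted above.

```python
import json


def transcription_model_from(session_updated_raw):
    """Return the transcription model named in a session.updated event.

    Returns None when input_audio_transcription is absent or null.
    """
    event = json.loads(session_updated_raw)
    if event.get("type") != "session.updated":
        raise ValueError("expected a session.updated event")
    config = event.get("session", {}).get("input_audio_transcription")
    if not config:
        return None
    return config.get("model")


# Trimmed copy of the event shown above.
raw = """{
  "type": "session.updated",
  "session": {
    "input_audio_format": "pcm16",
    "input_audio_transcription": {"model": "whisper-1"}
  }
}"""
print(transcription_model_from(raw))
```

A non-None result means the configuration was applied, which narrows the problem to what happens after audio is sent.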

am also getting