# How to set up transcription on Realtime API with SIP

ripe pine
I’ve been testing the new Realtime API with SIP integration over Twilio, and the realtime conversation part works just fine. However, I haven’t figured out how to get the transcription of the audio input. I can get the full transcription of the model’s response, but I’m unable to retrieve the transcription of what the user says.

This is the only event related to the transcription that I receive:

{"id":"item_C9fNHe56u8NI1EJTiv4Q9","type":"message","status":"completed","role":"user","content":[{"type":"input_audio","transcript":null}]}}

I’ve tried sending the session.update event to set up the transcription model, but this doesn’t seem to work.

system_update = {
    "type": "session.update",
    "session": {
        "input_audio_transcription": {
            "model": "gpt-4o-transcribe",
            "language": "es",
            "prompt": "",
        },
    },
}
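
For reference, here is a minimal sketch of how the update above could be built and how incoming events could be filtered for the user's transcript. Nothing here is confirmed in this thread: the helper names are hypothetical, the nested `session.audio.input.transcription` layout is an assumed alternative to the flat one I tried, and the event type `conversation.item.input_audio_transcription.completed` is assumed to be where the transcript arrives (rather than in the item itself). Check all of these against the API reference for the version in use.

```python
import json

# ASSUMPTION: the user's transcript arrives in a separate event of this type,
# not inside the conversation item itself.
TRANSCRIPT_EVENT = "conversation.item.input_audio_transcription.completed"


def build_transcription_update(model="gpt-4o-transcribe", language="es", nested=False):
    """Build a session.update payload that requests input audio transcription.

    nested=False reproduces the flat layout from the post above.
    nested=True uses an ASSUMED alternative layout
    (session.audio.input.transcription); verify it before relying on it.
    """
    transcription = {"model": model, "language": language}
    if nested:
        session = {
            "type": "realtime",
            "audio": {"input": {"transcription": transcription}},
        }
    else:
        session = {"input_audio_transcription": transcription}
    return {"type": "session.update", "session": session}


def extract_user_transcript(raw_message):
    """Return the transcript if raw_message is a completed input-audio
    transcription event, otherwise None."""
    event = json.loads(raw_message)
    if event.get("type") == TRANSCRIPT_EVENT:
        return event.get("transcript")
    return None
```

The payload would be sent as `json.dumps(build_transcription_update(nested=True))` over whatever connection is already used for `session.update`, and every incoming message would be passed through `extract_user_transcript` to see whether a user transcript shows up.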

twin marlin