#Using the realtime API with audio input and text output hangs

1 messages · Page 1 of 1 (latest)

bitter tusk
#
  • If I add a conversation item with base64ed audio, and output modality "text", I never get a complete response, it just hangs.
  • Text-only input works fine.
  • Adding "audio" to the output modalities also works fine, but has much higher cost (especially since I just want text output).

Is this combination supported? If not, should the API produce an error instead?