How to play the audio received from ElevenLabs websocket API continuously without delays or chunks in | ElevenLabs | Page 1

robust anvilBOT Nov 3, 2023, 11:00 AM

#

@compact steeple Please provide more information related to your query. One of our moderators will help you out soon.

charlie.gaynor

How to play the audio received from the ElevenLabs websocket API continuously, without delays or gaps in between the audio chunks

Account Created

Mon, 02 October 2023, 07:13 PM UTC

compact steeple Nov 3, 2023, 11:16 AM

#

OK, so basically the ElevenLabs websocket API sends the audio in chunks. I can get it to work using the Python demo, where you play straight from the stream in the listen function here: https://docs.elevenlabs.io/api-reference/text-to-speech-websockets

But I'm struggling when:
I need to send the audio received over the websocket to a client, and then have the client play it

Whether that client is in Python or JavaScript, I can hear a little delay between consecutive audio chunks when I play the audio (bytes) I receive from ElevenLabs. Sometimes that delay falls in the middle of a word. Any advice on this? 🙂


gleaming barn Nov 5, 2023, 6:02 AM

#

I have a client in Swift and the same problem. A lot of people are facing it too; here are two more reports on this: #1166469137548189706 message

#1153057573629603901 message

ElevenLabs is just ghosting all of this; they don't seem to be interested in any support or feedback on this

compact steeple Nov 5, 2023, 8:13 PM

#

Super sad, as it's a big obstacle to using the audio in real time 😦

gleaming barn Nov 6, 2023, 6:48 PM

#

@compact steeple Good news: OpenAI now has TTS at $0.015 per 1K characters, so we can forget this websockets bs

compact steeple Nov 6, 2023, 7:24 PM

#

No voice cloning yet though 😦 @gleaming barn

#

Think of me stuck in the weeds when you're rocking it with openai xD

dense sable Nov 7, 2023, 7:38 AM

#

@compact steeple When I was implementing playback in JavaScript, I had gaps when using the MP3 response. These gaps were not there when using the MediaSource API, but since MediaSource is not available on iOS, I had to use the Web Audio API, and there I used the PCM audio response instead.
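A common way to avoid gaps with raw PCM is to schedule every chunk to start exactly where the previous one ends, rather than starting each chunk when it arrives. The sketch below shows only that timing arithmetic and is not taken from the poster's code. `GaplessScheduler` is a hypothetical name, and 16-bit mono PCM at a known sample rate is an assumption. In a browser, `now` would roughly correspond to `AudioContext.currentTime` and the returned value to the `when` argument of `AudioBufferSourceNode.start()`.

```python
from dataclasses import dataclass


@dataclass
class GaplessScheduler:
    """Assigns each PCM chunk a start time so chunks play back to back."""

    sample_rate: int            # samples per second of the requested PCM format
    bytes_per_sample: int = 2   # assumption: 16-bit mono audio
    next_start: float = 0.0     # end time of the last scheduled chunk, in seconds

    def schedule(self, chunk: bytes, now: float) -> float:
        """Return the playback-clock time at which this chunk should start."""
        duration = len(chunk) / (self.bytes_per_sample * self.sample_rate)
        # Normally start where the previous chunk ends. If playback has already
        # run past that point (an underrun), restart from the current time.
        start = max(self.next_start, now)
        self.next_start = start + duration
        return start
```

With MP3, each chunk usually has to be decoded on its own before it can be scheduled this way, which may be one reason PCM was easier to stitch together here.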

compact steeple Nov 7, 2023, 9:12 AM

#

I'll give it a whirl thank you 🙂

viscid hare Nov 7, 2023, 10:18 AM

#

gleaming barn <@456226577798135808> Good news that openai now have tts with $0.015 per 1K char...

OpenAI TTS doesn't have any API for streaming input. If you don't need that, why were you using the WebSocket API?

dense sable Nov 7, 2023, 10:42 AM

#

@gleaming barn I created a Capacitor plugin for iOS to play base64 chunks. Maybe you can use the code to see if it helps. I am using PCM_44100 output though.

Go to https://npmjs.com/package/capacitor-streaming-audio, then go to the code and click on the 'ios/Plugin' folder. There you will see the 'StreamingAudio.swift' file, which has the playback implementation. I am not an iOS developer, so the code might not be good quality, but maybe it can help you.

npm

capacitor-streaming-audio

This plugin implements PCM audio chunks playback specifically for elevenlabs websocket streaming. It makes it possible to seamlessly play Base64 PCM 24khz sample rate 16bit signed little endian audio data chunks in sequence. It uses AVQueuePlayer on IOS a. Latest version: 0.0.4, last published: 8 days ago. Start using capacitor-streaming-audio i...
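For readers on other platforms, the decoding step that the description mentions (base64 chunks of 16-bit signed little-endian PCM) can be sketched as follows. This is not the plugin's code; `PcmDecoder` is a hypothetical name, and mono audio is an assumption. The one subtle point is that a chunk boundary can fall in the middle of a 2-byte sample, so the dangling byte is carried over to the next chunk instead of being dropped.

```python
import base64
import struct
from typing import List


class PcmDecoder:
    """Turns base64 chunks of 16-bit signed little-endian PCM into float samples."""

    def __init__(self) -> None:
        self._leftover = b""  # dangling byte from a chunk that split a sample

    def feed(self, b64_chunk: str) -> List[float]:
        data = self._leftover + base64.b64decode(b64_chunk)
        usable = len(data) - (len(data) % 2)  # keep whole 2-byte samples only
        self._leftover = data[usable:]
        samples = struct.unpack(f"<{usable // 2}h", data[:usable])
        # Scale the int16 range [-32768, 32767] to floats in [-1.0, 1.0).
        return [s / 32768.0 for s in samples]
```

The float output matches the sample format that audio APIs such as Web Audio typically expect, so the result can be copied into a playback buffer directly.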

gleaming barn Nov 7, 2023, 10:44 AM

#

viscid hare OpenAI TTS doesn't have any API for streaming input. If you don't need that, why...

We initially used WebSockets to optimize the voiceover delay for real-time conversations with an LLM. Eventually, we transitioned to feeding it sentence by sentence through another endpoint, which offered similar delay and enhanced stability. Therefore, we are now migrating to OpenAI, utilizing the same principle.
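The sentence-by-sentence approach needs something that turns streamed LLM text into complete sentences before each one is sent to a TTS endpoint. A minimal sketch of that splitting step is below. `sentences_from_stream` is a hypothetical helper, the splitting rule is deliberately naive (abbreviations such as "e.g." would be split incorrectly), and the TTS request itself is left out.

```python
import re
from typing import Iterable, Iterator

# A sentence ends at '.', '!' or '?' followed by whitespace (naive assumption).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(fragments: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in streamed text."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        parts = _SENTENCE_END.split(buffer)
        # All parts except the last are complete; the last may still grow.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        buffer = parts[-1]
    # Emit whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence would then go out as one TTS request, so the first audio can start while the LLM is still generating the rest of the reply.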

dense sable Nov 7, 2023, 10:46 AM

#

@gleaming barn I had to add an initial buffer of 2-3 chunks before starting playback. It adds a small delay, but then the audio is seamless. A slow network can still sometimes cause gaps, though, which is inevitable.
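The initial-buffer idea can be sketched as a small queue that holds chunks back until enough have arrived, then passes everything through immediately. `PrebufferQueue` and `min_chunks` are hypothetical names, and the default of 3 simply follows the "2-3 chunks" figure above.

```python
from collections import deque
from typing import Deque, List


class PrebufferQueue:
    """Holds back the first few chunks, then releases chunks as they arrive."""

    def __init__(self, min_chunks: int = 3) -> None:
        self.min_chunks = min_chunks
        self._pending: Deque[bytes] = deque()
        self._started = False

    def push(self, chunk: bytes) -> List[bytes]:
        """Add a chunk; return the chunks that are ready to be played now."""
        self._pending.append(chunk)
        if not self._started and len(self._pending) < self.min_chunks:
            return []  # still filling the initial buffer
        self._started = True
        return self._drain()

    def flush(self) -> List[bytes]:
        """Call at end of stream so a very short utterance still plays."""
        self._started = True
        return self._drain()

    def _drain(self) -> List[bytes]:
        ready = list(self._pending)
        self._pending.clear()
        return ready
```

A larger `min_chunks` trades more startup delay for more tolerance to network jitter. The `flush` call matters for replies shorter than the buffer, which would otherwise never start playing.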

gleaming barn Nov 7, 2023, 10:51 AM

#

dense sable <@694924217845678180> I had to add initial buffer of 2-3 chunks to start playbac...

Thanks for sharing! We might need some elevenlabs customization features in the future so might give it a shot!
