#pcm


cedar scroll
#

I am trying to use the new output_format parameter, but the output still seems to be an mp3. Can you guys please provide a code sample?

modern jay
#

Same for me; I provide the string "pcm_16000" but get mp3 data out. It's a big deal because I have to further convert to mulaw, and streaming mp3 conversion is just a pain to deal with.
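For reference, once real PCM does come back, the mulaw step itself is small. A minimal sketch of standard G.711 mu-law companding, assuming the stream is 16-bit signed little-endian mono PCM (which is what "pcm_16000" suggests, but nobody here has confirmed it). The function names are made up for illustration:

```python
import struct

BIAS = 0x84   # standard G.711 mu-law bias (132)
CLIP = 32635  # largest magnitude that still fits after adding the bias


def linear_to_mulaw_byte(sample: int) -> int:
    """Encode one signed 16-bit sample as one mu-law byte."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, from 7 down to 0.
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # mu-law bytes are stored bit-inverted.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def pcm16_to_mulaw(pcm: bytes) -> bytes:
    """Convert a chunk of 16-bit little-endian PCM to mu-law bytes.

    A trailing odd byte is ignored here; when streaming, the caller
    should carry it over and prepend it to the next chunk.
    """
    usable = len(pcm) - (len(pcm) % 2)
    return bytes(
        linear_to_mulaw_byte(s) for (s,) in struct.iter_unpack("<h", pcm[:usable])
    )
```

This leaves the sample rate alone, so if the target expects 8 kHz mulaw, the 16 kHz stream would need resampling as a separate step.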

modern jay
#

@cedar scroll Heard in #🤖│api-chat that you have to be at the Independent Publisher tier to get pcm data. Don't have verification on that yet

modern jay
#

So, wss://api.elevenlabs.io/v1/text-to-speech/{voice}/stream-input?model_id=elevenlabs_multilingual_v2&optimize_streaming_latency=3&output_format=pcm_16000
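A small sketch of building that URL from its parts, so the model or latency setting can be swapped without hand-editing the string. The query parameters are the ones from the URL above; the helper name `build_stream_url` and the placeholder voice ID are made up for illustration:

```python
from urllib.parse import quote, urlencode


def build_stream_url(
    voice_id: str,
    model_id: str = "elevenlabs_multilingual_v2",
    optimize_streaming_latency: int = 3,
    output_format: str = "pcm_16000",
) -> str:
    """Assemble the stream-input WebSocket URL with its query string."""
    query = urlencode(
        {
            "model_id": model_id,
            "optimize_streaming_latency": optimize_streaming_latency,
            "output_format": output_format,
        }
    )
    return (
        "wss://api.elevenlabs.io/v1/text-to-speech/"
        f"{quote(voice_id, safe='')}/stream-input?{query}"
    )


# "VOICE_ID" is a placeholder, not a real voice.
print(build_stream_url("VOICE_ID"))
```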

cosmic tusk
#

That looks fine to me. Have you tried different models? It definitely works for me through WebSocket with elevenlabs_monolingual_v1 and with optimize_streaming_latency set to 4.

modern jay
#

I can certainly try that. Right now the first 4 bytes I get are '5a000000', which indeed does not look like an MP3 header, nor do I see a leading 255 (0xFF, the first MP3 header byte) anywhere. But the audio player (using pydub) only plays it correctly if I send it in with format "mp3"; "wav" and "pcm_s16le" both just give noise. It is supposed to be just linear PCM, right?
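A rough way to sanity-check this with only the standard library: test whether the first bytes look like MP3 (an "ID3" tag or the 11-bit frame sync), and if not, wrap the data in a WAV header so any player can open it. Raw PCM carries no header, so the sample format has to be stated explicitly. The values below (mono, 16-bit, 16 kHz) are assumptions based on "pcm_16000", and the function names are made up for illustration:

```python
import io
import wave


def looks_like_mp3(data: bytes) -> bool:
    """Heuristic: True if data starts with an ID3 tag or an MP3 frame sync."""
    if data[:3] == b"ID3":
        return True
    # Frame sync is 11 set bits: 0xFF followed by a byte with its top 3 bits set.
    return len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit PCM in a WAV container (assumed format, see above)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


first_bytes = bytes.fromhex("5a000000")
print(looks_like_mp3(first_bytes))
```

The check only looks at the very start of the data, so a chunk that begins in the middle of an MP3 frame would slip past it. If the wrapped WAV plays as noise, the data is probably not in the assumed format.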

#

It'll be a couple hours, but I'll try a few different things