JS WS API sending audio | ElevenLabs | Page 1

grand hinge May 6, 2025, 2:27 PM

#

Got it, I didn't know you were reading from a file. Let me know if chunk size changes work for you


forest sleet May 6, 2025, 2:28 PM

#

it looks like it's got the same error from the websocket

#

The AI agent you are trying to reach appears to be misconfigured

grand hinge May 6, 2025, 2:31 PM

#

It could be better to do it via AudioContext.decodeAudioData
or better yet, use the JS SDK but I think it will only work for microphone input

forest sleet May 6, 2025, 2:32 PM

#

yeh the microphone input was the pain point, im getting data over another websocket

#

ill look into the decode audio data route

grand hinge May 6, 2025, 2:38 PM

#

If you are getting it via another ws, why write to a file?

forest sleet May 6, 2025, 2:38 PM

#

im not writing to a file, FileReader reads the blob into a base64 string

#

reader.readAsDataURL(blob);

grand hinge May 6, 2025, 2:39 PM

#

const arrayBuffer = await blob.arrayBuffer();

const audioContext = new AudioContext();

const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);

maybe smth like this

#

but it appears that you would get float32 so some conversion may be needed for int16
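A minimal sketch of that float32 to int16 step, assuming the samples are in the usual [-1, 1] range (for example from `audioBuffer.getChannelData(0)`). `floatTo16BitPCM` is a hypothetical helper name, not something from an SDK:

```typescript
// Convert Float32 samples in [-1, 1] to signed 16-bit PCM.
// floatTo16BitPCM is a hypothetical helper name used for illustration.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const output = new Int16Array(input.length);
  for (let i = 0; i < input.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping.
    const s = Math.max(-1, Math.min(1, input[i]));
    // The negative range is one step larger than the positive range.
    output[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
  }
  return output;
}
```

Clamping before scaling matters because assigning a value outside the int16 range to an `Int16Array` wraps around rather than saturating.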

forest sleet May 6, 2025, 2:42 PM

#

i think i can prob do this, im resampling to 16000 sample rate anyhow so I can prob get away with the float32 conversion there.

#

ill give it a go, thank you!

#

actually in hindsight, this doesn't really help any I think

#

all it does is make it into an audio buffer i.e float32 PCM data

#

but ive already got it at Float32 at this point

grand hinge May 6, 2025, 2:51 PM

#

were you converting it to int16 before?

forest sleet May 6, 2025, 2:52 PM

#

i actually decode it straight to Int16

grand hinge May 6, 2025, 2:52 PM

#

can you share the code that includes that and the chunking code?

forest sleet May 6, 2025, 2:52 PM

#

I do have to resample it from 48000 to 16000 but i do that on the Int16Array as the precision seems fine
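Since 48000 / 16000 is exactly 3, one possible alternative to interpolation is to average each group of three input samples, which acts as a crude low-pass filter before decimation. This is only a sketch under that assumption; `downsampleBy3` is a hypothetical name and is not the method used in this thread:

```typescript
// Downsample 48 kHz Int16 PCM to 16 kHz by averaging each group of 3 samples.
// downsampleBy3 is a hypothetical helper; it assumes an exact 3:1 ratio.
function downsampleBy3(input: Int16Array): Int16Array {
  const factor = 3; // 48000 / 16000
  // Any trailing samples that do not fill a full group are dropped.
  const output = new Int16Array(Math.floor(input.length / factor));
  for (let i = 0; i < output.length; i++) {
    const j = i * factor;
    // The average of three int16 values always fits in int16.
    output[i] = Math.round((input[j] + input[j + 1] + input[j + 2]) / factor);
  }
  return output;
}
```

A 20ms Opus frame at 48 kHz is 960 samples, so this would yield 320 samples per frame at 16 kHz.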

#

the code that makes it Int16 ?

grand hinge May 6, 2025, 2:53 PM

#

conversion, resampling, chunking to 20ms

forest sleet May 6, 2025, 2:54 PM

#

so for the conversion, im getting opus data over the websocket, I am decoding that with https://github.com/ImagicTheCat/libopusjs


#

that decodes like this:

async decodeOpusData(buffer: ArrayBuffer): Promise<Int16Array | void> {
    // Lazily create a mono, 48 kHz decoder on first use.
    if (!this.decoder) {
        try {
            this.decoder = new libopus.Decoder(1, 48000);
        } catch (e) {
            console.error("Error creating OpusDecoder:", e);
        }
    }
    if (this.decoder) {
        // Feed the Opus data in and return whatever the decoder outputs.
        this.decoder.input(buffer);
        return this.decoder.output();
    } else {
        console.error("Decoder not initialized");
    }
}

#

then the output of that function goes into the input of this one:

async int16ArraysToWav(input, originalSampleRate = 48000, targetSampleRate = 16000, numChannels = 1){

            // Resample using linear interpolation on Int16
            const resampleRatio = targetSampleRate / originalSampleRate;
            const newLength = Math.floor(input.length * resampleRatio);
            const output = new Int16Array(newLength);

            for (let i = 0; i < newLength; i++) {
                const srcIndex = i / resampleRatio;
                const i0 = Math.floor(srcIndex);
                const i1 = Math.min(i0 + 1, input.length - 1);
                const w = srcIndex - i0;

                // Perform linear interpolation directly on Int16
                const sample = (1 - w) * input[i0] + w * input[i1];
                output[i] = Math.round(sample);
            }

            // WAV Header
            const subChunk2Size = output.byteLength;
            const chunkSize = 36 + subChunk2Size;
            const header = new ArrayBuffer(44);
            const view = new DataView(header);

            view.setUint32(0, 0x52494646, false); // 'RIFF'
            view.setUint32(4, chunkSize, true); // file size minus 8 bytes
            view.setUint32(8, 0x57415645, false); // 'WAVE'
            view.setUint32(12, 0x666d7420, false); // 'fmt '
            view.setUint32(16, 16, true); // fmt chunk size (16 for PCM)
            view.setUint16(20, 1, true);  // audio format: 1 = PCM
            view.setUint16(22, numChannels, true);
            view.setUint32(24, targetSampleRate, true);
            view.setUint32(28, targetSampleRate * numChannels * 2, true); // byte rate
            view.setUint16(32, numChannels * 2, true); // block align
            view.setUint16(34, 16, true); // bits per sample
            view.setUint32(36, 0x64617461, false); // 'data'
            view.setUint32(40, subChunk2Size, true); // PCM payload size in bytes

            const blob = new Blob([header, output], { type: 'audio/wav' }); 

            return new Promise((resolve, reject) => {
                const reader = new FileReader();
                reader.onloadend = () => {
                    const base64 = reader.result.split(',')[1];
                    resolve(base64);
                };
                reader.onerror = reject;
                reader.readAsDataURL(blob);
            });
        }

#

now this still includes the wav headers

grand hinge May 6, 2025, 2:59 PM

#

WAV headers present in the chunk should not end the connection, there'll be just some audio artifacting.
Why is this 16000 to 16000 conversion needed? Is the input audio Opus-encoded?
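If the 44-byte WAV header is not wanted in each chunk, one option is to base64-encode the raw Int16 samples directly instead of going through a WAV `Blob` and `FileReader`. A rough sketch, assuming a little-endian platform (which is what essentially all browsers run on) and a global `btoa`; `int16ToBase64` is a hypothetical helper name:

```typescript
// Base64-encode raw 16-bit PCM samples with no WAV header.
// int16ToBase64 is a hypothetical helper; it assumes a little-endian platform.
function int16ToBase64(samples: Int16Array): string {
  // View the same memory as bytes, honouring byteOffset for subarray views.
  const bytes = new Uint8Array(samples.buffer, samples.byteOffset, samples.byteLength);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}
```

This is synchronous, so it also avoids wrapping a `FileReader` in a Promise for every 20ms frame.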

forest sleet May 6, 2025, 3:03 PM

#

oh thats just defaults, its actually 48000 to 16000

#

            const wav = await this.int16ArraysToWav(data.voiceData, 48000, 16000);

#

the input audio is opus encoded yes

#

the decoded audio is valid - i think??

#

that's a decoded wav file

#

that's the 500ms sample after decoding and converted to a wav

grand hinge May 6, 2025, 3:27 PM

#

Looks ok to me. Have you already implemented converting this 500ms chunk into 20ms chunks?
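For reference, splitting a larger buffer into 20ms frames could look roughly like this. It is a sketch that assumes mono Int16 PCM at 16000 Hz (320 samples per frame); `splitIntoFrames` is a hypothetical name and not the code used in this thread:

```typescript
// Split mono Int16 PCM into fixed-duration frames.
// splitIntoFrames is a hypothetical helper used for illustration.
function splitIntoFrames(
  samples: Int16Array,
  sampleRate = 16000,
  frameMs = 20
): Int16Array[] {
  const frameLength = Math.floor((sampleRate * frameMs) / 1000); // 320 at 16 kHz
  const frames: Int16Array[] = [];
  for (let offset = 0; offset < samples.length; offset += frameLength) {
    // subarray creates a view without copying; the last frame may be shorter.
    const end = Math.min(offset + frameLength, samples.length);
    frames.push(samples.subarray(offset, end));
  }
  return frames;
}
```

A 500ms buffer at 16000 Hz is 8000 samples, which splits evenly into 25 frames of 320 samples.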

forest sleet May 6, 2025, 3:38 PM

#

yep

#

same issue it seems

#

maybe @faint imp might see it (sorry for the ping!)

#

particularly #📟│agents-chat message

grand hinge May 6, 2025, 4:34 PM

#

If you can share the code that you use for chunking from 500ms to 20ms maybe we can fix it, other than that I have no other ideas

forest sleet May 6, 2025, 6:04 PM

#

I appreciate the help, I was actually batching before from samples that were already 20ms, it's the opus window size anyway so it fits.

#

So I just removed the batching so it's now just passing the 20ms ones straight through

#

thats the code above

grand hinge May 6, 2025, 6:34 PM

#

forest sleet So I just removed the batching so it's now just passing the 20ms ones straight t...

and it still doesn't work?

forest sleet May 6, 2025, 6:35 PM

#

unfortunately not

grand hinge May 6, 2025, 6:36 PM

#

ok then 😦 let's see if Angelo can take a look

forest sleet May 6, 2025, 6:37 PM

#

Thanks for all your help 🙏
