Eleven-Labs-Benchmarking/elevenlabs_webs... | ElevenLabs | Page 1

wintry grotto Sep 17, 2023, 7:59 PM

#

We can do it in a thread here so as not to spam the main forum.

prisma echo Sep 17, 2023, 8:00 PM

#

What exactly does this test test?

wintry grotto Sep 17, 2023, 8:00 PM

#

This is the test that you saw me running. It measures the amount of time for each part of the process. It shows you the chunks in real time being printed on your console, measures the latency between chunks, and the total amount of time the WebSocket is being run.

#

It strips away everything except the raw response from the server, so nothing can contaminate the test results other than your network path to the server.
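
A minimal sketch of the kind of timing described here, not the actual benchmark script (which is not shown in this thread). The `time_chunks` helper and the `fake_stream` stand-in are hypothetical names; a real run would feed in chunks read from the WebSocket instead.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple


def time_chunks(
    chunks: Iterable[bytes],
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[Tuple[int, float, float]]:
    """Yield (chunk_size, gap_since_previous, elapsed_since_start) per chunk."""
    start = clock()
    previous = start
    for chunk in chunks:
        now = clock()
        # For the first chunk, the gap is the time to first chunk.
        yield len(chunk), now - previous, now - start
        previous = now


def fake_stream() -> Iterator[bytes]:
    # Hypothetical stand-in for audio chunks arriving from the server.
    for size in (1024, 2048, 512):
        time.sleep(0.01)
        yield b"\x00" * size


if __name__ == "__main__":
    for size, gap, elapsed in time_chunks(fake_stream()):
        print(f"{size} bytes, +{gap * 1000:.1f} ms, total {elapsed * 1000:.1f} ms")
```

The clock is injectable so the measurement logic can be checked without real waiting.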

prisma echo Sep 17, 2023, 8:01 PM

#

Does it benchmark streaming input, output or both?

wintry grotto Sep 17, 2023, 8:02 PM

#

You can have it do either. I know it's a little confusing because I didn't really share this with anyone, but you can use the text chunker, which simulates an LLM input stream, or you can use the part where it says text chunk and send the text essentially all at once by leaving the delay time at its default of 0.0001.
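
A rough sketch of what a text chunker with a delay could look like. The script's real chunker is not shown in this thread, so `text_chunker` and `collect` are hypothetical; only the 0.0001 default delay comes from the message above. A larger delay imitates an LLM producing text slowly, while the tiny default effectively sends everything at once.

```python
import asyncio
from typing import AsyncIterator, List


async def text_chunker(text: str, delay: float = 0.0001) -> AsyncIterator[str]:
    """Yield the text word by word, pausing `delay` seconds between words."""
    for word in text.split():
        yield word + " "
        await asyncio.sleep(delay)


async def collect(text: str, delay: float = 0.0001) -> List[str]:
    """Gather all pieces, as a consumer such as a WebSocket sender would see them."""
    return [piece async for piece in text_chunker(text, delay)]


if __name__ == "__main__":
    print(asyncio.run(collect("Hello from the benchmark script")))
```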

#

If you have any trouble, I can get on a voice call and explain it to you, but that's basically what you need to know.

#

But I think this is a worthwhile test for you to run, because that way you can rule out your specific network or location and see whether you're getting these PCM chunks back as fast as advertised.

prisma echo Sep 17, 2023, 8:03 PM

#

So this uses a VPN?

wintry grotto Sep 17, 2023, 8:06 PM

#

No, this just uses your straight connection on your computer.

prisma echo Sep 17, 2023, 9:33 PM

#

I was confused earlier: I do make use of the input stream, as in I send the initial text chunk. Apologies for the confusion.

#

And looking at my input, I see that the docs for the BOS message may have changed since I implemented it, so I'm going to try to update it.

#

You wait for a response from the 1st send then send eos?

#

I did make amendments such that the BOS message now has the Try Trigger Generation and GenerationConfig

#

Seems like you

Send BOS Message
Wait For Response
Get a response of IDK?
Then Send EOS Message

Then get chunks?

#

No wait you do
Send BOS
Send Input
Wait For a Response?
Send EOS

Wait For Chunks Responses?

prisma echo Sep 17, 2023, 10:53 PM

#

I saw some room for improvements/modifications, but they have not helped change latency yet

#

Calling EOS after all chunks are received or calling EOS after the 1st chunk: same latency issue

#

I also only create the websocket when I need to send text input then close it when I get all the chunks

#

So I'm trying to match the Python code exactly

wintry grotto Sep 18, 2023, 1:49 AM

#

prisma echo No wait you do Send BOS Send Input Wait For a Response? Send EOS Wait For Chun...

EOS always comes when you're totally done and ready to close the connection.

prisma echo Sep 18, 2023, 1:53 AM

#

wintry grotto EOS always comes when you're totally done and ready to close the connection.

What would happen if you just call WebSocket close without calling EOS?

Anyway, I did try that order and technically have improved my code, but it still performs exactly the same

wintry grotto Sep 18, 2023, 1:53 AM

#

It would just stay open and you might not get all your chunks back. Part of how it knows to finish the generation is that you've told it you've sent all the text, so it's free to generate whatever remains in the buffer.

prisma echo Sep 18, 2023, 1:54 AM

#

Wait a minute. "Totally done" to me says I got all my chunks and I'm done, but what you just said makes it sound like I'm done once I've sent the input text. Do you know which it is?

wintry grotto Sep 18, 2023, 1:58 AM

#

Well, that's exactly what I'm doing: in my simplified example, where I'm not actually streaming the text in, I send the text and then I send the EOS. There's really no reason not to send the EOS as soon as you've sent the input text.
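
The ordering discussed above (BOS, then the input text, then EOS straight away) can be sketched as a list of JSON messages. The message shapes are assumptions, not confirmed anywhere in this thread: a single space as the BOS text, an empty string as the EOS text, and all settings fields (API key, voice settings, generation config) left out. `build_messages` is a hypothetical helper; check the current ElevenLabs docs for the real fields.

```python
import json
from typing import Iterable, List


def build_messages(text_pieces: Iterable[str]) -> List[str]:
    """Return JSON messages in send order: BOS, text piece(s), EOS."""
    # Assumed BOS shape: a single space opens the stream (settings omitted).
    messages = [json.dumps({"text": " "})]
    for piece in text_pieces:
        # Assumed shape of a text message carrying input to synthesize.
        messages.append(json.dumps({"text": piece}))
    # Assumed EOS shape: an empty string signals that no more text is coming,
    # so the server can generate whatever remains in its buffer.
    messages.append(json.dumps({"text": ""}))
    return messages


if __name__ == "__main__":
    for message in build_messages(["Hello there. ", "How are you? "]):
        print(message)
```

After sending these, the client keeps reading responses until the server has returned the last audio chunk, and only then closes the socket.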

#

Like we discussed on that call earlier, in the script I showed you, change line 10 from info to debug and then run the script again. You'll see in the console exactly what is happening in the WebSocket library and when the EOS message is being sent.
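
The script itself is not shown in this thread, so the following is only a guess at what that info-to-debug change amounts to, assuming the script uses Python's standard `logging` module. `configure_logging` and the `benchmark` logger name are hypothetical.

```python
import logging


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Set the root log level so library loggers inherit it too."""
    # basicConfig attaches a console handler if none is configured yet.
    logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")
    # Setting the level on the root logger also applies to third-party
    # loggers that have no level of their own.
    logging.getLogger().setLevel(level)
    return logging.getLogger("benchmark")  # hypothetical logger name


if __name__ == "__main__":
    log = configure_logging(logging.DEBUG)  # was logging.INFO
    log.debug("sending EOS message")
```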

#

The purpose of the script is to help people understand exactly how the WebSocket API works. You can turn on debug mode to see the entire process in full detail, allowing you to understand each step of the way. That's essentially why I developed it, to address the confusion surrounding this topic.

#

@prisma echo I looked a little bit more into Unity. What library are you using for WebSockets? The default included library or a third-party library?

prisma echo Sep 18, 2023, 2:03 AM

#

wintry grotto @prisma echo I looked a little bit more into Unity. What library are yo...

I'm using Unreal and I'm using the native websocket code provided in Unreal Engine.

Pretty sure that I've adjusted my code to match exactly the same calls that you're doing so far.

#

After going through it and kind of refreshing my mind, I can better walk you through what I'm doing as a comparison to what you're doing.

#

Think they pretty much match at the moment but my results were the same as before

wintry grotto Sep 18, 2023, 2:04 AM

#

Ah crap, for some reason I thought you were using Unity. Hold on, let me rethink this for a second.

#

Okay, apparently accomplishing this in Unreal is actually a lot more complex and you don't have async/await like in Python.

#

I don't know much about Unreal Engine, but I did some research and it seems like using the HTTP module to get the /stream API working might be a better option than dealing with the complexity of WebSockets. I understand that you mentioned having trouble getting it to work, but with the IHTTP library in Unreal, you might be able to achieve it using some sample code or something similar.

📎 11labsunrealstream.txt

prisma echo Sep 18, 2023, 2:37 AM

#

wintry grotto I don't know much about Unreal Engine, but I did some research and it seems like...

Hmm where did you find that?

#

Looking at the code it seems to do essentially what I was doing via BP.

The problem is when you get the OnResponseReceived, it'll fire once and that is it

#

I'll go ahead and entertain that exact code and see what happens

prisma echo Sep 18, 2023, 12:57 PM

#

@wintry grotto That code doesn't stream in, it just gets one response back with everything.

It is still faster than my PCM websocket chunking, which is unusually slow for me.

#

If I can sort out why the websocket chunking is slow for PCM, I imagine that would be better.

#

Unless I start chunking my text instead and get the chunks back as a stream, but I don't know if that will lose context on 11Labs' side. If I break the text up like that, it might not generate as well.

prisma echo Sep 18, 2023, 10:48 PM

#

Hey good news. It just started working

#

I don't know what I did or what change exactly made it work but it works now

pearl zenith Sep 19, 2023, 12:05 PM

#

@prisma echo I am running into the same problems as you with ElevenLabs, Unreal Engine and Runtime Audio Importer. I also saw your messages on the Runtime Audio Importer Discord. Is there any chance we can compare the steps we take? And did you get it working with PCM or MP3 in the end?

prisma echo Sep 19, 2023, 1:25 PM

#

pearl zenith @prisma echo I am running into the same problems as you with Eleven lab...

What's your issue? As far as I can tell, only the PCM solution will work with Unreal for streaming.

If you don't care about streaming you can get mp3 working as well.

pearl zenith Sep 19, 2023, 3:05 PM

#

prisma echo What's your issue? As far as I can tell only the PCM solution will work with unr...

No, I am using websockets because I do need streaming! I was getting a weird stutter with MP3s, so I am now streaming in PCM format. There is still a little stutter occasionally. It's way less now with PCM, but it can be annoying.

wintry grotto Sep 20, 2023, 12:46 AM

#

prisma echo Hey good news. It just started working

Wow. That's odd but glad to hear!!

wintry grotto Sep 20, 2023, 12:47 AM

#

pearl zenith No I am using websockets because I need streaming indeed! And I was getting this...

Unless you need "input streaming", you're better off with the regular /stream API for various reasons.

pearl zenith Sep 20, 2023, 8:18 AM

#

wintry grotto Unless you need "input streaming" better off with the regular /stream API for va...

Yes, I thought so too, but it seems Unreal's IHTTP does not support streaming APIs: it only returns the payload once the response is entirely complete. So it would be great to have an indication of what your process is, @prisma echo. I'm beginning to think it might be an issue with the way I'm using Runtime Audio Importer.

wintry grotto Sep 20, 2023, 10:38 AM

#

pearl zenith Yes I thought so too but it seems Unreal's IHTTP does not support streaming APIs...

If he used the code I shared, maybe the answer lies there, if that's what he got working. I'm wondering if, for you Unreal guys, it would be easier to have some sort of locally running proxy that could rebroadcast the chunks in a way that's more compatible.

prisma echo Sep 20, 2023, 12:00 PM

#

@wintry grotto @pearl zenith Yes, the streaming API isn't going to work in Unreal the way 11Labs serves it up ATM.

I ended up getting websockets working.

It seems to be fine more often than not ATM. Are there occasional flukes with websocket latency due to server usage spikes? I see folks complain from time to time.

Another alternative I have yet to try is using HTTP streaming but having Unreal be the one to chunk it. Not sure if that'll have cons.

pearl zenith Sep 20, 2023, 12:05 PM

#

prisma echo @wintry grotto @pearl zenith Yes streaming API isn't going to wor...

I'm using websocket PCM with sample rate 14000 and there's just too much delay in between chunks causing a very annoying stutter every few words. I really don't know how to get it any better than it is, so if you have a smooth playback I'm curious what your setup is? Are you running websocket on a separate thread? Appending chunk by chunk to the streaming soundwave or buffering in between?
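
On the "buffering in between" option: one common way to hide gaps between chunks is to hold playback back until a small amount of audio is queued. The sketch below is a generic illustration of that idea in Python, not anyone's actual Unreal setup. It assumes 16-bit mono PCM, and `PrebufferedQueue`, the 16000 Hz rate and the 0.5 s threshold are all hypothetical example values.

```python
from collections import deque
from typing import Deque, Optional


def pcm_duration_seconds(
    num_bytes: int, sample_rate: int, bytes_per_sample: int = 2, channels: int = 1
) -> float:
    """Playback time of raw PCM data (assumes 16-bit mono by default)."""
    return num_bytes / (sample_rate * bytes_per_sample * channels)


class PrebufferedQueue:
    """Hold chunks back until `min_buffer_s` seconds of audio are queued."""

    def __init__(self, sample_rate: int, min_buffer_s: float = 0.5) -> None:
        self.sample_rate = sample_rate
        self.min_buffer_s = min_buffer_s
        self.buffered_s = 0.0
        self.started = False
        self.chunks: Deque[bytes] = deque()

    def push(self, chunk: bytes) -> None:
        """Queue a received chunk; start playback once enough is buffered."""
        self.chunks.append(chunk)
        self.buffered_s += pcm_duration_seconds(len(chunk), self.sample_rate)
        if self.buffered_s >= self.min_buffer_s:
            self.started = True

    def flush(self) -> None:
        """Allow playback of whatever is queued, e.g. after the final chunk."""
        self.started = True

    def pop(self) -> Optional[bytes]:
        """Return the next chunk to play, or None if still pre-buffering."""
        if not self.started or not self.chunks:
            return None
        chunk = self.chunks.popleft()
        self.buffered_s -= pcm_duration_seconds(len(chunk), self.sample_rate)
        return chunk
```

The trade-off is extra delay before the first audio plays in exchange for more slack when a later chunk arrives late.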

prisma echo Sep 20, 2023, 12:09 PM

#

pearl zenith I'm using websocket PCM with sample rate 14000 and there's just too much delay i...

I'm using Blueprint Websocket from the Unreal Marketplace. It seems they are just using Unreal's native websocket support. IDK what you might be using. I also had the same issue.

Last thing I did was try to copy the benchmark Python code exactly.

pearl zenith Sep 20, 2023, 3:55 PM

#

@prisma echo so sorry to bother you with all these tags but are you sending the requests word by word or sentence by sentence?

prisma echo Sep 20, 2023, 4:10 PM

#

pearl zenith @prisma echo so sorry to bother you with all these tags but are you sen...

I currently send sentence by sentence, i.e. as long a text message as I want.

#

Or I guess paragraph by paragraph

#

Can be as short or long as I need to but haven't tried anything too crazy

drifting zodiac Oct 30, 2023, 10:47 AM

#

pearl zenith I'm using websocket PCM with sample rate 14000 and there's just too much delay i...

Hi, I'm facing the same issue. Have you solved this? Would be grateful for some tips on fixing it
