what s deepgram s language support look | ElevenLabs | Page 1

neon halo Sep 25, 2023, 1:23 AM

#

30 plus major languages and 100 plus for translation. It's actually quite powerful. If you set it up correctly, the latency is low enough that I have headroom to add a separate translation stage. You can see my example here: the "FaF" transcriber, a program I wrote in Python. Even though I'm making a second API call for translations, the translated text from the streaming response prints in the console before the listener has finished hearing the words.

#

As far as submitting complete sentences, I'm mostly doing the same thing. However, I'm using a special NLP model that pre-processes the text and splits it into the most latency-efficient form for ElevenLabs. I actually got the idea from your attempt at creating a fast script using NLTK, though I found NLTK too slow and ineffective.
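
To illustrate the general idea of splitting text into smaller pieces before sending it to a TTS service, here is a minimal sketch. It is not the NLP model mentioned above: it is a plain regex stand-in, and the function name `split_for_tts` and the `max_chars` limit are hypothetical choices made for the example.

```python
import re
from typing import List


def split_for_tts(text: str, max_chars: int = 80) -> List[str]:
    """Split text at sentence ends, then at clause marks if a sentence is too long.

    Hypothetical helper: a regex stand-in, not a real NLP sentence splitter.
    """
    chunks: List[str] = []
    # Split after '.', '!' or '?' when followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        if len(sentence) <= max_chars:
            chunks.append(sentence)
            continue
        # Long sentence: fall back to clause boundaries and pack them greedily.
        # A single clause longer than max_chars is kept whole.
        current = ""
        for clause in re.split(r"(?<=[,;:])\s+", sentence):
            if current and len(current) + 1 + len(clause) > max_chars:
                chunks.append(current)
                current = clause
            else:
                current = f"{current} {clause}".strip()
        if current:
            chunks.append(current)
    return chunks
```

Smaller chunks let synthesis start sooner, at the cost of less context per request. The punctuation rules here assume English-style text and would need adjusting for other languages.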

quasi igloo Sep 25, 2023, 1:30 AM

#

neon halo 30 plus major languages and 100 plus for translation. It's actually quite powerf...

yeah, right now I'm doing faster-whisper > deepL > elevenlabs, which is rather slow

#

I'd rather stick with deepL as a translation stage as it's simply higher quality than most other offerings I've found

#

but deepgram seems like a good idea to integrate as well

neon halo Sep 25, 2023, 1:31 AM

#

What is your use case? You're basically trying to take a stream of audio in one language, reproduce it in another, and have it spoken by ElevenLabs?

quasi igloo Sep 25, 2023, 1:31 AM

#

pretty much

neon halo Sep 25, 2023, 1:32 AM

#

Ah, yeah, that's where it gets complex. You can't just trim sentences the way you can in English, because other languages have different grammatical structures, so you need a bit of buffer time.

quasi igloo Sep 25, 2023, 1:32 AM

#

yeah, it's pretty complex due to all the non-english fun times.

#

so I can't just get back the partial results from deepgram

#

I'd need to wait for the complete sentences still

#

so my best bet is probably to still stick with whisper

#

then again, maybe something can be done on the audio end

neon halo Sep 25, 2023, 1:34 AM

#

Maybe, but there's actually room to improve it because Deepgram has some unique utterance and endpointing settings that you can tweak, and you can probably get good results in your target languages.

quasi igloo Sep 25, 2023, 1:34 AM

#

if I can just detect when a sentence ends from the audio, I wouldn't really need to do the text processing

#

eh, maybe something to look into

neon halo Sep 25, 2023, 1:34 AM

#

Yeah, Deepgram has a punctuation feature, so I would look into it.

#

It's a more nuanced challenge, but I'm quite certain with some focused effort you could get it to be pretty fast.
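
The punctuation idea above could be sketched roughly like this: buffer the streamed transcript fragments and only pass text on once a sentence terminator shows up. This is a sketch under assumptions. It does not call Deepgram at all, `SentenceBuffer` is a hypothetical helper, the fragments are assumed to arrive already punctuated, and the terminator set (`.`, `!`, `?`) is an English-style assumption that would need extending per target language.

```python
import re
from typing import List


class SentenceBuffer:
    """Accumulate transcript fragments and release complete sentences.

    Hypothetical helper: assumes fragments are already punctuated and that
    words are space-delimited.
    """

    # A terminator counts only when followed by whitespace or end of buffer.
    _END = re.compile(r"[.!?]+(?=\s|$)")

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, fragment: str) -> List[str]:
        """Add a fragment and return any sentences that are now complete."""
        self._pending = f"{self._pending} {fragment}".strip()
        sentences: List[str] = []
        while True:
            match = self._END.search(self._pending)
            if match is None:
                break
            sentences.append(self._pending[: match.end()].strip())
            self._pending = self._pending[match.end():].lstrip()
        return sentences

    def flush(self) -> str:
        """Return whatever is left (e.g. at end of stream) and clear the buffer."""
        rest, self._pending = self._pending, ""
        return rest
```

Each sentence returned by `feed` could then go to the translation stage, with `flush` called when the stream closes. One known limitation: a fragment that happens to end on a period mid-number (such as "3." before "14") is released early.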

quasi igloo Sep 25, 2023, 1:35 AM

#

neon halo Maybe, but there's actually room to improve it because Deepgram has some unique ...

ideally I'd like to target all the languages supported by whatever I'm using so that's... probably not something I can tackle

#

I think I'll just try to figure out how much latency whisper is contributing to begin with, then go from there

#

if deepL/elevenlabs turn out to be like 80% of the latency anyway, not much point in trying to shave off the little amount I'd save
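
A rough way to get that breakdown is to time each stage separately and compare the shares. The sketch below uses only the standard library; `profile_pipeline` and `latency_shares` are hypothetical names, and the stages are whatever callables wrap the real whisper, deepL, and elevenlabs calls (not shown here).

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def profile_pipeline(stages: List[Stage], payload: Any) -> Tuple[Any, Dict[str, float]]:
    """Run stages in order, feeding each output to the next, and time each one."""
    timings: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        payload = stage(payload)
        timings[name] = time.perf_counter() - start
    return payload, timings


def latency_shares(timings: Dict[str, float]) -> Dict[str, float]:
    """Convert per-stage seconds into fractions of the total."""
    total = sum(timings.values())
    if total == 0:
        return {name: 0.0 for name in timings}
    return {name: seconds / total for name, seconds in timings.items()}
```

Running this over a handful of representative clips and averaging the shares would show whether the speech-to-text stage is worth optimizing at all. It measures wall-clock time per call, so for streaming APIs the time to first result would need to be measured separately.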

neon halo Sep 25, 2023, 1:38 AM

#

I'll tell you what: I'll DM you a zip file with some of the custom optimization tools I've made for Deepgram. You can switch the languages around and play with the settings, so it should be pretty quick to get up and running and see what sort of difference it makes.

quasi igloo Sep 25, 2023, 1:39 AM

#

oh, cool stuff, thanks

neon halo Sep 25, 2023, 1:42 AM

#

Happy to share. I've learned a lot from you.
