#Also can someone explain how eleven labs

frozen bramble
#

not sure where to congratulate the devs -- #💬│creators is probably a good place. with respect to agents -- you can use the UI and/or the API. I think there is a part that goes from speech to text (what the user is saying) -- then that text is fed to an LLM for a response (with help from tools/MCPs, for example, or the built-in RAG they have), then that response goes through text->speech and is sent back to the user. I think this is how it works now (rather than a single multimodal text/voice model that acts like a monolith)
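
A rough sketch of that speech->text->LLM->speech chain, in case it helps to see it as code. Everything here is a hypothetical stand-in: `transcribe`, `generate_reply`, `synthesize`, and `handle_turn` are made-up names with stub bodies, not ElevenLabs API calls, and the real stages would call actual speech and LLM services.

```python
# Toy version of one conversational turn: speech -> text -> LLM -> speech.
# All three stages are stubs (assumptions), not real ElevenLabs calls.

def transcribe(audio: bytes) -> str:
    """Speech-to-text stage (stub: pretends the audio is UTF-8 text)."""
    return audio.decode("utf-8")


def generate_reply(user_text: str, context: str) -> str:
    """LLM stage (stub: echoes the input, tagged with the given context)."""
    return f"[{context}] You said: {user_text}"


def synthesize(text: str) -> bytes:
    """Text-to-speech stage (stub: returns the text encoded as bytes)."""
    return text.encode("utf-8")


def handle_turn(audio_in: bytes, context: str) -> bytes:
    """Run one user utterance through the three stages in order."""
    user_text = transcribe(audio_in)
    reply_text = generate_reply(user_text, context)
    return synthesize(reply_text)


if __name__ == "__main__":
    print(handle_turn(b"hello", "support"))
```

The point is only that the stages are separate and chained, so each one (the voice, the LLM, the context) could in principle be swapped or configured on its own.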

floral ether
#

My question is more about what they're used for at all

frozen bramble
#

oh

floral ether
#

Because with normal voices I can't give any background to the AI

#

When generating my voice

#

I didn't exactly understand what agents are in the ElevenLabs context, so my question is:

#

Is it a model that I can customize and give context and background to, or

#

is the use more related to MCP

#

and outside services like marketing, calls and others?

#

My goal is just voice generation for my content creation so

#

What I need is basically a way to make the voice more familiar with the context

frozen bramble
#

agents essentially allow more control over that, plus settings for phone calls, interruptions, when to end, etc., along with some control over how the audio is made (how fast the agent speaks, and all the usual text->speech settings you've seen before)

#

do you need more help understanding how the context is formed/used to support the agent?

#

i can do that

floral ether
#

I am using the v3 model for creation

#

do you suggest I use v2 then? because on v3

#

I don't have

#

customization of the content

#

that's what I wanted

#

use agent

frozen bramble
#

you can use the agent if you want it to interact with you -- so like a conversation. it can figure out when you pause, or when it is interrupted, or like end the call (if you have that enabled), versus usual text to speech which is just text to speech -- not necessarily a "live conversation"

#

but you could do that yourself, using LLMs and a separate pipeline

#

but i'm using V3 voices in agents though

floral ether
#

What do you use it for?

#

if i may ask

prime mural
#

Usually for customer support

frozen bramble
#

this week I have done the following:

  1. connected a phone number via Twilio and ElevenLabs -- to answer questions about a particular service/rotation for post-docs
  2. connected the agent to a Discord voice channel so people can ask it questions about Deep Rock Galactic via the built-in RAG
  3. same as #2 above but for Pathfinder 2e (a cousin of mine plays it) -- involved a site rip, RAG generation, etc.

floral ether
#

got it

#

Looks like

frozen bramble
#

for #3 I used tool calls, but also an MCP server I made and am hosting on Heroku

floral ether
#

It's not useful for my goal

frozen bramble
#

that interacts with GCP Vector AI search

#

the agents are for agentic stuff --- instead of 1:1 user/response, it's meant to continue some task -- like wait and respond to further questions, or perform some goal. instead of simple text->speech
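
To make the difference concrete, here is a minimal sketch of one-shot text->speech next to a multi-turn agent that keeps history and decides when to end. `one_shot_tts` and `ToyAgent` are hypothetical names invented for illustration; nothing here reflects how ElevenLabs actually implements agents.

```python
# One-shot TTS vs. a toy multi-turn agent. Both are illustrative stubs.

def one_shot_tts(text: str) -> bytes:
    """Plain text->speech: one input, one output, no memory (stub)."""
    return text.encode("utf-8")


class ToyAgent:
    """Hypothetical agent: remembers earlier turns and can end the call."""

    def __init__(self, system_prompt: str) -> None:
        # The system prompt is where background/context would go.
        self.history = [("system", system_prompt)]
        self.ended = False

    def respond(self, user_text: str) -> str:
        self.history.append(("user", user_text))
        if user_text.strip().lower() in {"bye", "goodbye"}:
            # The agent itself decides the conversation is over.
            self.ended = True
            reply = "Goodbye!"
        else:
            # Count user turns so far to show state carried across turns.
            turn = sum(1 for role, _ in self.history if role == "user")
            reply = f"(turn {turn}) Answering: {user_text}"
        self.history.append(("agent", reply))
        return reply


if __name__ == "__main__":
    agent = ToyAgent("You answer questions about a support rotation.")
    for line in ["hi", "who is on call?", "bye"]:
        print(agent.respond(line))
```

The one-shot function forgets everything between calls, while the agent object carries state forward, which is the part that matters for calls and support chats.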

#

not sure if I explained that well sorry

floral ether
#

Yes you did

frozen bramble
#

they are good for things you expect in phone calls or support chats

#

they have some intelligence, a voice (people like talking to stuff that sounds like humans I think), and an ability to have a knowledge base

#

like stuff related to the company, etc.

#

via built-in RAG, or MCP calls for whatever info or action, or tools (similar)
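
For anyone unsure what the RAG part means in practice, a toy sketch: pick the most relevant snippet from a knowledge base and put it in front of the question before it goes to the LLM. This uses simple word overlap purely for illustration; real systems typically use embeddings and a vector index, and `tokenize`, `retrieve`, and `build_prompt` are hypothetical helpers, not part of any ElevenLabs API.

```python
# Toy retrieval-augmented prompt building using word overlap as the score.
# Assumption: real RAG would use embeddings; this only shows the data flow.

def tokenize(text: str) -> set:
    """Lowercase and split on whitespace into a set of words."""
    return set(text.lower().split())


def retrieve(question: str, documents: list, top_k: int = 1) -> list:
    """Return the top_k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list) -> str:
    """Prepend the retrieved snippets to the question for the LLM."""
    context = "\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    knowledge_base = [
        "Refunds are processed within 5 days",
        "Support hours are 9 to 5 on weekdays",
    ]
    print(build_prompt("when are refunds processed", knowledge_base))
```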

floral ether
#

well

#

The future is coming

#

I'm excited to see what the next ElevenLabs thing is

frozen bramble
#

me2. their voice ai stuff is amazing and has been top of the line (at least in this sector/price range/accessibility) for a while

#

the agent stuff is just an attempt to adapt and expand with the ability to "connect" to other parts of the AI legos

#

it's attaching their great speech synthesis to other things to expand their reach

#

which makes sense

#

but competitors exist and there are other ways of going about this -- like doing a lot of the leg work and just purely using LLMs + their text->speech or vice versa

#

also pure monoliths that are multimodal models (they are trained on audio + text + stuff all together)

#

those can do a real good job in conversation (think openai advanced voice convos)

#

but latency might be higher, and adjusting that voice would probably require another really good model to do speech change/transformation (one voice to another)

#

but i've been a member for a while, long time at the 99/month tier but then downgraded cuz I stopped using it, and now I'm back in action trying it out. TONS OF PROGRESS haha

floral ether
#

same

#

Honestly, all my work is getting automated because of AI

#

As a game developer, I just need to join the steps together

#

AI is making most things for me

frozen bramble
#

haha

#

this is where you recognize everyone else can also do that, and you can use your special skills/special sauce to do more than everyone else

#

since now we are going from a blunt tool to a sharper tool to a sharper tool haha