hello | ElevenLabs | Page 1

dark moatBOT Aug 18, 2025, 1:23 PM

AI-generated Answer for baymurat1

Hi! How can I help you with ElevenLabs products or API today? Please tell me what you’re trying to do, and I’ll point you to the exact docs and steps.

unreal peak Aug 18, 2025, 1:23 PM

Is this setting valid in Conversational AI?

tts: {
  voiceId:
    version === 2 ? PS_CONFIG.voiceIdMandarin : PS_CONFIG.voiceId,
  settings: {
    stability: 0.6,
    similarity_boost: 0.95,
    style: 0.0,
    use_speaker_boost: true,
  },
},

dark moatBOT Aug 18, 2025, 1:23 PM

AI-generated Answer for baymurat1

Short answer: No, not in Conversational AI as written.

Why:

In Conversational AI, conversation_config.tts settings are flat fields (voice_id, stability, similarity_boost, speed, optimize_streaming_latency, etc.). A nested settings object under tts is not documented. (elevenlabs.io)
Conversation overrides (conversation_config_override) support overriding tts.voice_id during the WebSocket handshake; voice settings are not shown as overrideable there. (elevenlabs.io)

What to do instead:

To change TTS behavior for an agent, set these fields directly under conversation_config.tts on the agent (for example: stability: 0.6, similarity_boost: 0.95, speed: 1.0). (elevenlabs.io)
If you need style or use_speaker_boost, those are voice-level settings; update the Voice’s settings via the Voices Settings API (stability, similarity_boost, style, use_speaker_boost, speed). Note these are documented for voices, not as agent-level TTS fields. (elevenlabs.io)
Multi-voice agents can also set per-voice stability/speed/similarity_boost and streaming latency in supported_voices. (elevenlabs.io)

So, replace your tts.settings object with top-level fields on tts for the agent, and use the Voices Settings API if you specifically need style or use_speaker_boost. (elevenlabs.io)
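
As a rough illustration of the restructuring described above, the sketch below splits the nested object from the question into flat agent-level fields and a separate set of voice-level fields. The names NestedTts, AgentTts, VoiceLevelSettings and splitTtsConfig, and the placeholder voice ID, are made up for this example; only the field names come from the answer above, and the exact agent schema should be checked against the current docs.

```typescript
// Shape of the snippet from the question (nested settings object).
interface NestedTts {
  voiceId: string;
  settings: {
    stability?: number;
    similarity_boost?: number;
    speed?: number;
    style?: number;
    use_speaker_boost?: boolean;
  };
}

// Fields the answer lists as flat under conversation_config.tts (assumed subset).
interface AgentTts {
  voice_id: string;
  stability?: number;
  similarity_boost?: number;
  speed?: number;
}

// Fields the answer describes as voice-level rather than agent-level.
interface VoiceLevelSettings {
  style?: number;
  use_speaker_boost?: boolean;
}

// Hypothetical helper: move each nested field to the place the answer says it belongs.
function splitTtsConfig(nested: NestedTts): { agentTts: AgentTts; voiceLevel: VoiceLevelSettings } {
  const s = nested.settings;
  const agentTts: AgentTts = { voice_id: nested.voiceId };
  if (s.stability !== undefined) agentTts.stability = s.stability;
  if (s.similarity_boost !== undefined) agentTts.similarity_boost = s.similarity_boost;
  if (s.speed !== undefined) agentTts.speed = s.speed;

  const voiceLevel: VoiceLevelSettings = {};
  if (s.style !== undefined) voiceLevel.style = s.style;
  if (s.use_speaker_boost !== undefined) voiceLevel.use_speaker_boost = s.use_speaker_boost;

  return { agentTts: agentTts, voiceLevel: voiceLevel };
}

// The values from the question, with a placeholder voice ID.
const splitExample = splitTtsConfig({
  voiceId: "voice-id-placeholder",
  settings: { stability: 0.6, similarity_boost: 0.95, style: 0.0, use_speaker_boost: true },
});
```

Here splitExample.agentTts would hold voice_id, stability and similarity_boost, while style and use_speaker_boost end up in splitExample.voiceLevel, to be applied to the voice itself.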

unreal peak Aug 18, 2025, 1:27 PM

Can this be changed on the website?

dark moatBOT Aug 18, 2025, 1:27 PM

AI-generated Answer for baymurat1

Yes. Those exact parameters can be changed in the ElevenLabs web app via Voice Settings:

stability → “Stability” slider; similarity_boost → “Clarity + Similarity Enhancement”; style → “Style exaggeration”; use_speaker_boost → “Speaker Boost.” (elevenlabs.io)
You can adjust these in the Text to Speech (Playground) UI. (elevenlabs.io)
In Studio, you can edit stability, similarity, speaker boost (and more) per paragraph or for all text using that voice. (help.elevenlabs.io)
In Conversational AI, agent voice settings (including stability/similarity/speed) are configurable; multi-voice agents support per‑voice settings. (elevenlabs.io)

Speed is also available in the web UI (Speech Synthesis, Studio, and Conversational AI). (help.elevenlabs.io)

Sources:
Get default voice settings | ElevenLabs Documentation
Text to Speech (product guide) | ElevenLabs Documentation
How can I change the voice and settings across multiple paragraphs in Studio? – ElevenLabs
Conversational AI voice design guide | ElevenLabs Documentation
Can I change the pace of the voice? – ElevenLabs

unreal peak Aug 18, 2025, 1:29 PM

voice variations are not stable. How can I fix this issue?

dark moatBOT Aug 18, 2025, 1:29 PM

AI-generated Answer for baymurat1

Short answer: some variation is expected because ElevenLabs TTS is nondeterministic, but you can greatly tighten consistency with the steps below.

What to do

Increase stability and similarity, keep style at 0, and optionally enable speaker boost. These settings directly control variability and adherence to the target voice. Higher stability narrows randomness; style exaggeration can make output less stable. (elevenlabs.io)
Use a fixed seed for deterministic re-generations:
- REST TTS: pass the seed parameter in the Create speech request. (elevenlabs.io)
- WebSocket TTS: set the seed query/body parameter on the stream-input endpoint. (elevenlabs.io)
If maximum consistency matters, prefer the standard HTTP TTS over WebSocket streaming (streaming trades some consistency for low latency). (help.elevenlabs.io)
Keep generations short. Break long text into chunks under ~800–900 characters to reduce drift and artifacts. (elevenlabs.io)
If you’re using a cloned voice, ensure training audio is clean and consistent (no background noise, consistent mic distance, compressed to a stable loudness). Poor or inconsistent training clips lead to unstable output. (elevenlabs.io)
Use SSML <break> sparingly; excessive breaks can cause speed fluctuations and instability. Limit them to a few per paragraph if you need them. (elevenlabs.io)
Be aware that some voices are inherently more stable than others due to their source data. If issues persist, try an alternate voice or a professionally cloned voice. (elevenlabs.io)
For Conversational AI agents (including multi-voice), set per-voice stability/similarity/speed in the agent’s voice settings to keep outputs consistent. (elevenlabs.io)
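
To make the chunking and fixed-seed suggestions above more concrete, here is a minimal sketch that splits text at sentence boundaries and attaches the same seed to every request body. chunkText and buildSpeechBodies are hypothetical helpers, the voice_settings values are arbitrary, and the body shape (text, seed, voice_settings) is assumed from the Create speech docs, so verify it before relying on it.

```typescript
// Split text into chunks of at most maxChars, preferring sentence boundaries.
function chunkText(text: string, maxChars: number): string[] {
  // A sentence ends with ., ! or ?; any trailing text without a terminator is kept too.
  const sentences = text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    // Flush the current chunk if adding this sentence would exceed the limit.
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    // Fallback: hard-split a single sentence that is longer than maxChars.
    let rest = sentence;
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars).trim());
      rest = rest.slice(maxChars);
    }
    current += rest;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}

// Assumed request-body shape; check field names against the Create speech docs.
interface SpeechBody {
  text: string;
  seed: number;
  voice_settings: { stability: number; similarity_boost: number; style: number };
}

// Build one body per chunk, reusing the same seed so re-generations are comparable.
function buildSpeechBodies(text: string, seed: number, maxChars: number): SpeechBody[] {
  const bodies: SpeechBody[] = [];
  for (const chunk of chunkText(text, maxChars)) {
    bodies.push({
      text: chunk,
      seed: seed,
      voice_settings: { stability: 0.8, similarity_boost: 0.9, style: 0 }, // illustrative values
    });
  }
  return bodies;
}

// Tiny limit so the split is visible; in practice something like 800 would be used.
const speechBodies = buildSpeechBodies("One two. Three four. Five six.", 1234, 12);
```

With the tiny 12-character limit used here, each sentence lands in its own request body; the seed value 1234 is arbitrary.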

If you share whether you’re using REST, WebSocket streaming, or Conversational AI, I can point you to the exact parameters to set.

Sources:
Edit voice settings | ElevenLabs Documentation
Create speech | ElevenLabs Documentation
WebSocket | ElevenLabs Documentation
What is the Text-to-Speech WebSockets API? – ElevenLabs
Troubleshooting | ElevenLabs Documentation
Studio overview | ElevenLabs Documentation
Changelog | ElevenLabs Documentation

unreal peak Aug 18, 2025, 1:32 PM

No, I am using Conversational AI.

dark moatBOT Aug 18, 2025, 1:32 PM

AI-generated Answer for baymurat1

Got it—since you’re using Conversational AI:

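The fields in your settings object are valid voice settings: ElevenLabs supports stability, similarity_boost, style, use_speaker_boost (and speed), and you can set them via the API (Edit voice settings) or when initializing real‑time generation. As noted earlier, though, the nested tts.settings shape itself is not a documented agent-level field. (elevenlabs.io)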
Where to change this on the website: Conversational AI exposes “Voice settings”; at minimum, Speed is configurable in Conversational AI from the web UI. (help.elevenlabs.io)

To reduce unstable voice variations in Conversational AI:

Increase stability and similarity_boost. Low stability and low similarity increase variation between generations. (help.elevenlabs.io)
Keep style at 0. Style exaggeration can make the model less stable and adds latency; ElevenLabs generally recommends 0. (elevenlabs.io)
Consider model choice vs latency: Conversational use runs on Flash v2.5 (ultra‑low latency, slightly lower expressiveness) or Turbo v2.5 (low latency, higher quality). Pick the one that best fits your quality/latency needs. (help.elevenlabs.io)
Use a voice native to your target language/accent to avoid accent drift in live conversations. (help.elevenlabs.io)

For reference, ElevenLabs provides a Conversational AI voice design guide with practical ranges (e.g., higher stability for consistency, 0.9–1.1 speed for natural pacing). (elevenlabs.io)

If you’d like, share whether you’re configuring via the web UI or the API; I can point to the exact call or panel. (elevenlabs.io)

Sources:
Edit voice settings | ElevenLabs Documentation
Can I change the pace of the voice? – ElevenLabs
Why does my voice sound different in the API compared to the website? – ElevenLabs
Text to Speech (product guide) | ElevenLabs Documentation
Do you offer an AI model for conversational purposes or for chatbots? – ElevenLabs
Why does my voice change accent or language? – ElevenLabs
Conversational AI voice design guide | ElevenLabs Documentation

unreal peak Aug 18, 2025, 1:38 PM

which changes do you suggest here?

"stability": 1,
"use_speaker_boost": true,
"similarity_boost": 1,
"style": 0,
"speed": 1

dark moatBOT Aug 18, 2025, 1:38 PM

AI-generated Answer for baymurat1

Short answer: your JSON already matches ElevenLabs’ default voice settings. If you’re hearing unstable delivery, you can’t make it fully deterministic, but the following tweaks generally reduce perceived variability while avoiding artifacts:

stability: try 0.75–0.85 instead of 1.0 to keep consistency without over‑flattening delivery. Higher values give more consistent output; the 0.60–0.85 band is recommended for consistent conversational voices. (elevenlabs.io)
similarity_boost: keep high but not max; try 0.90–0.95. Very high values can introduce distortions. (elevenlabs.io)
style: keep at 0; non‑zero values can reduce stability and add latency. This is the recommended default. (elevenlabs.io)
use_speaker_boost: true is fine for slightly more speaker similarity; note it adds a bit of latency. (elevenlabs.io)
speed: keep near 1.0 (typical natural range is 0.9–1.1; supported range 0.7–1.2). (elevenlabs.io)
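
One way to apply the suggestions above mechanically is to clamp each value into the suggested band, as in the sketch below. VoiceSettings, clamp and suggestSettings are hypothetical names, and the bands are simply the ones quoted in this answer (0.75–0.85 stability, 0.90–0.95 similarity_boost, 0.9–1.1 speed). Note that clamping also raises values that sit below a band.

```typescript
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
  style: number;
  use_speaker_boost: boolean;
  speed: number;
}

// Restrict a value to the inclusive range [lo, hi].
function clamp(value: number, lo: number, hi: number): number {
  return Math.min(Math.max(value, lo), hi);
}

// Pull each setting into the band suggested in the answer above.
function suggestSettings(current: VoiceSettings): VoiceSettings {
  return {
    stability: clamp(current.stability, 0.75, 0.85),
    similarity_boost: clamp(current.similarity_boost, 0.9, 0.95),
    style: 0, // kept at 0 as recommended
    use_speaker_boost: current.use_speaker_boost, // left as configured
    speed: clamp(current.speed, 0.9, 1.1), // natural range quoted above
  };
}

// The values posted in the question.
const suggested = suggestSettings({
  stability: 1,
  use_speaker_boost: true,
  similarity_boost: 1,
  style: 0,
  speed: 1,
});
```

For the posted values this yields stability 0.85 and similarity_boost 0.95, with style, use_speaker_boost and speed unchanged.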

Important expectations: ElevenLabs TTS is non‑deterministic, so even with the same settings you will get small delivery variations between generations; the stability slider narrows that range but cannot eliminate it. (elevenlabs.io)

Reference: The defaults you posted are exactly the API’s default voice settings. (elevenlabs.io)

Sources:
Conversational AI voice design guide | ElevenLabs Documentation
Text to Speech (product guide) | ElevenLabs Documentation
Get default voice settings | ElevenLabs Documentation

unreal peak Aug 18, 2025, 1:54 PM

Even though I made those changes, I don't see any improvement.

dark moatBOT Aug 18, 2025, 1:54 PM

AI-generated Answer for baymurat1

Thanks for the details. In ElevenLabs Conversational AI, some of the fields in your snippet aren’t applied at the agent level, which is why you may see no change.

What actually applies in Conversational AI agent TTS

Agent TTS settings support: stability, similarity_boost, speed, and optimize_streaming_latency. Fields like style and use_speaker_boost are not part of the agent TTS schema and will be ignored there. You can verify what’s set by calling Get Agent; note the TTS block only includes those fields.

If you need to change style/use_speaker_boost

Edit the underlying voice’s default settings via the Voices API (these are global per voice). Use Edit voice settings to set stability, similarity_boost, style, use_speaker_boost, speed.

Why “voice variations” still happen

TTS is non-deterministic; the sliders act as ranges rather than guarantees, so identical inputs can still vary. Raising stability improves consistency but can reduce expressiveness; higher similarity_boost improves adherence to the voice but very high values may introduce artifacts.

Concrete steps to improve consistency in Conversational AI

Set only supported agent fields and confirm they’re saved:
- stability: try 0.60–0.85 for more consistent delivery
- similarity_boost: keep high (e.g., 0.8–1.0) but back off if you hear artifacts
- speed: keep near 1.0; supported range is 0.7–1.2 and extremes can affect quality
  Verify with GET /v1/convai/agents/:agent_id.
If you also want style/use_speaker_boost, update the voice via the Voices API (or in the web app’s voice settings) because the agent won’t read those fields from its own TTS config.
Leave optimize_streaming_latency at 0 for best quality; it trades quality for latency and is deprecated on TTS endpoints.
For consistent pronunciations of specific terms, attach a pronunciation dictionary in the agent TTS config.
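
The "confirm they're saved" step can be sketched as a comparison between the values you intended to set and the tts block returned by Get Agent. Everything below is illustrative: agentUrl and diffTts are hypothetical helpers, the host name is assumed to be the public API host, the agent ID is a placeholder, and savedTtsExample is invented sample data rather than a real response.

```typescript
type TtsBlock = { [field: string]: number | string | boolean | undefined };

// URL for the Get Agent call mentioned above (host name is an assumption).
function agentUrl(agentId: string): string {
  return "https://api.elevenlabs.io/v1/convai/agents/" + encodeURIComponent(agentId);
}

// List every intended field that is missing from, or differs in, the saved tts block.
function diffTts(expected: TtsBlock, actual: TtsBlock): string[] {
  const problems: string[] = [];
  for (const key of Object.keys(expected)) {
    if (actual[key] === undefined) {
      problems.push(key + ": missing from saved config");
    } else if (actual[key] !== expected[key]) {
      problems.push(key + ": expected " + expected[key] + ", got " + actual[key]);
    }
  }
  return problems;
}

// What you meant to configure, including one field the agent is said to ignore.
const desiredTts: TtsBlock = { stability: 0.8, similarity_boost: 0.9, speed: 1, style: 0 };

// Invented sample of what the tts block in a Get Agent response might contain.
const savedTtsExample: TtsBlock = { stability: 0.5, similarity_boost: 0.9, speed: 1 };

const ttsProblems = diffTts(desiredTts, savedTtsExample);
const exampleUrl = agentUrl("agent-id-placeholder");
```

In this made-up case the diff reports that stability was not saved as intended and that style is absent from the agent's tts block, which matches the point above that style has to be set on the voice instead.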

Website changes

Yes—speed, stability, and similarity can be adjusted in the agent’s Voice settings in the Conversational AI UI; speed range and behavior match the docs.

If you apply the above and still don’t see improvements, GET the current config for your agent_id and confirm the effective values match what you set.

unreal peak Aug 18, 2025, 2:00 PM

The issue is happening in Multi-voice support.

dark moatBOT Aug 18, 2025, 2:00 PM

AI-generated Answer for baymurat1

Under Multi-voice support, instability usually comes from configuration rather than the raw sliders. Here’s a focused checklist that aligns with the official docs:

Verify markup and labels
- Use the exact voice labels you configured; labels are case‑sensitive and nested tags aren’t supported. If a label isn’t recognized, the agent falls back to the default voice, which can look like “instability.” (elevenlabs.io)
Configure each voice used by Multi‑voice
- Voice settings are per voice ID. Update stability, similarity_boost, use_speaker_boost, style, and speed for every voice you’ve added to the agent (not just the default voice). You can do this in the web app (sliders map 1:1) or via the API endpoint POST /v1/voices/:voice_id/settings/edit. Defaults are stability=1, similarity_boost=1, use_speaker_boost=true, style=0, speed=1. (elevenlabs.io)
Keep model families consistent across voices
- In the Multi‑voice configuration, set the same Model family (Flash/Turbo/Multilingual) for all voices when possible. This is a documented best practice for smoother switching. (elevenlabs.io)
Use language overrides when switching languages
- If any of the voices are intended for another language, set a Language override on that voice to reduce pronunciation/accent drift. (elevenlabs.io)
Tune settings for consistency vs. expression
- For more consistent delivery, use higher Stability; for more expressiveness, lower it. The Conversational Voice Design guide recommends ~0.60–0.85 for consistent output (maxing at 1.0 can sound monotone). Adjust Similarity upward for a more uniform timbre, but extremely high values can introduce artifacts. (elevenlabs.io)
Stay within supported speed range
- Speed outside 0.7–1.2 isn’t supported and extreme values may affect quality; 1.0 is the default. (elevenlabs.io)
Give the LLM clear rules for switching
- In the Multi‑voice screen, add a descriptive “when to use this voice” note per voice; the platform injects these instructions into the system prompt to guide correct tagging. (elevenlabs.io)
Optional: lock down tricky terms
- If “instability” is actually inconsistent pronunciations (names, brands), add a Pronunciation Dictionary; phoneme entries work with Turbo v2, aliases work with all models. (elevenlabs.io)
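
The label checks at the top of this checklist could be automated along the lines of the sketch below. It assumes the agent's output wraps text as <Label>...</Label> using the configured voice label as the tag name; that format, the sample labels, and the checkVoiceTags helper are assumptions for illustration, so compare against the tags your agent actually produces.

```typescript
interface TagReport {
  unknownLabels: string[]; // tag names that match no configured label (case-sensitive)
  nested: boolean;         // a known voice tag opened inside another known voice tag
  unbalanced: boolean;     // a known tag closed without a matching open, or never closed
}

function checkVoiceTags(response: string, labels: string[]): TagReport {
  const tagPattern = /<(\/?)([^<>\/\s]+)>/g;
  const unknownLabels: string[] = [];
  let nested = false;
  let unbalanced = false;
  let openLabel: string | null = null;
  let m: RegExpExecArray | null;

  while ((m = tagPattern.exec(response)) !== null) {
    const closing = m[1] === "/";
    const label = m[2];
    // Case-sensitive lookup: "villain" does not match a configured "Villain".
    if (labels.indexOf(label) === -1) {
      if (unknownLabels.indexOf(label) === -1) unknownLabels.push(label);
      continue;
    }
    if (!closing) {
      if (openLabel !== null) nested = true;
      openLabel = label;
    } else {
      if (openLabel !== label) unbalanced = true;
      openLabel = null;
    }
  }
  if (openLabel !== null) unbalanced = true;
  return { unknownLabels: unknownLabels, nested: nested, unbalanced: unbalanced };
}

// Hypothetical labels and agent outputs.
const voiceLabels = ["Narrator", "Villain"];
const caseReport = checkVoiceTags("Hi. <Narrator>Once upon a time</Narrator> <villain>Boo</villain>", voiceLabels);
const nestedReport = checkVoiceTags("<Narrator>a <Villain>b</Villain></Narrator>", voiceLabels);
```

In the first sample the lowercase villain tag is reported as unknown, which is the situation where the agent would fall back to the default voice; the second sample is flagged as nested.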

Where to change this on the website

Agent > Voice > Multi‑voice support: add voices, set Model family, Language, and descriptions. (elevenlabs.io)
Voice settings (per voice) via web sliders or API; the API docs note the slider mapping (“Stability,” “Clarity + Similarity Enhancement,” “Speaker Boost,” “Style,” “Speed”). (elevenlabs.io)
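
If you go the API route, applying identical settings to every voice in the agent might look like the sketch below, which only builds the requests and does not send them. The endpoint path is the one quoted earlier in this answer; the host name, the voice IDs, the setting values, and the buildSettingsRequests helper are placeholders or assumptions.

```typescript
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
  style: number;
  use_speaker_boost: boolean;
  speed: number;
}

interface SettingsRequest {
  method: "POST";
  url: string;
  body: VoiceSettings;
}

// One request per voice ID, each carrying its own copy of the same settings.
function buildSettingsRequests(voiceIds: string[], settings: VoiceSettings): SettingsRequest[] {
  const requests: SettingsRequest[] = [];
  for (const voiceId of voiceIds) {
    requests.push({
      method: "POST",
      // Host name is an assumption; the path comes from the answer above.
      url: "https://api.elevenlabs.io/v1/voices/" + encodeURIComponent(voiceId) + "/settings/edit",
      body: {
        stability: settings.stability,
        similarity_boost: settings.similarity_boost,
        style: settings.style,
        use_speaker_boost: settings.use_speaker_boost,
        speed: settings.speed,
      },
    });
  }
  return requests;
}

// Placeholder voice IDs: the default voice plus one additional multi-voice entry.
const settingsRequests = buildSettingsRequests(["voice-a", "voice-b"], {
  stability: 0.8,
  similarity_boost: 0.9,
  style: 0,
  use_speaker_boost: true,
  speed: 1,
});
```

Each entry can then be sent with whatever HTTP client you already use, along with your API key.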

If you still see variation after applying identical settings to every configured voice and using correct labels, share an example response showing the XML tags the agent produced—unrecognized or malformed tags are the most common cause of unexpected voice changes. (elevenlabs.io)

unreal peak Aug 19, 2025, 1:26 AM

I wonder how the voice settings work in multi-voice support?
