How can I handle accents and non-native speakers with my voice agents? This causes unexpected quirk | ElevenLabs

AI-generated Answer for flenshhh0524

Short answer:

Use Scribe v2 Realtime for the live transcript. When the user finishes a turn, optionally recheck that segment with batch Scribe v2 plus keyterm prompting, which biases recognition toward domain terms (e.g., “tire,” “lug nut,” “jack”) and helps fix homophones such as “change a tire” vs. “change attire.” Keyterm prompting supports up to 100 terms and is available only on Scribe v2 batch transcription.
Pass conversation context to the realtime recognizer via previous_text so the ASR can use what has already been said to resolve ambiguous words in context.
If you know the caller’s language, set language_code in the realtime STT handshake to reduce language misdetection on strongly accented speech. For non‑native speakers, who may pause longer mid‑sentence, also tune VAD (vad_silence_threshold_secs, vad_threshold, and the min_*_duration settings) to avoid premature commits or cut‑offs.
Detect low‑confidence transcripts and ask the user to confirm. The realtime API returns word‑level logprob values in committed_transcript_with_timestamps, which you can use to trigger a polite clarification. The Agents quickstart also recommends instructing the agent, in its system prompt, to clarify unclear requests.
Let the agent adapt to non‑native speakers and code‑switching: enable the Language detection system tool and configure multiple languages for your agent so it can automatically switch output language based on the user’s speech.
Scribe v2 Realtime is described as accurate “across diverse accents,” with automatic language detection and text conditioning, and it can be used inside ElevenLabs Agents.
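
The language hint, VAD tuning, and previous_text suggestions above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official client: the endpoint URL is a placeholder, passing the options as query parameters is an assumption about how the handshake is configured, and build_handshake_url and RollingContext are hypothetical helper names. Only language_code, vad_silence_threshold_secs, vad_threshold, and previous_text come from the answer itself; the min_*_duration settings are left out because their exact names are not given here.

```python
from typing import Optional
from urllib.parse import urlencode

# Placeholder endpoint (assumption): use the real realtime STT URL from the docs.
REALTIME_URL = "wss://example.invalid/realtime-stt"


def build_handshake_url(
    language_code: Optional[str] = None,
    vad_silence_threshold_secs: Optional[float] = None,
    vad_threshold: Optional[float] = None,
) -> str:
    """Return the connection URL carrying only the options that were set."""
    options = {
        "language_code": language_code,
        "vad_silence_threshold_secs": vad_silence_threshold_secs,
        "vad_threshold": vad_threshold,
    }
    # Leave unset options out so the service defaults apply.
    set_options = {k: v for k, v in options.items() if v is not None}
    query = urlencode(set_options)
    return f"{REALTIME_URL}?{query}" if query else REALTIME_URL


class RollingContext:
    """Keeps the tail of the committed transcript to send as previous_text."""

    def __init__(self, max_chars: int = 500) -> None:
        self.max_chars = max_chars
        self._text = ""

    def add(self, committed: str) -> None:
        # Append the newly committed segment, then keep only the most recent characters.
        self._text = f"{self._text} {committed}".strip()[-self.max_chars:]

    @property
    def previous_text(self) -> str:
        return self._text
```

A caller would build the URL once per session, for example build_handshake_url(language_code="en", vad_silence_threshold_secs=1.5), and call add() after every committed segment. The numeric values are illustrative, not recommendations, and the character-based cut in RollingContext may split a word at the start of the context.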

Why this helps:

Keyterm prompting (batch Scribe v2) biases recognition toward domain‑relevant words without forcing them when the context doesn’t fit, which makes it well suited to homophones in task‑specific agents.
Realtime context (previous_text), language hints, and tuned VAD improve turn integrity and reduce accent‑related errors during live calls.
Programmatic clarification plus agent prompt guidance keeps the conversation correct when the ASR is uncertain.
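
The programmatic clarification step might look like the sketch below. It assumes the committed_transcript_with_timestamps event has been parsed into a dict with a "text" string and a "words" list whose items carry "text" and "logprob"; that shape, the 0.5 probability cutoff, and the function names are illustrative assumptions, so check the actual payload before relying on them.

```python
import math
from typing import Any, Dict, List, Optional


def uncertain_words(event: Dict[str, Any], min_prob: float = 0.5) -> List[str]:
    """Return the words whose probability, exp(logprob), falls below min_prob."""
    flagged = []
    for word in event.get("words", []):
        logprob = word.get("logprob")
        # Skip entries without a logprob (for example spacing tokens).
        if logprob is not None and math.exp(logprob) < min_prob:
            flagged.append(word["text"])
    return flagged


def clarification_prompt(event: Dict[str, Any], min_prob: float = 0.5) -> Optional[str]:
    """Return a confirmation question if any word is uncertain, else None."""
    if not uncertain_words(event, min_prob):
        return None
    text = event.get("text", "")
    return f'Just to confirm, did you say "{text}"?'
```

When the function returns None the agent proceeds normally; otherwise it can speak the returned question before acting on the turn. A per-word threshold is only one possible signal; averaging over the turn, or flagging only domain-critical words, may suit some agents better.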

Notes:

Keyterm prompting is batch-only (Scribe v2), so use it as a quick post‑turn verification step alongside your realtime stream if needed.
Scribe v2 and Scribe v2 Realtime support 90+ languages and are designed for real‑time agent scenarios.
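
One way to wire the post‑turn verification step is sketched below. The batch request itself is deliberately left to a caller-supplied recheck function (a hypothetical wrapper around the Scribe v2 batch call), because the exact request shape is not covered here; prepare_keyterms and verify_turn are illustrative names. The sketch only cleans the keyterm list, caps it at the 100-term limit mentioned above, and falls back to the realtime text if the recheck fails.

```python
from typing import Callable, Iterable, List, Sequence

MAX_KEYTERMS = 100  # limit stated in the answer above


def prepare_keyterms(terms: Iterable[str], limit: int = MAX_KEYTERMS) -> List[str]:
    """Trim terms, drop blanks and case-insensitive duplicates, cap at limit."""
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        key = term.lower()
        if term and key not in seen:
            seen.add(key)
            cleaned.append(term)
    return cleaned[:limit]


def verify_turn(
    realtime_text: str,
    audio: bytes,
    keyterms: Iterable[str],
    recheck: Callable[[bytes, Sequence[str]], str],
) -> str:
    """Prefer the batch recheck; keep the realtime text if it fails or is empty."""
    try:
        rechecked = recheck(audio, prepare_keyterms(keyterms)).strip()
    except Exception:
        # Broad catch is a sketch-level choice: never block the conversation
        # on a failed verification call.
        return realtime_text
    return rechecked or realtime_text
```

Because the recheck adds latency after each turn, it may be worth running it only for turns that contain low‑confidence words rather than for every turn.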
