#Hi, I’ve been building Conversational AI agents in Italian on your platform for months, and I love


gaunt vessel BOT
AI-generated Answer for ricasco

Short answer: the issues you’re seeing are known pain points when using low‑latency multilingual TTS/STT in Conversational AI. ElevenLabs provides a few documented mitigations; below are only the ones we can verify in the official docs.

What’s happening and what you can do

  • Email addresses get mis-captured

    • Why: emails and other alphanumeric strings are hard for STT to capture accurately and for TTS to repeat naturally. This is expected with compact, low‑latency models. ElevenLabs recommends normalizing or spelling such items explicitly. (elevenlabs.io)
    • Workarounds:
      • In your agent prompt, instruct the LLM to spell emails in words when speaking them back (e.g., “nome punto cognome chiocciola dominio punto com”) and to read URLs/descriptors explicitly; this is called out in the Conversational AI prompting guide. (elevenlabs.io)
      • Use Data Collection to extract an email with a strict format description (“user@domain.com”), examples of valid/invalid values, and validation notes; this improves how the LLM returns structured data even if the spoken repetition is imperfect. (elevenlabs.io)
      • If you transcribe outside Conversational AI via the Speech‑to‑Text API, set language_code="ita" (or the appropriate ISO code) to reduce misrecognition. (elevenlabs.io)
  • Occasional “invented language” or language drift

    • Why: multilingual agents can switch languages and accents. ElevenLabs documents language detection and multi‑language behavior, and notes that voice choice matters. (elevenlabs.io, help.elevenlabs.io)
    • Workarounds:
      • Lock the conversation to Italian: set the agent’s primary language to Italian and avoid adding Additional Languages (language selection is fixed for the call), or programmatically set the conversation language at start via overrides. (elevenlabs.io)
      • If you do need multilingual behavior, enable the Language Detection system tool and define supported languages explicitly; this controls when switching is allowed. (elevenlabs.io)
      • Use an Italian-native voice for the best pronunciation and reduced accent drift. (elevenlabs.io, help.elevenlabs.io)
  • Numbers repeated strangely when the agent speaks them back

    • Why: smaller, low‑latency models read numeric items less naturally (e.g., Flash v2.5 vs larger Multilingual models). ElevenLabs’ normalization guide explains these differences and how to mitigate them. (elevenlabs.io)
    • Workarounds:
      • Prompt-level normalization: instruct the LLM to convert numbers into a speakable form (spell out digits; add grouping pauses for phone numbers; expand dates/currencies). The docs provide concrete formats and example prompts. (elevenlabs.io)
      • API normalization: use apply_text_normalization where possible. Note that for the streaming/real‑time TTS endpoints you cannot force “on” for eleven_turbo_v2_5
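
If you would rather normalize in your own code than rely on the prompt alone, here is a minimal sketch of the "spell it in words" idea for emails and digit strings. It is not an ElevenLabs API: the helper names (speakable_email, speakable_digits), the Italian word choices for symbols, and the default grouping of three digits are all assumptions made for illustration.

```python
import re

# Assumed Italian readings for digits and common email symbols.
DIGITS_IT = {
    "0": "zero", "1": "uno", "2": "due", "3": "tre", "4": "quattro",
    "5": "cinque", "6": "sei", "7": "sette", "8": "otto", "9": "nove",
}
SYMBOLS_IT = {
    "@": "chiocciola",
    ".": "punto",
    "-": "trattino",
    "_": "trattino basso",
    "+": "più",
}
SPOKEN = {**DIGITS_IT, **SYMBOLS_IT}


def speakable_email(address: str) -> str:
    """Rewrite an email address as space-separated words for TTS."""
    # Tokens: runs of ASCII letters, single ASCII digits, or any other
    # single non-space character (symbols, accented letters).
    tokens = re.findall(r"[A-Za-z]+|[0-9]|[^A-Za-z0-9\s]", address)
    # Known digits/symbols become words; everything else is kept as-is.
    return " ".join(SPOKEN.get(token, token) for token in tokens)


def speakable_digits(number: str, group: int = 3) -> str:
    """Spell a digit string one digit at a time, with a comma between groups."""
    # Keep only ASCII digits, dropping spaces and punctuation.
    digits = [ch for ch in number if ch in DIGITS_IT]
    chunks = [digits[i:i + group] for i in range(0, len(digits), group)]
    return ", ".join(" ".join(DIGITS_IT[d] for d in chunk) for chunk in chunks)


print(speakable_email("nome.cognome@dominio.com"))
print(speakable_digits("347 123 4567"))
```

The commas between digit groups are only a guess at how to suggest a short pause; check how your chosen voice and model actually read the output before relying on it.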
foggy thunder

Is there a suggested model to achieve these results? So far I have only used OpenAI's GPT-4.1 mini and GPT-4o.