#What’s the best way to capture email addresses correctly. As the agent currently has issues with em
Short answer:
- Treat emails as structured data. In your agent’s system prompt, add explicit “character normalization” rules and examples (e.g., user says “max punkt mueller at firma punkt de” → agent converts to “max.mueller@firma.de”). ElevenLabs’ prompting guide recommends normalizing spoken vs. written forms for identifiers like emails. (elevenlabs.io)
- Use Scribe v2 Speech-to-Text with entity detection to extract email_address from transcripts. This adds a machine-verified parse of the email alongside the raw text. (elevenlabs.io)
- Ensure German is the recognition language. Either set language_code="deu" in STT (recommended for voice agents targeting German), or enable the Language detection system tool to auto-switch. Scribe v2 supports 90+ languages including German. (elevenlabs.io)
- If you control a web or mobile surface, prefer typed capture: have the agent trigger a Client Tool to open a form/input field for the email, then pass the typed value back to the conversation context. This avoids ASR errors entirely. (elevenlabs.io)
- Bias recognition for your known brand/domain terms with Scribe v2 Keyterm Prompting (e.g., “@firma.de”, product names). This improves accuracy for uncommon tokens in the address. (elevenlabs.io)
- For chat-only or mixed experiences where accuracy is critical, use Chat Mode for that step (text-only message exchange) and resume voice after confirmation. (elevenlabs.io)
Implementation notes:
- System prompt snippet (aligns with ElevenLabs guidance on normalization):
- “When collecting an email, ask for it in spoken form, then convert to written form before using tools. Normalization: ‘at’→‘@’, ‘punkt/dot’→‘.’, remove spaces; handle German spellings/umlauts as provided by the user. Read back the spoken form (‘max punkt mueller at firma punkt de’) for confirmation, then submit the written form.” (elevenlabs.io)
- STT setup:
- Batch/Realtime STT: set model_id="scribe_v2"; set language_code="deu" to reduce misclassification; enable entity_detection=["pii"] or ["email_address"] to return a structured email entity. (elevenlabs.io)
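The normalization rules in the prompt snippet above can also be applied in your own code, for example as a safety net before a tool call. The following is a minimal sketch, not ElevenLabs functionality: the function name, the token table, and the optional umlaut transliteration are all assumptions made for illustration, and a real deployment would need a longer token list.

```python
# Spoken tokens mapped to the symbol they stand for (German and English).
# Illustrative assumption: this table is not an ElevenLabs-provided list.
SPOKEN_SYMBOLS = {
    "at": "@",
    "ät": "@",
    "klammeraffe": "@",
    "punkt": ".",
    "dot": ".",
    "minus": "-",
    "bindestrich": "-",
    "dash": "-",
    "unterstrich": "_",
    "underscore": "_",
}

# Optional transliteration for callers who say the umlaut instead of spelling it.
UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def normalize_spoken_email(utterance: str, transliterate_umlauts: bool = False) -> str:
    """Convert a spoken email into its written form.

    Symbol words become symbols, everything is lowercased, and spaces are removed.
    Umlauts are kept as provided unless transliterate_umlauts is True.
    """
    parts = []
    for token in utterance.lower().split():
        if token in SPOKEN_SYMBOLS:
            parts.append(SPOKEN_SYMBOLS[token])
        elif transliterate_umlauts:
            parts.append(token.translate(UMLAUTS))
        else:
            parts.append(token)
    return "".join(parts)


print(normalize_spoken_email("max punkt mueller at firma punkt de"))
```

One known limitation of this sketch: a token such as "at" or "dot" that is genuinely part of the address would be converted to a symbol, which is one more reason to keep the confirmation step.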
Why this works:
- Normalization + confirmation addresses symbol/pronunciation issues (“@”, “.”, umlaut spellings) in German, as recommended for structured data. (elevenlabs.io)
- Entity detection gives you a machine-extracted email field you can trust more than free-form text. (elevenlabs.io)
- Fixing the ASR language to German or enabling language detection prevents cross-language drift. (elevenlabs.io)
- Client Tools or Chat Mode eliminate ASR for the high-precision step when possible. (elevenlabs.io)
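The confirmation step can be sketched in code as well: check that the written form has the shape of an email, then build the spoken German form for the read-back. This is an illustration under assumptions, not part of any ElevenLabs API. The helper names are hypothetical, and the regular expression is a deliberately simple shape check rather than a full RFC 5322 validator.

```python
import re

# Simple shape check: local part, one "@", and a domain with at least one dot.
# Assumption: good enough as a gate before a tool call, not a full validator.
EMAIL_SHAPE = re.compile(r"^[a-z0-9._%+-]+@[a-z0-9-]+(\.[a-z0-9-]+)+$")

# Written symbols mapped to the German words used when reading back.
WRITTEN_TO_SPOKEN_DE = {"@": "at", ".": "punkt", "-": "minus", "_": "unterstrich"}


def looks_like_email(value: str) -> bool:
    """Return True if the value has the basic shape of an email address."""
    return EMAIL_SHAPE.fullmatch(value.lower()) is not None


def spoken_readback(email: str) -> str:
    """Turn a written email into the spoken form used for confirmation."""
    # The capture group keeps the separators in the split result.
    pieces = re.split(r"([@._-])", email)
    return " ".join(WRITTEN_TO_SPOKEN_DE.get(p, p) for p in pieces if p)


candidate = "max.mueller@firma.de"
if looks_like_email(candidate):
    print(spoken_readback(candidate))
```

If the shape check fails, the agent can ask the caller to repeat or spell the address instead of submitting a malformed value.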
References:
- Transcription (Scribe v2 features, supported languages) (elevenlabs.io)
- STT Quickstart (language_code parameter) (elevenlabs.io)
- Entity detection (
But those are more about speech-to-text. I am using German as the default language for our agents.
You’re right: this is about Agents, not generic STT. Here’s how to get reliable email capture when your Agents’ default language is German:
- Configure the agent’s primary language to German (and optionally add others). When you add multiple languages you can enable automatic language switching; otherwise it will stay in the primary language. Also pick a German-appropriate voice for natural pronunciation. (elevenlabs.io)
- Under the hood, Agents use ElevenLabs’ Scribe for transcription. For live calls, use Scribe v2 Realtime (150 ms latency, 90+ languages including German) which is built for agentic use cases and is available in Agents. (elevenlabs.io)
- Add explicit data extraction for email in your Agent configuration. In Data Collection, define an email field and specify “extract the customer’s email address in standard format (user@domain.com).” This ensures the LLM returns a normalized email value you can trust. (elevenlabs.io)
- Add character normalization rules to your system prompt so the agent reliably converts spoken inputs to written email addresses (e.g., “john dot smith at company dot com” → “john.smith@company.com”). The Prompting Guide shows exactly how to specify this and when to pass the normalized string to tools. (elevenlabs.io)
- If you control a small set of domains (e.g., your company’s), bias the recognizer toward them with Scribe v2 Keyterm Prompting (up to 100 terms). This helps with brand and domain spellings in German conversations. (elevenlabs.io)
- For cases where voice capture proves brittle (names, rare spellings), collect the email via Chat Mode. It’s designed for precise inputs like order IDs and email addresses and can be enabled alongside voice. (elevenlabs.io)
- If callers may switch languages mid-call (German/English), Agents can auto-detect and switch; enable this only if you want that behavior. (elevenlabs.io)
These settings keep the agent in German by default, normalize what it hears into a valid email string, and add biasing where it matters most. If you want, I can draft the exact Data Collection schema and a prompt snippet you can paste into your agent. (elevenlabs.io)
Sources:
Language | ElevenLabs Documentation
Introducing Scribe v2 Realtime
Data Collection | ElevenLabs Documentation
Prompting guide | ElevenLabs Documentation
Keyterm prompting | ElevenLabs Documentation
ElevenLabs — ElevenLabs Agents now support Chat Mode | ElevenLabs
Deploy Conversational AI Agents in Minutes, Not Months
But would entity detection also help? Since we would be doing tool calling during the call, I suppose not.