#our calls are being ruined because of background noise
Sorry you’re dealing with that. ElevenLabs offers two reliable ways to remove background noise, depending on whether your audio is live or recorded:
- Live calls with an ElevenLabs agent: Use the Conversational AI WebRTC path, which provides built‑in echo cancellation and background‑noise removal designed for real‑time conversations. If you’re on HTTP streaming today, switching to WebRTC typically improves call clarity. (elevenlabs.io)
- Recorded calls (post‑call cleanup): Use Voice Isolator in the app or via API to strip background noise and return clean speech. It supports audio/video files up to 500 MB or 1 hour and is billed at 1,000 credits per minute. Supported formats include MP3, WAV, M4A, OGG/OPUS (audio) and MP4/MOV/MKV, etc. (help.elevenlabs.io)
Implementation tips:
- App workflow: Upload or record, then click “Isolate voice” to process and download the cleaned file. (help.elevenlabs.io)
- API: The Audio Isolation (Voice Isolator) API is available and supports streaming for lower‑latency processing. See the quickstart and streaming docs. (elevenlabs.io)
- If you’re using Voice Changer in your pipeline, enable remove_background_noise=true to clean the input as it’s converted. (elevenlabs.io)
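To budget post-call cleanup, the limits and pricing quoted above (500 MB or 1 hour per file, 1,000 credits per minute) can be turned into a quick estimate. This is only a sketch: the function names are hypothetical, it assumes billing is prorated by the second, and it assumes "MB" means decimal megabytes. Check your invoice for the actual rounding.

```python
# Figures taken from the answer above; rounding behaviour is an assumption.
CREDITS_PER_MINUTE = 1_000
MAX_DURATION_SECONDS = 60 * 60      # 1 hour
MAX_FILE_BYTES = 500 * 1_000_000    # 500 MB, assuming decimal megabytes


def isolator_credits(duration_seconds: float) -> float:
    """Estimate Voice Isolator credits for one recording (prorated, assumed)."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return duration_seconds / 60 * CREDITS_PER_MINUTE


def within_limits(size_bytes: int, duration_seconds: float) -> bool:
    """Return True if a file fits both the size and the duration limit."""
    return size_bytes <= MAX_FILE_BYTES and duration_seconds <= MAX_DURATION_SECONDS


# Example: a 90-second call recording of about 3 MB.
print(isolator_credits(90), within_limits(3_000_000, 90))
```

Recordings that exceed either limit would need to be split before upload.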
Would you like help setting up WebRTC for your live agent, or instructions to batch‑clean recorded calls with Voice Isolator? (elevenlabs.io)
Sources:
ElevenLabs Conversational AI now supports WebRTC
What is Voice Isolator? – ElevenLabs
Voice Isolator quickstart | ElevenLabs Documentation
Voice changer | ElevenLabs Documentation
In these settings (Normal/Eager/etc.), is this what now defines the VAD, since there is no toggle button to disable background noise (like there was in previous ElevenLabs versions)?
Short answer: No. “Eager / Normal / Patient” controls turn‑taking (how quickly the agent jumps in), not VAD or background‑noise suppression. See “Turn eagerness” in Conversation flow. (elevenlabs.io)
What you can control today (per docs):
- VAD for realtime STT: choose commit_strategy=vad or commit_strategy=manual and tune VAD thresholds (vad_silence_threshold_secs, vad_threshold, min_speech_duration_ms, min_silence_duration_ms) on the Scribe Realtime WebSocket API. This governs how speech segments are detected, which helps with noisy environments. (elevenlabs.io)
- Background‑noise removal (separate from VAD):
  - WebRTC transport in Conversational AI adds built‑in echo cancellation and background‑noise removal at the client layer. (elevenlabs.io)
  - Voice Changer API has remove_background_noise (boolean) for processing uploaded audio; this flag is specific to Voice Changer, not live agent calls. (elevenlabs.io)
  - Voice Isolator (app/API) removes background noise from files before or after calls, not during live streaming. (elevenlabs.io)
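As a rough illustration of the VAD settings listed above, here is a sketch that assembles them into a connection URL. The parameter names come from the list above. Everything else is an assumption: the helper name is hypothetical, the base URL is a placeholder, the numeric values are illustrative rather than recommended defaults, and passing the settings as query parameters should be checked against the Realtime docs.

```python
from urllib.parse import urlencode


def build_realtime_url(
    base_url: str,
    *,
    commit_strategy: str = "vad",
    vad_silence_threshold_secs: float = 1.5,   # illustrative value
    vad_threshold: float = 0.6,                # illustrative value
    min_speech_duration_ms: int = 250,         # illustrative value
    min_silence_duration_ms: int = 500,        # illustrative value
) -> str:
    """Build a URL carrying the commit strategy and, for VAD mode, its thresholds."""
    if commit_strategy not in ("vad", "manual"):
        raise ValueError("commit_strategy must be 'vad' or 'manual'")

    params = {"commit_strategy": commit_strategy}
    if commit_strategy == "vad":
        # VAD thresholds are only meaningful when VAD decides the commits.
        params.update(
            vad_silence_threshold_secs=vad_silence_threshold_secs,
            vad_threshold=vad_threshold,
            min_speech_duration_ms=min_speech_duration_ms,
            min_silence_duration_ms=min_silence_duration_ms,
        )
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; substitute the documented Realtime WebSocket URL.
print(build_realtime_url("wss://example.invalid/realtime", vad_threshold=0.7))
```

In a noisy environment, a higher vad_threshold and a longer min_speech_duration_ms would generally make the detector harder to trigger, at the cost of possibly clipping quiet or very short utterances.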
I couldn’t find a documented “disable background noise” UI toggle in the current Agents dashboard. If you share whether you’re using the Agents UI or the Scribe Realtime API, I can point to the exact place to set commit_strategy or recommend VAD parameters. (elevenlabs.io)
Sources:
Conversation flow | ElevenLabs Documentation
Realtime | ElevenLabs Documentation
ElevenLabs Conversational AI now supports WebRTC
Voice changer | ElevenLabs Documentation
Voice isolator | ElevenLabs Documentation
I make an HTTP request to the agent, then it fetches the initiation data, where I override the system prompt, first message, voice settings, etc. But about VAD, I don't know.
Short answer: in ElevenLabs Agents there is no documented server-side toggle or override for VAD in the conversation initiation metadata. The “Eager/Normal/Patient” setting you see controls turn-taking responsiveness, not VAD. You can change turn eagerness in the dashboard or via API, but it doesn’t adjust VAD thresholds.
What you can do today
- Read VAD activity: subscribe to the vad_score client event. This gives a continuous 0–1 score so you can build your own logic for when to treat input as speech vs. noise.
- Client-side VAD controls (if you embed our UI/SDK): the Speech Input component exposes VAD knobs such as vadThreshold, vadSilenceThresholdSecs, minSpeechDurationMs, and minSilenceDurationMs, plus default microphone constraints echoCancellation and noiseSuppression for background noise. Use these when you control the client.
- STT model behavior: Scribe v2 Realtime (used under the hood) includes built-in VAD and also supports “manual commit” if you prefer to decide when to finalize segments instead of relying on VAD.
Practical guidance for noisy environments
- If you embed the web client, keep echoCancellation and noiseSuppression enabled (defaults in Speech Input), and tune vadThreshold/vadSilenceThresholdSecs until false positives drop.
- Use vad_score to gate user turns (e.g., only accept speech when the score stays above a threshold for N ms).
- Remember: “Eager/Normal/Patient” only affects how quickly the agent takes its turn; it’s separate from VAD/noise handling.
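The vad_score gating idea above (only accept speech when the score stays above a threshold for N ms) could look roughly like this. It is a sketch, not SDK code: the VadGate class is hypothetical, and it assumes your client hands you each vad_score value together with a millisecond timestamp. The default threshold and hold time are illustrative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VadGate:
    """Report speech only after the score has stayed high for hold_ms."""

    threshold: float = 0.6   # illustrative; tune against real call audio
    hold_ms: int = 300       # illustrative; the "N ms" from the tip above
    _above_since: Optional[int] = field(default=None, init=False)

    def update(self, timestamp_ms: int, score: float) -> bool:
        """Feed one vad_score sample; return True while speech is confirmed."""
        if score < self.threshold:
            # Any dip below the threshold resets the timer.
            self._above_since = None
            return False
        if self._above_since is None:
            self._above_since = timestamp_ms
        return timestamp_ms - self._above_since >= self.hold_ms


gate = VadGate(threshold=0.6, hold_ms=300)
for ts, score in [(0, 0.9), (100, 0.8), (300, 0.7), (400, 0.2), (500, 0.9)]:
    print(ts, score, gate.update(ts, score))
```

Note the trade-off: hold_ms is added directly to the time before speech is recognized, so short bursts of noise are filtered at the price of that much extra delay.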
Optional (post-call cleanup)
- For recorded audio (not live calls), you can remove background noise with the Voice Isolator tool/API.
If you need specific API field names for setting turn eagerness or client VAD parameters in your stack, tell me which SDK/client you’re using and I’ll point to the exact snippet.
I use n8n to start the call as I said.
Thanks—that helps. If you’re triggering calls from n8n, you can call our telephony endpoints directly and pass your per‑call overrides in the same request.
Ways to start the call from n8n (HTTP Request node)
- If you manage telephony via SIP in ElevenLabs: POST /v1/convai/sip-trunk/outbound-call with headers xi-api-key and JSON body { agent_id, agent_phone_number_id, to_number, conversation_initiation_client_data? }. This immediately places the call. (elevenlabs.io)
- If you keep Twilio as the telephony host: POST /v1/convai/twilio/register-call to get TwiML you return to Twilio (works for inbound or outbound flows you orchestrate). Ensure your agent uses μ-law 8000 Hz for both input and output when using Twilio. (elevenlabs.io)
- For large campaigns, you can also trigger Batch Calling from the dashboard or API. (elevenlabs.io)
Passing your system prompt / first message / voice per call
- Put them under conversation_initiation_client_data.conversation_config_override (e.g., agent.prompt.prompt, agent.first_message, agent.language, tts.voice_id). Enable those fields in the agent’s Security tab if you plan to override them. (elevenlabs.io)
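For reference, here is a sketch of how the JSON body for the SIP trunk outbound-call request might be assembled with those overrides. The field names follow the ones mentioned above; the helper function and all example values are hypothetical, so verify the exact schema against the API reference before relying on it. In n8n you would paste the resulting JSON into the HTTP Request node body and send it with the xi-api-key header.

```python
import json
from typing import Any, Dict, Optional


def build_outbound_call_body(
    agent_id: str,
    agent_phone_number_id: str,
    to_number: str,
    *,
    system_prompt: Optional[str] = None,
    first_message: Optional[str] = None,
    language: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the request body, including only the overrides that were provided."""
    agent: Dict[str, Any] = {}
    if system_prompt is not None:
        agent["prompt"] = {"prompt": system_prompt}   # agent.prompt.prompt
    if first_message is not None:
        agent["first_message"] = first_message        # agent.first_message
    if language is not None:
        agent["language"] = language                  # agent.language

    override: Dict[str, Any] = {}
    if agent:
        override["agent"] = agent
    if voice_id is not None:
        override["tts"] = {"voice_id": voice_id}      # tts.voice_id

    body: Dict[str, Any] = {
        "agent_id": agent_id,
        "agent_phone_number_id": agent_phone_number_id,
        "to_number": to_number,
    }
    if override:
        body["conversation_initiation_client_data"] = {
            "conversation_config_override": override
        }
    return body


# Placeholder IDs and number; replace with your own values.
body = build_outbound_call_body(
    "agent_placeholder",
    "phone_placeholder",
    "+15550100",
    system_prompt="You are a polite scheduling assistant.",
    first_message="Hi, thanks for taking the call.",
    voice_id="voice_placeholder",
)
print(json.dumps(body, indent=2))
```

Overrides that are not enabled in the agent's Security tab would presumably be rejected or ignored, so enable each field you send.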
About VAD and background noise
- There’s no user‑facing “disable background noise” toggle in Agents. Turn eagerness (Eager/Normal/Patient) controls how quickly the agent takes the floor; it does not disable VAD. Configure it in Agent settings or via API. (elevenlabs.io)
- VAD is part of the turn‑taking stack and surfaces as vad_score events (0–1 probability). There’s no documented API to tune its threshold. (elevenlabs.io)
- To reduce false triggers from background noise on calls:
  - Set turn eagerness to Patient and use longer turn timeouts; you can also raise the initial_wait_time (supported via recent turn config overrides). (elevenlabs.io)
  - Optionally add the Skip Turn system tool so the agent can explicitly pause and wait quietly. (elevenlabs.io)
  - If your client is web/app (not PSTN), use our WebRTC path for built‑in echo cancellation and background noise removal. (elevenlabs.io)
If you tell me whether your n8
If I set eagerness to Patient and use a longer turn timeout, that means higher latency!