#what is inactivity timeout in eleven labs
In ElevenLabs, inactivity_timeout is a WebSocket query parameter for the Text-to-Speech streaming endpoints that controls how long a connection (and its contexts) may remain idle before the server closes it.
- Default: 20 seconds. You can raise it up to 180 seconds by adding inactivity_timeout=<seconds> to the TTS WebSocket URL (e.g., stream-input or multi-stream-input). (elevenlabs.io)
- Keeping the socket alive: if you need longer idle periods, either increase inactivity_timeout or periodically send a keep-alive character; the Help Center recommends sending a single space " " (not an empty string, which signals EOS and closes the socket). (help.elevenlabs.io)
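As a rough illustration of the query parameter described above, the following sketch adds inactivity_timeout to an existing WebSocket URL. The helper name with_inactivity_timeout, the example.invalid host, and the lower bound of 1 second are assumptions made for this example; only the 20 s default and the 180 s maximum come from the text above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

DEFAULT_TIMEOUT_S = 20  # default stated in the answer above
MAX_TIMEOUT_S = 180     # maximum stated in the answer above


def with_inactivity_timeout(ws_url: str, seconds: int) -> str:
    """Return ws_url with inactivity_timeout set, replacing any existing value.

    Hypothetical helper: the lower bound of 1 second is an assumption.
    """
    if not 1 <= seconds <= MAX_TIMEOUT_S:
        raise ValueError(
            f"inactivity_timeout must be between 1 and {MAX_TIMEOUT_S} seconds"
        )
    parts = urlsplit(ws_url)
    # Keep every existing query parameter except a previous inactivity_timeout.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "inactivity_timeout"
    ]
    query.append(("inactivity_timeout", str(seconds)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL; substitute the real stream-input endpoint and parameters.
print(with_inactivity_timeout("wss://example.invalid/stream-input?model_id=demo", 180))
```

Validating the value on the client side is a design choice for this sketch, so that an out-of-range timeout fails early instead of relying on how the server handles it.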
References:
- API reference (multi-stream-input): inactivity_timeout defaults to 20s; max 180s. (elevenlabs.io)
- Blog announcement: custom inactivity timeout support; max 180s. (elevenlabs.io)
- Help Center: guidance on keeping the WebSocket open and the 20s default. (help.elevenlabs.io)
Sources:
Multi-Context WebSocket | ElevenLabs Documentation
How can I keep the WebSocket open? – ElevenLabs
WebSocket improvements: reliability & custom timeout | ElevenLabs
Is it the time when the user doesn't speak or the LLM doesn't respond?
Neither. "Inactivity timeout" is a WebSocket-level idle timer: if no messages are sent on the socket for the configured period, the connection/context is closed. It's independent of whether the user is speaking or the LLM is responding. Default is 20 seconds, configurable up to 180 seconds via the inactivity_timeout query parameter; you can keep the socket alive by periodically sending a keepalive (e.g., a single space). (help.elevenlabs.io)
Note: If you're using the Agents Platform, "Turn Timeout" is separate: it controls how long the assistant waits during user silence before prompting again. (elevenlabs.io)
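To make the idle-timer behaviour concrete, here is a minimal sketch of client-side bookkeeping that decides when a keepalive is due. The IdleTracker class, the 5-second safety margin, and the JSON shape {"text": " "} are assumptions for illustration; the answer above only states that a single space keeps the socket open and that an empty string signals EOS.

```python
import json

KEEPALIVE_TEXT = " "  # single space: keeps the socket open
EOS_TEXT = ""         # empty string: signals EOS and closes the socket


def keepalive_message() -> str:
    # Assumption: text is sent as a JSON object with a "text" field.
    return json.dumps({"text": KEEPALIVE_TEXT})


class IdleTracker:
    """Tracks how long it has been since the last message was sent."""

    def __init__(self, inactivity_timeout_s: float = 20.0,
                 safety_margin_s: float = 5.0) -> None:
        if safety_margin_s >= inactivity_timeout_s:
            raise ValueError("safety margin must be smaller than the timeout")
        # Send the keepalive a little before the server-side timer would fire.
        self.send_after_s = inactivity_timeout_s - safety_margin_s
        self.last_sent_at = 0.0

    def mark_sent(self, now: float) -> None:
        """Record that any message (text or keepalive) was just sent."""
        self.last_sent_at = now

    def keepalive_due(self, now: float) -> bool:
        """True once the socket has been idle for send_after_s seconds."""
        return now - self.last_sent_at >= self.send_after_s
```

In a real client, the caller would pass time.monotonic() as now, send keepalive_message() whenever keepalive_due returns True, and call mark_sent after every outgoing message.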
Sources:
How can I keep the WebSocket open? – ElevenLabs
Conversation flow | ElevenLabs Documentation
What is the name of the timeout in the Agents Platform that controls how long ElevenLabs waits for a response from the LLM?
I can't find any documented, user-configurable "LLM response timeout" setting in ElevenLabs Agents. Timeouts at the LLM step are handled internally (failures such as API errors, timeouts, or empty responses trigger LLM Cascading), but no setting is exposed to configure an LLM wait duration. (elevenlabs.io)
Closest documented timeouts you can control:
- Turn Timeout: how long the agent waits for user input during silence. (elevenlabs.io)
- Response timeout (seconds) on tools: how long the agent waits for a client/webhook tool to return. (elevenlabs.io)
- WebSocket inactivity_timeout: connection-level keep-alive (not LLM wait), up to 180 seconds. (elevenlabs.io)
If you need to tolerate slower models, ElevenLabs suggests buffering strategies when integrating a custom LLM rather than configuring a specific LLM timeout. (elevenlabs.io)
Sources:
LLM Cascading | ElevenLabs Documentation
Conversation flow | ElevenLabs Documentation
Data Collection and Analysis with Agents Platform in Next.js | ElevenLabs Documentation
WebSocket improvements: reliability & custom timeout | ElevenLabs
Integrate your own model | ElevenLabs Documentation
I encountered the following error in the agents platform "This conversation failed with the following reason:
Generating the LLM response took over 20 seconds"
Why does this happen, and how can I fix it?
Short answer
- The error means the LLM step timed out. In Agents, timeouts from the selected LLM are treated as a failed turn and can trigger fallback/cascade logic. (elevenlabs.io)
How to fix or mitigate
- Use a faster, low-latency model for voice/chat. The LLM guide recommends speed-optimized models (e.g., Gemini Flash series, Claude Haiku, GPT-4o-mini) when real-time responses are needed. (elevenlabs.io)
- Enable LLM cascading so the platform automatically falls back to a backup model if your primary model times out or fails. (elevenlabs.io)
- If you bring your own/custom LLM and it's slow, stream partial output (e.g., "buffer words") so speech can start immediately while the rest is generated. (elevenlabs.io)
- If you're managing your own TTS WebSocket during the conversation, prevent context closures while waiting on the LLM by increasing inactivity_timeout (default 20s, up to 180s) or sending keep-alives. (elevenlabs.io)
- Be aware that enabling RAG adds a small (~500 ms) latency per response; if you're close to timing out, evaluate whether it's needed for that turn. (elevenlabs.io)
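The fallback idea behind LLM cascading can be sketched conceptually as below. This is not ElevenLabs' implementation (cascading is configured on the platform, not written by hand); it is only a generic illustration of "try the primary model, and move to a backup on error, timeout, or empty response". The function name and the callable-based model interface are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Sequence


def generate_with_fallback(models: Sequence[Callable[[str], str]],
                           prompt: str, timeout_s: float) -> str:
    """Try each model in order; skip any that errors, times out, or returns nothing."""
    for model in models:
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            reply = pool.submit(model, prompt).result(timeout=timeout_s)
        except FutureTimeout:
            continue  # model took longer than timeout_s: try the next one
        except Exception:
            continue  # model raised (e.g. an API error): try the next one
        finally:
            # Do not block on a slow call; its thread finishes in the background.
            pool.shutdown(wait=False)
        if reply:
            return reply  # first non-empty reply wins
    raise RuntimeError("all models failed, timed out, or returned empty responses")
```

Ordering the list from preferred to fastest backup mirrors the advice above: the primary model is used when it answers in time, and a quicker model covers the turns where it does not.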
Where to confirm the failure reason
- Check the Agents call/conversation history; failed calls show the exact reason in red (e.g., timeout). (help.elevenlabs.io)
Note
- "Turn Timeout" in Conversation Flow controls how long the agent waits during user silence; it is not the LLM-response timeout. (elevenlabs.io)
If you need, share which LLM you're using and whether you have cascading enabled, and I can point to the exact setting in your agent's configuration docs.
Sources:
LLM Cascading | ElevenLabs Documentation
Large Language Models (LLMs) | ElevenLabs Documentation
Integrate your own model | ElevenLabs Documentation
WebSocket | ElevenLabs Documentation
Retrieval-Augmented Generation | ElevenLabs Documentation
Where can I locate the reason for my call failing? – ElevenLabs
Conversation flow | ElevenLabs Documentation
So you mean the LLM timed out.
Okay. Fair.
How can I increase the timeout duration beyond the default value of 20 seconds?
Short answer: there's no documented setting in ElevenLabs Agents to increase the LLM-generation timeout beyond the default that triggers that "Generating the LLM response took over 20 seconds" error. The docs suggest handling slow/unstable LLMs via fallbacks or streaming rather than extending that limit.
What you can change today
- WebSocket inactivity timeout (not the LLM step): if you're using the TTS or multi-context WebSocket, set the inactivity_timeout query param up to 180 seconds, or send keep-alives. Default is 20s. Example: wss://…/stream-input?…&inactivity_timeout=180. (help.elevenlabs.io)
- Turn timeout (user silence): controls how long the agent waits in silence before prompting; configurable 1–30s in the agent's Advanced settings. (elevenlabs.io)
- Tool timeouts: per-tool response_timeout_secs is configurable (default 20s). Increase it for slow webhooks/integrations. (elevenlabs.io)
Ways to avoid the LLM timeout
- Enable LLM Cascading so the platform automatically falls back to a faster backup model when the primary times out. (elevenlabs.io)
- Pick faster models for voice conversations (e.g., "Flash" tiers) and keep prompts lean; RAG typically adds ≈500 ms per turn. (elevenlabs.io)
- If you use a Custom LLM, stream "buffer words" first (e.g., "Let me think about that… ") so audio starts while the model continues reasoning. (elevenlabs.io)
If you specifically need a longer LLM-generation timeout in Agents and can't meet latency targets with the above, there isn't a publicly documented knob for that today.
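For the Custom LLM case, the "buffer words" approach mentioned above might look roughly like this on the server side. It is a minimal sketch under assumptions: stream_with_buffer_words is a hypothetical helper, the model output is represented as a plain iterable of text chunks, and the wire format your custom LLM endpoint must return is not shown here.

```python
from typing import Iterable, Iterator


def stream_with_buffer_words(
    model_chunks: Iterable[str],
    buffer_phrase: str = "Let me think about that... ",
) -> Iterator[str]:
    """Yield a short filler phrase immediately, then relay the model's chunks."""
    # The filler is emitted before the (possibly slow) model is polled at all,
    # so downstream speech can begin while the model is still generating.
    yield buffer_phrase
    for chunk in model_chunks:
        if chunk:  # skip empty chunks
            yield chunk


# Example with a stand-in for a slow model's streamed output.
for piece in stream_with_buffer_words(iter(["The answer ", "is ready."])):
    print(piece, end="")
```

Because the function is a generator, the filler phrase reaches the caller before the first model chunk has been requested, which is the property that lets audio start early.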
Sources:
How can I keep the WebSocket open? – ElevenLabs
Conversation flow | ElevenLabs Documentation
Create tool | ElevenLabs Documentation
LLM Cascading | ElevenLabs Documentation
Large Language Models (LLMs) | ElevenLabs Documentation
Integrate your own model | ElevenLabs Documentation
We are using a custom LLM. What are the possible solutions?