what is inactivity timeout in eleven labs | ElevenLabs | Page 1

grand jungleBOT Oct 17, 2025, 7:50 AM

AI-generated Answer for kishore_vb

In ElevenLabs, inactivity_timeout is a WebSocket query parameter for the Text-to-Speech streaming endpoints that controls how long a connection (and its contexts) may remain idle before the server closes it.

Default: 20 seconds. You can raise it up to 180 seconds by adding inactivity_timeout=<seconds> to the TTS WebSocket URL (e.g., stream-input or multi-stream-input). (elevenlabs.io)
Keeping the socket alive: if you need longer idle periods, either increase inactivity_timeout or periodically send a keep‑alive character; the Help Center recommends sending a single space " " (not an empty string, which signals EOS and closes the socket). (help.elevenlabs.io)
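
As a rough illustration of the two options above, the sketch below builds a stream-input URL with inactivity_timeout set and shows the difference between a keep-alive message and an end-of-stream message. The host and path, the {"text": ...} message shape, and the lower bound of 1 second are assumptions for illustration; check the API reference before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumed base URL for the TTS WebSocket endpoints.
BASE = "wss://api.elevenlabs.io/v1/text-to-speech"
MAX_INACTIVITY_TIMEOUT = 180  # documented maximum, in seconds


def build_stream_url(voice_id: str, inactivity_timeout: int = 20, **params: str) -> str:
    """Return a stream-input URL with inactivity_timeout added to the query."""
    # The lower bound of 1 is an assumption; only the 180s maximum is documented here.
    if not 1 <= inactivity_timeout <= MAX_INACTIVITY_TIMEOUT:
        raise ValueError("inactivity_timeout must be between 1 and 180 seconds")
    query = urlencode({**params, "inactivity_timeout": inactivity_timeout})
    return f"{BASE}/{voice_id}/stream-input?{query}"


# A single space keeps the socket open; an empty string signals EOS and closes it.
KEEP_ALIVE = json.dumps({"text": " "})
END_OF_STREAM = json.dumps({"text": ""})
```

For example, build_stream_url("VOICE_ID", 180) yields a URL ending in stream-input?inactivity_timeout=180, where VOICE_ID is a placeholder.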

References:

API reference (multi-stream-input): inactivity_timeout defaults to 20s; max 180s. (elevenlabs.io)
Blog announcement: custom inactivity timeout support; max 180s. (elevenlabs.io)
Help Center: guidance on keeping the WebSocket open and the 20s default. (help.elevenlabs.io)

Sources:
Multi-Context WebSocket | ElevenLabs Documentation
How can I keep the WebSocket open? – ElevenLabs
WebSocket improvements: reliability & custom timeout | ElevenLabs

terse ibex Oct 17, 2025, 7:52 AM

Is it the time during which the user doesn't speak or the LLM doesn't respond?

grand jungleBOT Oct 17, 2025, 7:52 AM

AI-generated Answer for kishore_vb

Neither. “Inactivity timeout” is a WebSocket-level idle timer: if no messages are sent on the socket for the configured period, the connection/context is closed. It’s independent of whether the user is speaking or the LLM is responding. Default is 20 seconds, configurable up to 180 seconds via the inactivity_timeout query parameter; you can keep the socket alive by periodically sending a keepalive (e.g., a single space). (help.elevenlabs.io)
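
To make the "idle timer" idea concrete, here is a toy model (not ElevenLabs code) of a timer that is reset by any message on the socket, including a keep-alive. The class name, the method names, and the exact expiry comparison are hypothetical; the real server behavior at the boundary is not documented in this thread.

```python
class IdleTimer:
    """Toy model of a connection-level idle timer (illustrative only)."""

    def __init__(self, timeout_s: float = 20.0) -> None:
        self.timeout_s = timeout_s
        self.last_message_at = 0.0

    def on_message(self, now: float) -> None:
        # Any message on the socket, including a keep-alive, resets the timer.
        self.last_message_at = now

    def is_expired(self, now: float) -> bool:
        # Expiry depends only on socket traffic, not on who is speaking.
        return now - self.last_message_at >= self.timeout_s


timer = IdleTimer(timeout_s=20.0)
timer.on_message(now=5.0)          # text chunk sent at t=5s
print(timer.is_expired(now=24.0))  # False: only 19s idle
timer.on_message(now=24.0)         # keep-alive " " sent at t=24s
print(timer.is_expired(now=40.0))  # False: only 16s idle since the keep-alive
print(timer.is_expired(now=44.0))  # True: 20s with no messages
```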

Note: If you’re using the Agents Platform, “Turn Timeout” is separate—it controls how long the assistant waits during user silence before prompting again. (elevenlabs.io)

Sources:
How can I keep the WebSocket open? – ElevenLabs
Conversation flow | ElevenLabs Documentation

terse ibex Oct 17, 2025, 7:54 AM

What is the name of the timeout in the Agents Platform that controls how long ElevenLabs waits for a response from the LLM?

grand jungleBOT Oct 17, 2025, 7:54 AM

AI-generated Answer for kishore_vb

I can’t find any documented, user-configurable “LLM response timeout” setting in ElevenLabs Agents. Timeouts at the LLM step are handled internally (failures such as API errors, timeouts, or empty responses trigger LLM Cascading), but no knob is exposed to set an LLM wait duration. (elevenlabs.io)

Closest documented timeouts you can control:

Turn Timeout: how long the agent waits for user input during silence. (elevenlabs.io)
Response timeout (seconds) on tools: how long the agent waits for a client/webhook tool to return. (elevenlabs.io)
WebSocket inactivity_timeout: connection-level keep-alive (not LLM wait), up to 180 seconds. (elevenlabs.io)

If you need to tolerate slower models, ElevenLabs suggests buffering strategies when integrating a custom LLM rather than configuring a specific LLM timeout. (elevenlabs.io)

Sources:
LLM Cascading | ElevenLabs Documentation
Conversation flow | ElevenLabs Documentation
Data Collection and Analysis with Agents Platform in Next.js | ElevenLabs Documentation
WebSocket improvements: reliability & custom timeout | ElevenLabs
Integrate your own model | ElevenLabs Documentation

terse ibex Oct 17, 2025, 8:01 AM

I encountered the following error in the Agents Platform: "This conversation failed with the following reason:
Generating the LLM response took over 20 seconds"

Why does this happen, and how can I fix it?

grand jungleBOT Oct 17, 2025, 8:01 AM

AI-generated Answer for kishore_vb

Short answer

The error means the LLM step timed out: the selected model took more than 20 seconds to generate its response. In Agents, timeouts from the selected LLM are treated as a failed turn and can trigger fallback/cascade logic. (elevenlabs.io)

How to fix or mitigate

Use a faster, low‑latency model for voice/chat. The LLM guide recommends speed‑optimized models (e.g., Gemini Flash series, Claude Haiku, GPT‑4o‑mini) when real‑time responses are needed. (elevenlabs.io)
Enable LLM cascading so the platform automatically falls back to a backup model if your primary model times out or fails. (elevenlabs.io)
If you bring your own/custom LLM and it’s slow, stream partial output (e.g., “buffer words”) so speech can start immediately while the rest is generated. (elevenlabs.io)
If you’re managing your own TTS WebSocket during the conversation, prevent context closures while waiting on the LLM by increasing inactivity_timeout (default 20s, up to 180s) or sending keep‑alives. (elevenlabs.io)
Be aware that enabling RAG adds a small (~500 ms) latency per response; if you’re close to timing out, evaluate whether it’s needed for that turn. (elevenlabs.io)
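
The fallback idea behind cascading can be sketched in a few lines. This is only an illustration of "try the primary model, move on if it times out, fails, or returns nothing"; it is not how the platform implements LLM Cascading, and generate_with_fallback, slow_model, and fast_model are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, Sequence

Model = Callable[[str], Awaitable[str]]


async def generate_with_fallback(prompt: str, models: Sequence[Model], timeout_s: float) -> str:
    """Try each model in order; move on after a timeout, error, or empty reply."""
    for model in models:
        try:
            reply = await asyncio.wait_for(model(prompt), timeout=timeout_s)
        except Exception:  # includes asyncio.TimeoutError
            continue  # timed out or failed: try the next model
        if reply.strip():
            return reply
    raise RuntimeError("all models failed or timed out")


async def slow_model(prompt: str) -> str:
    await asyncio.sleep(1.0)  # stands in for a model that is too slow
    return "slow answer"


async def fast_model(prompt: str) -> str:
    return "fast answer"
```

With a short timeout, asyncio.run(generate_with_fallback("hi", [slow_model, fast_model], timeout_s=0.05)) skips the slow model and returns the backup's reply.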

Where to confirm the failure reason

Check the Agents call/conversation history; failed calls show the exact reason in red (e.g., timeout). (help.elevenlabs.io)

Note

“Turn Timeout” in Conversation Flow controls how long the agent waits during user silence; it is not the LLM‑response timeout. (elevenlabs.io)

If it helps, share which LLM you’re using and whether you have cascading enabled. I can then point to the exact setting in your agent’s configuration docs.

Sources:
LLM Cascading | ElevenLabs Documentation
Large Language Models (LLMs) | ElevenLabs Documentation
Integrate your own model | ElevenLabs Documentation
WebSocket | ElevenLabs Documentation
Retrieval-Augmented Generation | ElevenLabs Documentation
Where can I locate the reason for my call failing? – ElevenLabs
Conversation flow | ElevenLabs Documentation

terse ibex Oct 17, 2025, 8:04 AM

So you mean the LLM timed out.
Okay. Fair.

How can I increase the timeout beyond the default of 20 seconds?

grand jungleBOT Oct 17, 2025, 8:04 AM

AI-generated Answer for kishore_vb

Short answer: there’s no documented setting in ElevenLabs Agents to increase the LLM-generation timeout beyond the default that triggers that “Generating the LLM response took over 20 seconds” error. The docs suggest handling slow/unstable LLMs via fallbacks or streaming rather than extending that limit.

What you can change today

WebSocket inactivity timeout (not the LLM step): if you’re using the TTS or multi-context WebSocket, set the inactivity_timeout query param up to 180 seconds, or send keep‑alives. Default is 20s. Example: wss://…/stream-input?…&inactivity_timeout=180. (help.elevenlabs.io)
Turn timeout (user silence): controls how long the agent waits in silence before prompting; configurable 1–30s in the agent’s Advanced settings. (elevenlabs.io)
Tool timeouts: per-tool response_timeout_secs is configurable (default 20s). Increase it for slow webhooks/integrations. (elevenlabs.io)
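
For the keep-alive option on a TTS WebSocket you manage yourself, a periodic sender might look like the sketch below. It assumes you pass in your own async send function (for example, the send method of whatever WebSocket client you use) and that a {"text": " "} message counts as activity; keep_alive, its parameters, and the 15-second interval are hypothetical choices.

```python
import asyncio
import json
from typing import Awaitable, Callable

Send = Callable[[str], Awaitable[None]]


async def keep_alive(send: Send, stop: asyncio.Event, interval_s: float = 15.0) -> None:
    """Send a single-space message every interval_s seconds until stop is set."""
    while not stop.is_set():
        try:
            # Wake early if stop is set; otherwise time out and send a keep-alive.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            await send(json.dumps({"text": " "}))
```

Keeping interval_s below the configured inactivity_timeout leaves some margin before the server's idle timer expires; set the stop event once real text is flowing again or the conversation ends.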

Ways to avoid the LLM timeout

Enable LLM Cascading so the platform automatically falls back to a faster backup model when the primary times out. (elevenlabs.io)
Pick faster models for voice conversations (e.g., “Flash” tiers) and keep prompts lean; RAG typically adds ≈500 ms per turn. (elevenlabs.io)
If you use a Custom LLM, stream “buffer words” first (e.g., “Let me think about that… ”) so audio starts while the model continues reasoning. (elevenlabs.io)
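
For a custom LLM, the "buffer words" idea could be sketched as follows: emit a short filler phrase as the first streamed chunk, then forward the model's real output as it arrives. This assumes the custom LLM endpoint streams OpenAI-style chat completion chunks as server-sent events; sse_chunk and stream_with_buffer are hypothetical helpers, and fields such as id, model, and created are omitted for brevity.

```python
import json
from typing import Iterable, Iterator

BUFFER_WORDS = "Let me think about that... "  # trailing space keeps words separated


def sse_chunk(content: str) -> str:
    """Format one OpenAI-style streaming chunk as a server-sent event."""
    payload = {
        "object": "chat.completion.chunk",
        "choices": [{"index": 0, "delta": {"content": content}, "finish_reason": None}],
    }
    return f"data: {json.dumps(payload)}\n\n"


def stream_with_buffer(model_chunks: Iterable[str]) -> Iterator[str]:
    """Yield the buffer phrase first, then the model's chunks as they arrive."""
    yield sse_chunk(BUFFER_WORDS)  # sent before the slow model produces anything
    for text in model_chunks:
        yield sse_chunk(text)
    yield "data: [DONE]\n\n"
```

Because model_chunks is consumed lazily, the filler phrase is yielded before the slow model is asked for its first token, so speech can start while generation continues.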

If you specifically need a longer LLM-generation timeout in Agents and can’t meet latency targets with the above, there isn’t a publicly documented knob for that today.

Sources:
How can I keep the WebSocket open? – ElevenLabs
Conversation flow | ElevenLabs Documentation
Create tool | ElevenLabs Documentation
LLM Cascading | ElevenLabs Documentation
Large Language Models (LLMs) | ElevenLabs Documentation
Integrate your own model | ElevenLabs Documentation

terse ibex Oct 17, 2025, 8:08 AM

We are using a custom LLM. What are the possible solutions?
