if im having a conversation, and my token amount is limited, how can i let the agent know about my | ElevenLabs | Page 1

unique vectorBOT Apr 12, 2026, 3:17 AM

AI-generated Answer for zqilk

Short answer: Pass a “budget” value into the agent at runtime, keep it updated via a tool, and instruct the agent to gracefully end using the End Call system tool when the threshold is reached.

How to do it with what ElevenLabs documents today:

- Provide the budget to the agent at session start with Dynamic Variables. Add something like {{token_budget}} or {{remaining_tokens}} to your system prompt and pass the value at runtime. Dynamic Variables are explicitly supported in system prompts, first messages, and tool parameters. (elevenlabs.io)
- Keep the budget current mid-conversation with a tool:
  - Create a Server Tool (webhook) such as check_remaining_budget that returns JSON with the latest remaining_tokens. Mark what to extract into dynamic variables using Dynamic Variable assignment so the value is written back into the conversation context. (elevenlabs.io)
  - If you host the UI, you can also use a Client Tool and enable “Wait for response” so the agent receives the returned value before continuing. (elevenlabs.io)
- End naturally (not abruptly) with the End Call system tool:
  - Ensure the End Call tool is enabled (auto-added for dashboard agents; add it manually for API/SDK-created agents). You can supply custom prompt rules like “When {{remaining_tokens}} is below N, summarize, confirm there’s nothing else, then end.” The tool also supports an optional “message” argument for a brief farewell before hang-up. (elevenlabs.io)
- Reduce how many tokens the model consumes while you talk:
  - In Model configuration, lower the Thinking budget and keep Reasoning effort set to None for conversational turns to limit internal reasoning tokens and keep responses concise. (elevenlabs.io)
- Plan ahead if helpful: before starting, estimate likely LLM token usage for your agent with the Calculate expected LLM usage API (accepts prompt_length, number_of_pages, rag_enabled). Use that to set the budget you inject via Dynamic Variables. (elevenlabs.io)

These pieces let the agent know the current budget and end the conversation gracefully based on your own threshold, rather than timing out unexpectedly. (elevenlabs.io)
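
As a rough illustration of the server-tool step, here is a sketch of the JSON body a check_remaining_budget webhook could return. Only remaining_tokens comes from the answer above; the function name, the wrap_up_threshold parameter, and the should_wrap_up field are hypothetical, and the real response shape is whatever you map in the tool's dynamic variable assignment.

```python
import json


def build_budget_response(token_limit: int, tokens_used: int, wrap_up_threshold: int = 200) -> str:
    """Return the JSON body for a hypothetical check_remaining_budget webhook."""
    # Clamp at zero so the agent never sees a negative budget.
    remaining = max(token_limit - tokens_used, 0)
    body = {
        "remaining_tokens": remaining,
        # Hypothetical convenience flag; a prompt rule can also compare the number itself.
        "should_wrap_up": remaining <= wrap_up_threshold,
    }
    return json.dumps(body)


if __name__ == "__main__":
    print(build_budget_response(token_limit=5000, tokens_used=4900))
```

Your own backend would wrap this in whatever web framework it already uses; the point is only that the tool returns a small, flat JSON object the agent can read.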

hollow forge Apr 12, 2026, 3:22 AM

do i also instruct the agent to constantly use the tool to retrieve the token count? or can it automatically sense that and send it to the agent

unique vectorBOT Apr 12, 2026, 3:22 AM

AI-generated Answer for zqilk

Short answer: there’s no automatic “token/balance sensing.” You must provide that signal to the agent.

What to do instead

Pass the balance at conversation start via dynamic variables or overrides, so the LLM can factor it into decisions. These are designed for injecting per-conversation data like “remaining budget” into prompts and tool params. (elevenlabs.io)
Or expose a server tool that fetches the latest balance, then instruct the agent (in its system prompt) when to call it (e.g., at start, before costly actions, or when nearing a threshold). Server tools are how agents call external APIs. If you want the agent to wait for the tool result, mark the tool “blocking” in the UI. (elevenlabs.io)

Where to fetch usage/balance

ElevenLabs exposes a “Get user subscription” endpoint that returns character_count/character_limit; wrap this behind your server tool (or your own backend) and let the agent call it. (elevenlabs.io)
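
A minimal sketch of that wrapper, assuming the subscription endpoint lives at /v1/user/subscription and authenticates with the xi-api-key header (check both against the API reference). The field names character_count and character_limit come from the answer above; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumed path for the "Get user subscription" endpoint.
SUBSCRIPTION_URL = "https://api.elevenlabs.io/v1/user/subscription"


def remaining_characters(subscription: dict) -> int:
    """Characters left in the current period, never negative."""
    return max(subscription["character_limit"] - subscription["character_count"], 0)


def fetch_subscription(api_key: str) -> dict:
    """Call the subscription endpoint; needs network access and a valid key."""
    request = urllib.request.Request(SUBSCRIPTION_URL, headers={"xi-api-key": api_key})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

A server tool handler would then return remaining_characters(fetch_subscription(key)) to the agent. Note that this counts characters on the ElevenLabs account, which may not be the same thing as your own per-user token budget.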

Ending gracefully when the budget is low

Enable the End Call system tool and tell the agent when to invoke it (e.g., “If remaining credits < X, summarize and end politely”). Dashboard-created agents include it by default; API-created ones must add it manually. (elevenlabs.io)
As a fallback, you can rely on max conversation duration timeouts (e.g., in WhatsApp channel the conversation ends on timeout if the End Call tool isn’t used). On April 1, 2026, a max_conversation_duration_message field was added so the agent can send a final message when that timeout hits. (elevenlabs.io)

Bottom line

Don’t “constantly poll” unless your use case needs it. Provide the balance once via dynamic variables/overrides and/or expose a balance-check tool the agent calls at logical checkpoints; then use End Call to wrap up naturally. (elevenlabs.io)

hollow forge Apr 12, 2026, 3:50 AM

so like every minute i will update the dynamic variable?

unique vectorBOT Apr 12, 2026, 3:50 AM

AI-generated Answer for zqilk

Short answer: you don’t need to update it every minute. ElevenAgents won’t “sense” your external token budget—you must provide it—but you should update only when the value actually changes or at key thresholds, not on a fixed 60‑second cadence. The docs recommend grouping contextual updates and using reasonable intervals to avoid spam. (elevenlabs.io)

Recommended pattern

Initialize: Pass your budget as a dynamic variable at conversation start via conversation_initiation_client_data.dynamic_variables. (elevenlabs.io)
Update during the call (structured): Return the latest budget from a server tool or client tool and map fields into dynamic variables via Assignments. Client tool responses can also be assigned to dynamic variables. This is the supported way to mutate dynamic variables mid‑conversation. (elevenlabs.io)
Update during the call (lightweight): If you don’t need the value inside prompts/tools/expressions, you can push a “contextual_update” event like “Remaining budget: 1200 tokens.” This does not set a dynamic variable; it just adds background context. (elevenlabs.io)
Drive a graceful ending: Reference your budget dynamic variable in workflow expressions to route to a closing node when the budget drops below a threshold. You can also use system variables (e.g., system__call_duration_secs) that auto‑update. (elevenlabs.io)
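
To make the initialize and lightweight-update steps concrete, here is a sketch of the two WebSocket messages as JSON strings. The type values and the dynamic_variables key follow the names used above; the "text" field on the contextual update and the remaining_tokens variable name are assumptions to verify against the client events documentation.

```python
import json


def initiation_message(remaining_tokens: int) -> str:
    """First client message: seeds the dynamic variable for this conversation."""
    return json.dumps({
        "type": "conversation_initiation_client_data",
        "dynamic_variables": {"remaining_tokens": remaining_tokens},
    })


def contextual_update_message(remaining_tokens: int) -> str:
    """Background context only; this does not change any dynamic variable."""
    return json.dumps({
        "type": "contextual_update",
        # Assumed field name for the free-text payload.
        "text": f"Remaining budget: {remaining_tokens} tokens.",
    })


if __name__ == "__main__":
    print(initiation_message(5000))
    print(contextual_update_message(1200))
```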

Cadence guidance

No required polling rate. Send an update when the budget changes or on coarse checkpoints; avoid minute‑by‑minute polling unless your meter truly changes that often. Docs call out grouping contextual updates and using reasonable intervals to limit traffic. (elevenlabs.io)
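
One way to get "coarse checkpoints" instead of a fixed timer is to notify the agent only when the budget crosses a threshold. The class below is a hypothetical sketch of that logic and is not part of any ElevenLabs SDK.

```python
class BudgetNotifier:
    """Decides when a budget change is worth telling the agent about."""

    def __init__(self, checkpoints):
        # Checkpoints are remaining-token levels, e.g. [1000, 500, 100].
        self._pending = sorted(checkpoints, reverse=True)

    def should_notify(self, remaining_tokens: int) -> bool:
        """Return True when the budget has dropped to or below a new checkpoint."""
        crossed = False
        # Consume every checkpoint passed since the last call, so a big drop
        # produces one notification rather than several.
        while self._pending and remaining_tokens <= self._pending[0]:
            self._pending.pop(0)
            crossed = True
        return crossed
```

Call should_notify each time your meter changes and send an update (tool result or contextual update) only when it returns True.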

hollow forge Apr 12, 2026, 3:55 AM

when u start a session, does it return the session id?

unique vectorBOT Apr 12, 2026, 3:55 AM

AI-generated Answer for zqilk

Yes—on session start you get an identifier, but the field name depends on the API:

Realtime Speech-to-Text (WebSocket): the server immediately sends a session_started message that includes session_id. (elevenlabs.io)
ElevenAgents conversations: the server’s first client event includes conversation_initiation_metadata with conversation_id; SDKs also expose it (e.g., Kotlin onConnect callback and session.getId()). (elevenlabs.io)

If you tell me which stack you’re using (Agents WebSocket/WebRTC vs STT Realtime), I can show the exact payload and where to read the ID.

Sources:
Realtime | ElevenLabs Documentation
Client events | ElevenLabs Documentation

hollow forge Apr 12, 2026, 3:59 AM

is the session.getId() the same as the conversation id used to get the conversation details?

unique vectorBOT Apr 12, 2026, 3:59 AM

AI-generated Answer for zqilk

Yes. In the JS SDK, conversation.getId() returns the conversation_id for the active session, and that is the same ID you pass to the Conversations API (GET /v1/convai/conversations/:conversation_id) to fetch conversation details. (elevenlabs.io)

For reference, the conversation_id is also emitted by the WebSocket initiation metadata right after a session starts. (elevenlabs.io)
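
If you are on the raw WebSocket rather than an SDK, a sketch like the following pulls the ID out of the initiation metadata and builds the details URL. The nested key conversation_initiation_metadata_event is an assumption about the event shape; the endpoint path is the one quoted above, and both helper names are hypothetical.

```python
import json
from typing import Optional

CONVERSATIONS_URL = "https://api.elevenlabs.io/v1/convai/conversations"


def conversation_id_from_event(raw_message: str) -> Optional[str]:
    """Return the conversation_id if this is the initiation metadata event."""
    event = json.loads(raw_message)
    if event.get("type") != "conversation_initiation_metadata":
        return None
    # Assumed nesting of the metadata payload.
    metadata = event.get("conversation_initiation_metadata_event", {})
    return metadata.get("conversation_id")


def details_url(conversation_id: str) -> str:
    """URL for GET /v1/convai/conversations/:conversation_id."""
    return f"{CONVERSATIONS_URL}/{conversation_id}"
```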

Sources:
JavaScript SDK | ElevenLabs Documentation
Agent WebSockets | ElevenLabs Documentation

hollow forge Apr 12, 2026, 4:07 AM

can u prompt the agent to speak more naturally? or is it some configuration on the platform?

unique vectorBOT Apr 12, 2026, 4:07 AM

AI-generated Answer for zqilk

Short answer: both. You can steer “naturalness” with prompting and with agent/voice configuration on the platform.

What to use

System prompt: Describe tone and delivery (e.g., calm/reassuring when the user sounds frustrated). Prompting controls behavior/style, while mechanics like turn‑taking are handled by platform settings. (elevenlabs.io)
Expressive Mode: In your agent, select the “V3 Conversational” TTS model to enable Expressive Mode; then guide emotional delivery via natural‑language instructions in the system prompt. You can also tune turn‑taking with the turn eagerness setting. (elevenlabs.io)
Conversation flow settings: Configure interruptions, soft/turn timeouts, and turn eagerness to make responses feel better‑timed and less robotic. (elevenlabs.io)
Voice settings: Adjust Stability, Similarity, and Speed in the Agent Voice settings; these directly affect perceived naturalness (more dynamic vs. more consistent). (elevenlabs.io)
Pronunciation dictionary and voice features: For names/terms, add dictionary entries; also control speed and other voice options per agent. (elevenlabs.io)
Per‑conversation overrides: If needed, override the system prompt, first message, LLM, or TTS settings at conversation start (or prefer Dynamic Variables for ongoing personalization). (elevenlabs.io)

In practice: write clear tone guidance in the system prompt, enable V3 Conversational (Expressive Mode), then fine‑tune conversation‑flow and voice settings until the cadence feels right. (elevenlabs.io)

hollow forge Apr 12, 2026, 5:25 AM

how to get the duration of the session

unique vectorBOT Apr 12, 2026, 5:25 AM

AI-generated Answer for zqilk

Short answer: use the Conversations API (or the agent’s system dynamic variables).

Inside the agent (live during a call): reference the system dynamic variable system__call_duration_secs. System variables are auto-updated throughout the conversation. (elevenlabs.io)
From your backend for a single session: GET /v1/convai/conversations/:conversation_id and read metadata.call_duration_secs. (elevenlabs.io)
For many sessions at once (and filtering): GET /v1/convai/conversations; each item includes call_duration_secs, and you can filter with call_duration_min_secs / call_duration_max_secs. (elevenlabs.io)

Notes: ElevenLabs uses the term “conversation” in the API; this corresponds to what you’re calling a “session.” (elevenlabs.io)
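
For the two backend options, a small sketch: one helper reads the duration out of a details response, the other builds the list URL with the duration filters. The field and query parameter names are the ones given above; the helper names are hypothetical, and the response is assumed to be already-parsed JSON.

```python
from urllib.parse import urlencode

LIST_URL = "https://api.elevenlabs.io/v1/convai/conversations"


def duration_from_details(details: dict) -> int:
    """Read metadata.call_duration_secs from a conversation details response."""
    return details["metadata"]["call_duration_secs"]


def list_url(min_secs=None, max_secs=None) -> str:
    """Build the list-conversations URL with optional duration filters."""
    params = {}
    if min_secs is not None:
        params["call_duration_min_secs"] = min_secs
    if max_secs is not None:
        params["call_duration_max_secs"] = max_secs
    return f"{LIST_URL}?{urlencode(params)}" if params else LIST_URL
```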

Sources:
Dynamic variables | ElevenLabs Documentation
Get conversation details | ElevenLabs Documentation
List conversations | ElevenLabs Documentation

hollow forge Apr 12, 2026, 6:20 AM

how could i configure a tool using the api, like adding a client tool

unique vectorBOT Apr 12, 2026, 6:20 AM

AI-generated Answer for zqilk

Here’s the minimal, API-only flow to add a client tool and wire it to an agent:

Create a client tool

POST /v1/convai/tools with tool_config.type="client". Set name/description; set expects_response true if the agent should pause and wait for your client handler’s return value.
Example:
curl -X POST https://api.elevenlabs.io/v1/convai/tools \
  -H "Content-Type: application/json" \
  -H "xi-api-key: YOUR_API_KEY" \
  -d '{
    "tool_config": {
      "type": "client",
      "name": "logMessage",
      "description": "Log a message on the client",
      "expects_response": false,
      "parameters": {
        "type": "object",
        "properties": { "message": { "type": "string", "description": "Text to log" } },
        "required": ["message"]
      }
    }
  }'
This returns an id you’ll reference when attaching the tool to the agent.

Register the client tool handler in your app
Client tools must be registered in code so the runtime can execute them when the agent calls the tool. Example (Python):
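from elevenlabs import ElevenLabs
from elevenlabs.conversational_ai.conversation import Conversation, ClientTools


def log_message(parameters):
    # The handler receives the tool-call parameters as a dict.
    print(parameters.get("message"))


client_tools = ClientTools()
# The registered name must match the tool name configured on the agent.
client_tools.register("logMessage", log_message)

conversation = Conversation(
    client=ElevenLabs(api_key="YOUR_API_KEY"),
    agent_id="your-agent-id",
    client_tools=client_tools,
    # Other constructor arguments (e.g. an audio interface) may be required; omitted here.
)
conversation.start_session()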
Names are case-sensitive and must match the tool’s name and parameter schema you created. If you want the agent to wait for the handler’s output, set expects_response true in the tool config (“Wait for response” in UI terms).
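
For the expects_response case, the handler's return value is what matters. Here is a sketch of a handler that reports the remaining budget; BudgetMeter, the tool name checkRemainingBudget, and the returned string format are all hypothetical, and the registration line is left as a comment so the snippet runs without the SDK.

```python
class BudgetMeter:
    """Tracks token usage on the client side; purely illustrative."""

    def __init__(self, token_limit: int) -> None:
        self.token_limit = token_limit
        self.tokens_used = 0

    def record(self, tokens: int) -> None:
        """Add tokens consumed by the latest turn."""
        self.tokens_used += tokens

    def check_remaining_budget(self, parameters: dict) -> str:
        """Client tool handler: the return value is what the agent receives."""
        remaining = max(self.token_limit - self.tokens_used, 0)
        return f"remaining_tokens={remaining}"


meter = BudgetMeter(token_limit=5000)
meter.record(1200)
# With the ClientTools object from the example above, registration would look like:
# client_tools.register("checkRemainingBudget", meter.check_remaining_budget)
```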

Attach the tool to your agent
Update the agent to include the tool’s ID under conversation_config.agent.prompt.tool_ids. Do not send the legacy prompt.tools array.

Example:
curl -X PATCH https://api.elevenlabs.io/v1/convai/agents/AGENT_ID \
  -H "Content-Type: application/json" \
  -H "xi-api-key: YOUR_API_KEY" \
  -d '{
    "conversation_config": {
      "agent": {
        "prompt": {
          "tool_ids": ["TOOL_ID"]
        }
      }
    }
  }'
System tools (e.g., end_call) are referenced separately via prompt.built_in_tools, not tool_ids.

Verify or manage tools

List tools to find IDs or confirm creation: GET /v1/convai/tools.
Fetch a single tool: GET /v1/convai/tools/:tool_id.
Update or delete if needed: PATCH/DELETE /v1/convai/tools/:tool_id.

Notes

expects_response true makes the agent wait and injects your handler’s return value into conversation context.
After July 23, 2025, you must use prompt.tool_ids; mixing both prompt.tools and prompt.tool_ids in a single request causes an error.
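
When attaching a tool through prompt.tool_ids, it is worth thinking about tools the agent already has. A cautious sketch, assuming the PATCH replaces the tool_ids list rather than appending to it (not confirmed here): read the agent's current config first and send the merged list. The agent dict is assumed to mirror the conversation_config.agent.prompt.tool_ids path named above, and tool_ids_patch is a hypothetical helper.

```python
def tool_ids_patch(agent: dict, new_tool_id: str) -> dict:
    """Build a PATCH body that keeps existing tool IDs and adds a new one."""
    prompt = agent.get("conversation_config", {}).get("agent", {}).get("prompt", {})
    # Copy the list so the caller's agent dict is not mutated.
    merged = list(prompt.get("tool_ids") or [])
    if new_tool_id not in merged:
        merged.append(new_tool_id)
    return {"conversation_config": {"agent": {"prompt": {"tool_ids": merged}}}}
```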

hollow forge Apr 12, 2026, 6:26 AM

what about for a certain agent, can u list the tools it has?
