#HERMES_STREAM_STALE_TIMEOUT kills long responses: agent goes silent, user waits forever


potent heath
#

When an agent accumulates many tool calls in a single response (e.g., reading 18 files from a GitHub repo, browsing multiple web pages), the model's final text generation can take several minutes. During this phase, no new stream chunks are sent, and after 180 seconds of inactivity, HERMES_STREAM_STALE_TIMEOUT (default, hardcoded in run_agent.py line 4430) kills the connection.

The user sees no progress update — the agent appears frozen. On Telegram, the message simply never arrives.
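A minimal sketch of how an inactivity ("stale") check like the one described above might behave. This is not the code in run_agent.py; `StaleWatchdog`, `StaleStreamError`, and the injectable `clock` are hypothetical names used only for illustration.

```python
import time


class StaleStreamError(Exception):
    """Raised when no stream chunk has arrived within the stale timeout."""


class StaleWatchdog:
    """Tracks the time since the last chunk (hypothetical helper, not Hermes code)."""

    def __init__(self, stale_timeout, clock=time.monotonic):
        self.stale_timeout = stale_timeout
        self.clock = clock
        self.last_chunk_at = clock()

    def on_chunk(self):
        # Any received chunk resets the inactivity window.
        self.last_chunk_at = self.clock()

    def check(self):
        # Return the idle time, or raise once it exceeds the limit.
        idle = self.clock() - self.last_chunk_at
        if idle > self.stale_timeout:
            raise StaleStreamError(
                f"no chunks for {idle:.0f}s (limit {self.stale_timeout}s)"
            )
        return idle
```

Under this model, a final generation phase that emits nothing for longer than the limit is indistinguishable from a dead connection, which would explain the silent kill.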

Current env vars:

  • HERMES_STREAM_READ_TIMEOUT=350 (configurable via .env ✅)
  • HERMES_STREAM_STALE_TIMEOUT=180 (default, NOT configurable via config.yaml ❌, only env var)
  • HERMES_API_TIMEOUT=1800 (default)
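For illustration, a sketch of how timeout values like these could be read from the environment with a fallback default. `env_seconds` is a hypothetical helper, not a Hermes function, and how Hermes actually parses these variables is an assumption here.

```python
import os


def env_seconds(name, default, env=None):
    """Read a timeout in seconds from the environment (hypothetical helper).

    Falls back to `default` when the variable is unset, empty,
    non-numeric, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    if not raw:
        return float(default)
    try:
        value = float(raw)
    except ValueError:
        return float(default)
    return value if value > 0 else float(default)


# Defaults taken from the values listed above.
stale_timeout = env_seconds("HERMES_STREAM_STALE_TIMEOUT", 180)
api_timeout = env_seconds("HERMES_API_TIMEOUT", 1800)
```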

Problems:

  1. HERMES_STREAM_STALE_TIMEOUT is not exposed in config.yaml or hermes setup — users don't know it exists
  2. Default of 180s is too aggressive for complex tasks (reading many files, large codebases, multi-step research)
  3. No heartbeat/progress mechanism during text generation — the agent can only report progress via tool calls, not during the final generation phase
  4. .env file is protected by the security scanner, so even the agent can't modify it without terminal workarounds

Suggested fixes:

  • Expose HERMES_STREAM_STALE_TIMEOUT in config.yaml under a sensible section
  • Increase default to 300-600s
  • Add a gateway-level heartbeat that sends a "typing..." or progress indicator to messaging platforms during long generations (like Telegram's "typing" action)

Affected: gateway/Telegram streaming, any platform where the user can't see partial responses. Less impactful on CLI where the user sees the spinner.
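A rough sketch of the heartbeat idea, assuming the gateway is asyncio-based. `with_heartbeat` and `send_typing` are hypothetical names; on Telegram, `send_typing` would wrap the Bot API's `sendChatAction` with the `typing` action, whose status lasts about 5 seconds, hence the 4-second interval assumed here.

```python
import asyncio
import contextlib


async def with_heartbeat(work, send_typing, interval=4.0):
    """Await `work` while calling `send_typing()` every `interval` seconds.

    `work` is any awaitable (e.g. the long generation call);
    `send_typing` is an async callable supplied by the platform adapter.
    """

    async def beat():
        while True:
            await send_typing()
            await asyncio.sleep(interval)

    beater = asyncio.create_task(beat())
    try:
        return await work
    finally:
        # Stop the heartbeat as soon as the work finishes or fails.
        beater.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await beater
```

This only keeps the user informed; it would not by itself stop a stale timeout from firing, so it complements rather than replaces a higher or configurable limit.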

nimble swallow
potent heath
#

Because we previously fixed the exact same symptom (tasks getting killed with no response during generation) by increasing HERMES_STREAM_READ_TIMEOUT from 60s to 350s. That fixed it for simple tasks. Complex tasks with many tool calls still hit the same wall, so it's probably the same family of timeout causing the kill, just a different one (likely HERMES_STREAM_STALE_TIMEOUT at 180s, which isn't exposed in config.yaml).

#

My current configuration is:

HERMES_API_TIMEOUT=1800s

and that solved the problem.

#

But this only solves the case where the task is terminated when there's no response within 60 seconds. The message was:

⏳ Retrying in 2s (attempt 1/3)...

⏳ Retrying in 4s (attempt 2/3)...

⚠️ Max retries (3) exhausted — trying fallback...
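The messages above look like exponential backoff followed by a fallback. A minimal sketch of that pattern, assuming a 2-second base delay that doubles per attempt; `call_with_retries`, `primary`, and `fallback` are hypothetical names, not Hermes internals.

```python
import time


def call_with_retries(primary, fallback, max_retries=3, base_delay=2.0,
                      sleep=time.sleep, log=print):
    """Try `primary` up to `max_retries` times, then use `fallback`."""
    for attempt in range(1, max_retries + 1):
        try:
            return primary()
        except Exception:
            if attempt == max_retries:
                break
            # Delay doubles each time: 2s, 4s, ...
            delay = base_delay * 2 ** (attempt - 1)
            log(f"Retrying in {delay:.0f}s (attempt {attempt}/{max_retries})...")
            sleep(delay)
    log(f"Max retries ({max_retries}) exhausted, trying fallback...")
    return fallback()
```

If every attempt hits the same timeout, the retries only add delay before the fallback, which matches the long silence seen on the user's side.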

nimble swallow
#

when was the last time you updated?

potent heath
nimble swallow
#

run hermes update

#

we pushed some fixes yesterday for this specific issue

potent heath
#

Ty sr

#

I'll do it

potent heath
#

Hermes version 0.7.0 4/3?

gritty scroll
nimble swallow
#

no, use the terminal and type hermes update in it

nimble swallow
potent heath
nimble swallow
#

no, it is not a version, it is the latest GitHub version

#

just type hermes update in your command line that runs hermes agent

potent heath
#

Ok

gritty scroll
#

And now
Error: Error code: 402 - {'error': {'message': 'This request requires more credits, or fewer max_tokens. You requested up to 65535 tokens, but can only afford 8029. To increase, visit

#

I'm using another model

nimble swallow
#

not that the ctx len is too high

gritty scroll
gritty scroll
nimble swallow
#

auto points to gemini flash on openrouter

#

it does compaction and vision

#

I'll see if I can fix so it uses openai model for these if you switch to codex

potent heath
#

i think already!

gritty scroll
#

@nimble swallow what happened lmao

#

Still openrouter provider

nimble swallow
gritty scroll
#

and here it still gives an error that I'm still using openrouter lmao

#

I also used the OpenAI Codex provider earlier, but the output still gave an error saying that OpenRouter had no balance.

#

Walawe I'm being chased by openrouter lmao

nimble swallow
#

There is a PR being merged that will stop that but right now you'd need to set your auxiliary models to your main

#

in the config.yaml

potent heath
#

@nimble swallow

#

This update broke several things haha

#

😅

#

⚕ Updating Hermes Agent...

→ Fetching updates...
→ Local changes detected — stashing before update...
Saved working directory and index state On main: hermes-update-autostash-20260406-184247
→ Found 7 new commit(s)
→ Pulling updates...

⚠ Local changes were stashed before updating.
Restoring them may reapply local customizations onto the updated codebase.
Review the result afterward if Hermes behaves unexpectedly.
Restore local changes now? [Y/n]
y
→ Restoring local changes...
⚠ Local changes were restored on top of the updated codebase.
Review git diff / git status if Hermes behaves unexpectedly.
✓ Cleared 13 stale pycache directories
→ Updating Python dependencies...
→ Updating Node.js dependencies...
✅ Browser tools ready. Run: python run_agent.py --help

✓ Code updated!

→ Syncing bundled skills...
↑ 1 updated: manim-video
~ 2 user-modified (kept)

→ Syncing bundled skills to other profiles...
clanky: ↑1 updated

→ Checking configuration for new options...

Would you like to configure them now? [Y/n]: y

Config version: 10 → 12

✓ Update complete!

✓ Restarted hermes-gateway

nimble swallow
#

You made some changes locally which may be breaking things however

gritty scroll
nimble swallow