OpenRouter | Friends of the Crustacean 🦞🤝 | Page 1

green spire · 2026-03-19T18:40:40.976Z

green spire Mar 19, 2026, 6:40 PM

#

i used Hunter Alpha for a week on OpenRouter. the dashboard says i used 115M tokens, which is impossible. i can't verify it with them because there's no support. any hints about this?

#

i wanted to show screenshots

scenic groveBOT Mar 19, 2026, 6:42 PM

#

115M/week is high, but sadly not impossible — usually it’s one of these:

Long context replay (client keeps sending full chat history each turn)
Reasoning-heavy responses (extra internal tokens)
Retries/background loops (same request sent many times)
Shared/leaked API key (another app/process using the same key)

Quick sanity check:
115M / 7 ≈ 16.4M tokens/day.
If each request is ~200k input tokens, that’s only ~80 requests/day — this can happen faster than people expect.
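
A minimal sketch of the sanity check above. The 115M total and 7-day window come from the thread; the 200k tokens per request is the illustrative figure from the message, not a measured value, and `daily_load` is a hypothetical helper name.

```python
def daily_load(total_tokens: int, days: int, tokens_per_request: int) -> tuple[float, float]:
    """Return (tokens per day, requests per day) needed to reach a total."""
    tokens_per_day = total_tokens / days
    requests_per_day = tokens_per_day / tokens_per_request
    return tokens_per_day, requests_per_day


tokens_per_day, requests_per_day = daily_load(115_000_000, 7, 200_000)
# prints "16.4M tokens/day, about 82 requests/day"
print(f"{tokens_per_day / 1e6:.1f}M tokens/day, about {requests_per_day:.0f} requests/day")
```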

If you’re using OpenClaw, do this to verify where it’s coming from:

/usage full (per-response token footer)
openclaw status --usage (provider snapshot)
and send openclaw --version here, I can help you narrow it down.

Docs for token accounting:

#

Perfect — please share them here and I’ll help you audit it.

Best screenshots to post (redact API keys):

OpenRouter Usage page (date range = last 7 days)
Per-model breakdown (if available)
Request/activity logs showing largest calls
Your client settings for max tokens / reasoning / retries (if you have them)

Once I see those, I can usually tell within a minute whether it’s context replay, retries, or key leakage.

green spire Mar 19, 2026, 6:44 PM

#

gonna send you everything. just note that i say it's impossible because i don't even use it that much: 2 hrs a day on average over 6-7 days..

scenic groveBOT Mar 19, 2026, 6:44 PM

#

Yeah, with ~2h/day that number does sound off for “normal” usage.
Usually when it jumps like that, it’s either:

a hidden retry/runaway loop, or
another app using the same API key.

Do this now before sharing screenshots:

Rotate the OpenRouter key
Create a new key only for this one app/client
Set a hard spend limit

Then send whatever screenshots you have and I’ll help pinpoint it.
If you can, include:

usage graph by day
biggest request rows
model breakdown
your client’s retry/max-token settings

#

We’ll figure out where those tokens went.

green spire Mar 19, 2026, 6:46 PM

#

I can't send a screenshot because there is no button

scenic groveBOT Mar 19, 2026, 6:47 PM

#

No worries — text is enough 👍

Just paste these values here (no screenshot needed):

Date range
Total requests
Total input tokens
Total output tokens
Top 3 biggest requests (tokens each)
Any retry/error count (429/5xx if shown)
Whether this key was used in more than one app

If easier, copy this and fill it:

Range:
Requests:
Input tokens:
Output tokens:
Largest req #1:
Largest req #2:
Largest req #3:
429/5xx:
Other apps using same key: yes/no

And if you haven’t yet, rotate the key first so usage can’t keep climbing.

green spire Mar 19, 2026, 7:02 PM

#

no other apps are using the key. it stopped working now because the model isn't available anymore, and it's just one model.

tokens:

Model          Min (tok)   Max (tok)   Avg (tok)   Sum (tok)
Hunter Alpha   909K        38.9M       14.4M       115M

requests:

Model          Min (req)   Max (req)   Avg (req)   Sum (req)
Hunter Alpha   6           556         236         1.88K

scenic groveBOT Mar 19, 2026, 7:03 PM

#

This is super useful — and it actually explains a lot.

Your numbers are internally consistent:

115M tokens / 1.88K requests ≈ 61K tokens per request
Peak bucket: 38.9M tokens with 556 requests (≈70K/request)

So this is probably not a dashboard math bug.
It looks like a request storm + large context per call (most likely retries/looping in the client, especially if the model started failing/unavailable).

Given your “~2h/day” usage, 1.88K requests is way too high for manual chatting.
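
A small sketch of the per-request averages quoted above, computed from the dashboard figures as posted. `parse_count` and `tokens_per_request` are hypothetical helpers, and pairing the 38.9M token maximum with the 556 request maximum assumes both come from the same time bucket, which the dashboard export does not confirm.

```python
def parse_count(text: str) -> float:
    """Convert a dashboard-style count such as '1.88K' or '115M' to a number."""
    suffixes = {"K": 1e3, "M": 1e6, "B": 1e9}
    text = text.strip().upper()
    if text and text[-1] in suffixes:
        return float(text[:-1]) * suffixes[text[-1]]
    return float(text)


def tokens_per_request(tokens: str, requests: str) -> float:
    """Average tokens per request for a pair of dashboard figures."""
    return parse_count(tokens) / parse_count(requests)


print(round(tokens_per_request("115M", "1.88K")))  # overall average, about 61K
print(round(tokens_per_request("38.9M", "556")))   # peak bucket, about 70K
```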

What to do now

Keep using a fresh rotated key
In your client, set:

max retries = 0 or 1
disable auto-continue/background agents

lower max context / clear chat history often

Do a 10-minute test with the new key:

if request count increases while idle → definite background loop

If you tell me which client/app you used, I can point to the exact settings to stop this.

green spire Mar 19, 2026, 7:10 PM

#

so i'm topping up $10 on the api from xiaomi themselves. the prices are:

MiMo-V2-Pro
Model name: mimo-v2-pro
Category: Text Generation - General-purpose large language models
Context length: 1M
Max output: 128K
Features: Text Generation, Deep Thinking, Stream Output, Function Call, JSON Output, Web Search
Price:
  0 < tokens ≤ 256K: Input $1.00 / 1M tokens, Cached Input $0.20 / 1M tokens, Output $3.00 / 1M tokens
  256K < tokens ≤ 1M: Input $2.00 / 1M tokens, Cached Input $0.40 / 1M tokens, Output $6.00 / 1M tokens
  Note: Cache writing is free for a limited time.
Rate limit: RPM 100, TPM 10M

i am going to just use the api for openclaw. i won't use it that much considering the time. there's a heartbeat every 30m where i don't think anything needs to happen, i mean no tool calls.. let's see in a few hrs whether it's coherent
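
A hedged sketch of how the tiered prices above might translate into a per-request cost. The prices are copied from the list above. The assumption that the tier is picked by the request's total input tokens is mine, not something the price list states, and `Tier` and `request_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    max_context: int   # upper bound of the tier, in tokens
    input_usd: float   # USD per 1M uncached input tokens
    cached_usd: float  # USD per 1M cached input tokens
    output_usd: float  # USD per 1M output tokens


# Prices copied from the list above.
TIERS = [
    Tier(256_000, 1.00, 0.20, 3.00),
    Tier(1_000_000, 2.00, 0.40, 6.00),
]


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: the tier is selected by the request's total input tokens.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    for tier in TIERS:
        if input_tokens <= tier.max_context:
            uncached = input_tokens - cached_tokens
            total = (
                uncached * tier.input_usd
                + cached_tokens * tier.cached_usd
                + output_tokens * tier.output_usd
            )
            return total / 1_000_000
    raise ValueError("input exceeds the 1M context length")


# A 40k-token context with a 2k-token reply, uncached vs fully cached.
print(f"${request_cost(40_000, 2_000):.3f}")
print(f"${request_cost(40_000, 2_000, cached_tokens=40_000):.3f}")
```

Under these assumptions the same request costs roughly a third as much when the whole context is served from cache, which is why cache hit rate matters as much as raw token count.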

scenic groveBOT Mar 19, 2026, 7:11 PM

#

Yep — good plan, and one important thing:

Heartbeat still runs a full model turn even if no tool is called.
So it can burn tokens quietly.

If you want usage to stay low in OpenClaw, set heartbeat like this:

{
  agents: {
    defaults: {
      heartbeat: {
        every: "2h", // or "0m" to disable
        target: "none", // default, no outbound message
        isolatedSession: true, // big token saver
        lightContext: true // only HEARTBEAT.md bootstrap
      }
    }
  }
}

Also keep HEARTBEAT.md tiny.

Docs:

If you want, paste your heartbeat block and I’ll sanity-check it before you run.
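
A rough sketch of why the heartbeat interval matters, since each heartbeat is a full model turn. The interval strings follow the `every` examples above ("30m", "2h", "0m"); the parsing rules and the tokens-per-turn figures are illustrative assumptions, not OpenClaw's actual behavior or measured values.

```python
import re


def interval_minutes(every: str) -> int:
    """Parse strings like '30m' or '2h' into minutes; 0 means disabled."""
    match = re.fullmatch(r"(\d+)([mh])", every)
    if match is None:
        raise ValueError(f"unsupported interval: {every!r}")
    value, unit = int(match.group(1)), match.group(2)
    return value * 60 if unit == "h" else value


def heartbeat_tokens_per_day(every: str, tokens_per_turn: int) -> int:
    """Tokens spent per day if every heartbeat costs tokens_per_turn."""
    minutes = interval_minutes(every)
    if minutes == 0:
        return 0
    turns_per_day = (24 * 60) // minutes
    return turns_per_day * tokens_per_turn


# Hypothetical per-turn sizes: a heavy shared session vs a light isolated one.
print(heartbeat_tokens_per_day("30m", 40_000))  # 1920000
print(heartbeat_tokens_per_day("2h", 2_000))    # 24000
```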

green spire Mar 20, 2026, 12:39 AM

#

These messages alone cost me 1.1M tokens. That's exactly 13 messages (7-6), it makes no sense like this

scenic groveBOT Mar 20, 2026, 12:41 AM

#

You're right, it's frustrating, but the numbers you're seeing are plausible if the client resends the whole context on every turn.

1.1M / 13 messages ≈ 85k tokens per message
That happens when you add up:

a large system prompt
the full thread history
any tool calls/results
long reasoning/output

So it's not “13 simple messages”, it's 13 times a much larger payload.
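
A minimal sketch of the replay effect described here: when every turn resends a fixed base prompt plus all earlier history, total input tokens grow roughly quadratically with the number of turns. The base size and per-turn growth below are illustrative assumptions, and `replay_input_tokens` is a hypothetical helper.

```python
def replay_input_tokens(turns: int, base_tokens: int, tokens_added_per_turn: int) -> int:
    """Total input tokens when each turn resends the base prompt plus all prior history."""
    total = 0
    history = 0
    for _ in range(turns):
        total += base_tokens + history    # this turn's full input payload
        history += tokens_added_per_turn  # the thread grows after every turn
    return total


# 13 turns, 10k base prompt, thread growing by 5k tokens per turn.
print(replay_input_tokens(13, 10_000, 5_000))  # 520000
```

In this example only 65k tokens of new text enter the thread (13 × 5k), yet 520k input tokens are sent in total.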

To verify it right away:

/context detail (shows what actually takes up space)
/usage tokens or /usage full
/status

To cut costs right away:

start a new session more often
use /compact periodically
lower max output/reasoning
if you use heartbeat: isolatedSession: true + lightContext: true (or turn it off)

If you like, paste the output of /context detail and I'll tell you exactly where the tokens are going.

green spire Mar 20, 2026, 12:47 AM

#

But if they're plausible, that means the cost isn't sustainable for anyone. And it's open source

scenic groveBOT Mar 20, 2026, 12:48 AM

#

You're absolutely right in principle.
If it stays as it is, it isn't sustainable for “normal” use.

The important part is this: OpenClaw's defaults are geared toward capability, not minimum cost.
With long threads + 1M context + heartbeat, tokens can explode.

So it needs to be put into “savings mode”:

Heartbeat off (or much less frequent)
Shorter sessions (/new) and /compact
Reduce bootstrap/context

Example low-cost config (valid):

{
  agents: {
    defaults: {
      bootstrapMaxChars: 6000,
      bootstrapTotalMaxChars: 30000,
      heartbeat: {
        every: "0m" // disabled
      }
    }
  }
}

If you want to keep heartbeat:

heartbeat: { every: "2h", isolatedSession: true, lightContext: true, target: "none" }

Docs (OpenClaw): Token Use and Costs, Heartbeat, Context

If you paste your agents.defaults block, I'll tune it for you to keep costs reasonable.

green spire Mar 20, 2026, 1:02 AM

#

Bootstrap limits
• max/file: 20,000 chars
• max/total: 150,000 chars

Sandbox
• mode=off
• sandboxed=false

⸻

System prompt
• Run: 21,671 chars (~5,418 tok)
• Project Context: 8,138 chars (~2,035 tok)

⸻

Injected workspace files
• AGENTS.md → 771 chars (~193 tok)
• SOUL.md → 1,717 chars (~430 tok)
• TOOLS.md → 700 chars (~175 tok)
• IDENTITY.md → 654 chars (~164 tok)
• USER.md → 821 chars (~206 tok)
• HEARTBEAT.md → 686 chars (~172 tok)
• BOOTSTRAP.md → MISSING
• MEMORY.md → 2,047 chars (~512 tok)

⸻

Skills
• Total: 3,023 chars (~756 tok)
• Count: 6

List:
• gemini
• healthcheck
• node-connect
• skill-creator
• video-frames
• weather

Top skills (by size):
• skill-creator → 756 chars (~189 tok)
• node-connect → 538 chars (~135 tok)
• healthcheck → 488 chars (~122 tok)
• weather → 413 chars (~104 tok)
• video-frames → 226 chars (~57 tok)
• gemini → 218 chars (~55 tok)

#

🧠 Context breakdown (Part 2)

Tools

Tool list (prompt text):
• 1,993 chars (~499 tok)

Tool schemas (JSON):
• 15,974 chars (~3,994 tok)

⸻

Available tools

read, edit, write, exec, process, browser, canvas, message, tts,
agents_list, sessions_list, sessions_history, sessions_send,
sessions_yield, sessions_spawn, subagents, session_status,
web_search, web_fetch, memory_search, memory_get

⸻

Top tools (schema size)
• message → 5,025 chars (~1,257 tok)
• browser → 2,799 chars (~700 tok)
• sessions_spawn → 1,179 chars (~295 tok)
• exec → 1,086 chars (~272 tok)
• web_search → 1,084 chars (~271 tok)
• process → 961 chars (~241 tok)
• canvas → 661 chars (~166 tok)
• edit → 591 chars (~148 tok)
• read → 392 chars (~98 tok)
• web_fetch → 374 chars (~94 tok)
• write → 313 chars (~79 tok)

⸻

Top tools (summary size)
• browser → 1,683 chars (~421 tok)
• memory_search → 334 chars (~84 tok)
• read → 298 chars (~75 tok)
• session_status → 207 chars (~52 tok)
• sessions_spawn → 198 chars (~50 tok)
• exec → 181 chars (~46 tok)

⸻

Tools (parameter count)
• message → 94
• browser → 48
• canvas → 18
• sessions_spawn → 17
• exec → 12
• process → 12
• web_search → 9
• edit → 6
• sessions_send → 5
• read → 4

⸻

Session
• Cached tokens: 39,763
• Context max: 200,000

⸻

Inline shortcut

Inline command inside the text (e.g. hey /status)
• executed immediately
• only for allowlisted senders
• removed before the model processes the rest

scenic groveBOT Mar 20, 2026, 1:04 AM

#

Great, this data clears up a lot.
Bootstrap isn't the main culprit here.

From your dump:

base system + files + skills ≈ 10k tokens/turn (order of magnitude)
over 13 messages that would be ~130k tokens
but you spent 1.1M ⇒ ~970k left to explain

So most of it almost certainly comes from:

chat history growing with every turn, and/or
the model's deep thinking/reasoning (tokens often not visible in the final text), and/or
overly long outputs.
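
The baseline-versus-remainder arithmetic in this message can be sketched as follows. The ~10k tokens per turn is the order-of-magnitude estimate given in the message, not an exact sum of the dump, and `unexplained_tokens` is a hypothetical helper.

```python
def unexplained_tokens(total_spent: int, turns: int, baseline_per_turn: int) -> tuple[int, int]:
    """Return (tokens explained by the fixed baseline, tokens left to explain)."""
    baseline = turns * baseline_per_turn
    return baseline, total_spent - baseline


baseline, remainder = unexplained_tokens(1_100_000, 13, 10_000)
# prints "baseline 130000, unexplained 970000 (88% of the total)"
print(f"baseline {baseline}, unexplained {remainder} ({remainder / 1_100_000:.0%} of the total)")
```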

What to do right away

Use /usage full for 2-3 messages and compare input vs output per turn
Use /new more often (separate topics)
Use /compact before the thread gets long
In the client/model, lower or disable deep thinking and limit max output tokens

If you paste 2 real lines of /usage full (input/output), I'll tell you exactly which of the 3 is weighing most.

#

There, this closes the loop 👇

The key figure is:

Session cached tokens: 39,763

It means every turn already starts with ~40k input tokens (even if you type a single line).

So:

13 messages × ~40k input ≈ 520k input
add output/deep thinking (e.g. 20–50k per turn) and you easily reach ~1.1M total

So yes: it's “plausible”, and the problem is mostly accumulated context + reasoning output, not bootstrap/tools alone.

Immediate action

run /compact now
or better, /new when you change topic
keep deep thinking lower/off
limit max output
heartbeat only with isolatedSession: true + lightContext: true (or off)

If you like, send me just one line of /usage full (input/output) in your next message and I'll tell you the exact setting to stay within budget.

green spire Mar 20, 2026, 1:11 AM

#

What!? 💸 Usage cost
Session $0.0000 · 868k tokens
Today $0.0000
Last 30d $0.0000

scenic groveBOT Mar 20, 2026, 1:11 AM

#

Yep — that can happen, and it’s confusing 😅

$0.0000 there is OpenClaw’s local estimate, not the provider invoice.
It shows zero when:

pricing for that model/provider isn’t configured in OpenClaw, or
auth mode hides dollar estimates (OAuth flows), or
provider pricing is dynamic and OpenClaw only has token counts.

So this is possible:

868k tokens (real)
$0.0000 (local estimate unavailable)

For actual billing, trust the provider dashboard.
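
A hypothetical sketch of the first case listed above: a local estimator that knows token counts but has no price entry for the model, so the dollar figure falls back to zero. This is not OpenClaw's actual implementation; the price table, the model names, and the 800k/68k input/output split are all invented for illustration.

```python
from typing import Optional

# Hypothetical local price table, USD per 1M tokens. Models missing here get no estimate.
PRICES = {"example-model": {"input": 1.00, "output": 3.00}}


def local_estimate(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return an estimated USD cost, or None when no pricing is configured."""
    price = PRICES.get(model)
    if price is None:
        return None
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def format_usage(model: str, input_tokens: int, output_tokens: int) -> str:
    """Render a usage line; a missing estimate is displayed as $0.0000."""
    cost = local_estimate(model, input_tokens, output_tokens)
    shown = 0.0 if cost is None else cost
    total_k = (input_tokens + output_tokens) / 1000
    return f"${shown:.4f} · {total_k:.0f}k tokens"


print(format_usage("unlisted-model", 800_000, 68_000))  # $0.0000 · 868k tokens
print(format_usage("example-model", 800_000, 68_000))   # $1.0040 · 868k tokens
```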
Docs: Token Use and Costs (OpenClaw), https://docs.openclaw.ai/reference/token-use

green spire Mar 20, 2026, 1:26 AM

#

33 requests, 1.2M tokens spent. That's crazy wrong

scenic groveBOT Mar 20, 2026, 1:27 AM

#

It feels wrong, but with your own stats it lands in the same range:

You had ~39,763 cached context tokens per request
39,763 × 33 requests = 1,312,179 input tokens as a rough baseline (an upper bound if earlier requests carried less context)

So ~1.2M for 33 requests is unfortunately expected, even before big outputs.

So the fix is not billing support — it’s reducing per-request context:

/new now (hard reset context)
/compact regularly
shorter threads per topic
lower reasoning/deep-thinking + max output
heartbeat off, or isolated/light

Your token burn is context inflation, not random overcharging.
