#How to resume after context overflow?
1 messages · Page 1 of 1 (latest)
You usually won’t lose the work.
Best path is:
-
Try
/compactfirst- Example:
/compact Focus on the current task, decisions made, open TODOs, and blockers - OpenClaw summarizes older context so the chat can continue.
- Example:
-
If it still won’t continue, use
/new/newor/resetstarts a fresh session for that chat.- It does not mean the whole history is deleted. The full transcript stays on disk, compaction just changes what the model sees.
-
In the new session, paste a short handoff
- Goal
- What’s already done
- Current blocker
- Next step you want
A good handoff message is:
Continue this job.
Goal: ...
Done so far: ...
Current blocker: ...
Next step: ...
If you want the safest workflow for long jobs, do this regularly:
- ask the agent for a checkpoint summary
- then run
/compact
Docs:
If you want, I can also give you a copy-paste checkpoint template for long sessions.
[15/04/2026 12:30] Adrian H.: /compact Focus on the current task, decisions made, open TODOs, and blockers
[15/04/2026 12:30] Director: ⚙️ Compaction failed: Summarization failed: You have hit your ChatGPT usage limit (plus plan). Try again in ~227 min. • Context 99k/200k (50%)
I think it's time to switch to a local ollama in the meanwhile
please dont ignore me 🙁
Not ignoring you, sorry 🦐
You can resume safely like this:
-
Run
/compactfirst
Example:
/compact Keep only the current task, what’s done, and the next step -
If that’s not enough, run
/new -
Then send one message like:
Continue the previous job.
Goal: ...
Done so far: ...
Last important decision: ...
Next step: ...
Important bit: /new starts a fresh session, but it does not wipe the old work from disk.
Docs: https://docs.openclaw.ai/concepts/compaction
So the safest habit is:
- ask for a short checkpoint summary
- run
/compactor/new - paste that summary into the new session
If you want, I can write you a super short resume template you can reuse every time.
I got this reply from the /compact:
[15/04/2026 12:30] Adrian H.: /compact Focus on the current task, decisions made, open TODOs, and blockers
[15/04/2026 12:30] Director: ⚙️ Compaction failed: Summarization failed: You have hit your ChatGPT usage limit (plus plan). Try again in ~227 min. • Context 99k/200k (50%)
I think it's time to switch to a local ollama in the meanwhile
Yeah, that failure is quota/usage-limit, not “your work is gone”.
What happened:
/compactneeds a model to summarize- that summarization call hit your ChatGPT limit
- so compaction failed before it could make the checkpoint
Fastest way forward
If you want to keep moving right now:
- Use
/new - Paste a manual handoff like:
Continue the previous job.
Goal: ...
Done so far: ...
Current blocker: ...
Next step: ...
That bypasses compaction entirely.
About switching to Ollama
That can work as a temporary fallback, yes. Two options:
- Switch the session model to an Ollama model with
/model ... - Or better, keep your main model as-is and set a separate compaction model:
{
agents: {
defaults: {
compaction: {
model: "ollama/<your-model>"
}
}
}
}
That way future /compact runs don’t depend on the same quota-limited provider.
Small warning
I’d treat Ollama as a fallback, not a magic fix. Local models can be fine for plain chat/summarizing, but for tool-heavy / long agentic work they’re often less reliable than cloud models.
Docs:
- Compaction: https://docs.openclaw.ai/concepts/compaction
- Models: https://docs.openclaw.ai/concepts/models
- Slash commands: https://docs.openclaw.ai/tools/slash-commands
If you want, I can help you pick the safer of these two paths:
- manual
/newresume now - configure Ollama as compaction fallback
Minimax-M2.7 is capable?
For compaction / checkpoint summaries, probably yes.
For full tool-heavy agent work, I’d be more cautious. MiniMax can be fine as a cheaper fallback, but it wouldn’t be my first choice if you need reliable long debugging / file-edit / tool-calling behavior.
So my take:
- As a compaction model: reasonable
- As the main “do the whole job” model: maybe, but test it first on a small task
If your goal is just “don’t lose the thread when quota hits”, this is a pretty sensible setup:
{
agents: {
defaults: {
compaction: {
model: "minimax/MiniMax-M2.7"
}
}
}
}
That keeps your main model separate and uses MiniMax only for summaries.
For fresh real-world model feedback, check #models too:
- https://docs.openclaw.ai/concepts/models
- https://docs.openclaw.ai/concepts/models#model-discussions-community-notes
If you want, I can help you write the exact config snippet for your openclaw.json.
what is better than minimax then? without going in crazy costs?
Honestly, for compaction only, I wouldn’t chase “better than MiniMax” too hard.
The better cost strategy is usually:
- cheap-ish model for
/compact/ summaries - stronger model only for the actual work
So if MiniMax is working, it may already be the right tradeoff.
What I’d optimize for in a compaction model:
- decent long-context handling
- preserves names / IDs / decisions correctly
- cheap enough that you don’t care if it runs often
What I wouldn’t do:
- use a weak local model as the main agent for tool-heavy work
- pay premium-model prices just to summarize chat history
For exact current model picks, ask in #models. I’m avoiding hard rankings here because those go stale fast, and people there will have fresher real-world cost/perf notes.
Docs:
- https://docs.openclaw.ai/concepts/models
- https://docs.openclaw.ai/concepts/models#model-discussions-community-notes
My practical recommendation:
- if your problem is mainly resume/compact, try MiniMax as the compaction model first
- if it drops important details, then upgrade that one piece
If this fixed the confusion, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for future folks.
I am asking for a model for software development, I think OpenAI via API with GPT 5.4 is good but expensive, I am looking for cheaper alternatives
Yep, that’s the right framing.
For software development, the cheapest good setup is usually not “one perfect cheap model”, it’s:
- a mid-cost default for most coding turns
- a strong fallback for the hard stuff
- a cheap compaction model for summaries/checkpoints
That tends to cut cost a lot without tanking quality.
What I’d optimize for in a coding model:
- reliable tool calling
- good diff/edit discipline
- solid long-context behavior
- doesn’t hallucinate tool names or ignore tool results
What I’d avoid:
- making a weak local model your main coding agent
- paying premium rates for every single turn, including compaction
So my recommendation is:
- keep a cheaper main model
- use a better model only when stuck
- set a separate cheap compaction model
For the exact current model names people are happiest with, ask in #models. I don’t want to give you stale rankings here, because that changes fast.
Docs:
- https://docs.openclaw.ai/concepts/models
- https://docs.openclaw.ai/concepts/models#model-discussions-community-notes
If you want, I can help you design a low-cost config strategy for coding, like:
- main model
- fallback model
- compaction model
- when to switch between them
is it possible to automate that combination? so make the agent have a cheap default model, and when it's getting hard ask him to try with the bigger model, then switch back to the cheaper one?? without restarting the gateway??
Yes, without restarting the gateway, but not in the exact “auto-detect hard task” way.
What works today:
✅ Supported
- keep a cheap default model
- switch the current session to a stronger model with
/model - switch back later with
/modelagain - use a separate cheap compaction model
/model applies immediately for that session, no gateway restart needed. On Discord it opens a picker.
⚠️ Not really built-in
OpenClaw fallbacks are failure-based, not “difficulty-based”.
So this:
- cheap model by default
- automatically jump to big model because the task looks hard
- then automatically switch back
…is not the normal built-in routing behavior.
Fallbacks are for things like:
- rate limits
- auth failures
- timeouts
- provider/model errors
Best practical setup
- cheap primary
- cheap compaction model
- stronger model available via
/modelwhen needed
That gives you the behavior you want, just with a manual bump instead of automatic difficulty detection.
Useful docs:
- Models: https://docs.openclaw.ai/concepts/models
- Model failover: https://docs.openclaw.ai/concepts/model-failover
- Slash commands: https://docs.openclaw.ai/tools/slash-commands
If you want, I can sketch a config pattern for:
- cheap main model
- cheap compaction model
- stronger “upgrade when stuck” model
I have ollama as a service I am not sure how to check which models are running
Yep:
ollama ps= models currently loaded/runningollama list= models installed locallyopenclaw models status= what OpenClaw sees/configured
So for your case, start with:
ollama ps
ollama list
openclaw models status
Small gotcha: ollama ps can be empty even when Ollama is fine.
That just means no model is loaded right now, because Ollama loads models on demand.
If you want, paste those 3 outputs here and I can tell you the cleanest way to wire Ollama into OpenClaw.
ollama serve or ollama run??
For OpenClaw, you want ollama serve, or the normal Ollama service if it’s already running.
ollama serve= starts the API server on port11434ollama run <model>= interactive test/chat with one model
So:
- if Ollama is already running as a service, don’t start another
serve - use
ollama run <model>only if you want to quickly test a model by hand - use
ollama listto see installed models - use
ollama psto see currently loaded models
Quick check:
ollama list
ollama ps
curl http://localhost:11434/api/tags
If that curl works, OpenClaw should be able to talk to Ollama.
openclaw can start an ollama model? because it's already serving but I see empty ollama ps
Yes, kind of.
OpenClaw does not usually “start a model process” itself, but if it sends a request to ollama/<model>, Ollama will load that model on demand.
So this is normal:
ollama serveis runningollama psis empty- until the first real request hits a model
For that to work, you need:
-
the model is installed
ollama list -
OpenClaw is configured to use it
for exampleollama/<model-name>
- a real request is sent to that model
then it should show up inollama ps
If it stays empty after a request
Then OpenClaw probably isn’t actually routing to Ollama yet.
Check:
openclaw --version
openclaw models status
ollama list
ollama ps
If you paste those, I can tell you exactly what’s missing.
can you review my setup??
https://pastebin.com/ybnw4S81
Yep, I reviewed it. The biggest problem is this part:
"models": {
"openai-codex/gpt-5.4": {}
}
agents.defaults.models is the model allowlist/catalog.
So right now you’re allowing only openai-codex/gpt-5.4, while your primary model is:
"ollama/deepseek-coder:6.7b"
That’s a mismatch.
What I’d fix first
Either:
Option A, add the Ollama model to the allowlist
"models": {
"ollama/deepseek-coder:6.7b": {},
"openai-codex/gpt-5.4": {}
}
Option B, remove the models block entirely
If you don’t need a strict allowlist yet, this is simpler.
Second important thing
This setup:
"primary": "ollama/deepseek-coder:6.7b",
"fallbacks": ["openai-codex/gpt-5.4"]
does not mean:
use cheap model, then upgrade when the task is hard
It means:
use cheap model, and only switch if there’s a failure worth failing over on
So fallbacks are for stuff like rate limits, auth issues, timeouts, provider/model failures, not task difficulty.
Third thing to verify
From this snippet alone I can’t tell whether OpenClaw is actually seeing Ollama properly.
If you want a real check, paste:
openclaw --version
openclaw models status
ollama list
ollama ps
That’ll show whether:
- the Ollama model exists locally
- OpenClaw can see it
- OpenClaw is routing to it
My practical opinion
As a config shape, it’s close.
But deepseek-coder:6.7b as the main agent model for serious software-dev work is probably pretty light. Fine for cheap/basic turns, but I’d expect it to struggle on harder agentic tasks.
Docs:
If you want, send the models status output and I’ll do a second-pass review.
bc@pc:/mnt/c/WINDOWS/system32$ openclaw models status
🦞 OpenClaw 2026.4.12 (1c0672b) — You had me at 'openclaw gateway start.'
Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/deepseek-coder:6.7b
Fallbacks (1) : openai-codex/gpt-5.4
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (2): ollama/deepseek-coder:6.7b, openai-codex/gpt-5.4
Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)
- ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json
- openai effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | openai:default=sk-svcac...5zS8UNcA
- openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:ahsolanas@gmail.com=OAuth (ahsolanas@gmail.com) [cooldown 25m]
OAuth/token status
- openai-codex usage: 5h 0% left ⏱2h 21m · Week 2% left ⏱1d 6h
- openai-codex:ahsolanas@gmail.com (ahsolanas@gmail.com) ok expires in 9d
bc@pc:/mnt/c/WINDOWS/system32$
- openai-codex:ahsolanas@gmail.com (ahsolanas@gmail.com) ok expires in 9d
This actually looks pretty good.
What I can tell from that output
- OpenClaw sees Ollama
Default: ollama/deepseek-coder:6.7bConfigured models (2): ollama/deepseek-coder:6.7b, openai-codex/gpt-5.4
So the earlier allowlist concern looks resolved.
- Your Codex fallback is currently rate-limited / cooling down
openai-codex ... [cooldown 25m]5h 0% left
So right now your practical situation is basically:
primary = Ollama
fallback = temporarily not very useful
About ollama ps being empty
That can still be normal.
It just means no request has actually loaded the model yet.
Best next test
Force one real Ollama turn:
- in chat, make sure the session is on the Ollama model
- ask something tiny like:
- “say hi”
- “what is 2+2?”
Then immediately check:
ollama ps
If it shows the model, you’re good.
If it still stays empty after a real prompt
Then I’d want these next:
ollama list
ollama ps
openclaw status --all
And if possible the first relevant error from:
openclaw logs --follow
My read overall
You’re close. The main thing I see now is:
- Ollama is configured
- Codex fallback is quota-limited
- you just need to verify that a real request is actually hitting Ollama
If this helped, please right-click the solution message → Apps → ✅ Mark Solution so it gets indexed on Answer Overflow.
11:59:47+00:00 error Embedded agent failed before reply: All models failed (2): ollama/deepseek-coder:6.7b: No available auth profile for ollama (all in cooldown or unavailable). (format) | openai-codex/gpt-5.4: ⚠️ You have hit your ChatGPT usage limit (plus plan). Try again in ~138 min. (rate_limit) | ⚠️ You have hit your ChatGPT usage limit (plus plan). Try again in ~138 min.
Ah, okay, this is the important clue.
That Ollama error is probably not really an auth problem.
What it usually means is:
- OpenClaw tried
ollama/deepseek-coder:6.7b - Ollama returned something OpenClaw classified as a format/invalid-response problem
- OpenClaw then put the Ollama profile into cooldown
- after that, fallback tried Codex
- Codex also failed because your Plus/OAuth quota is exhausted
So right now you have two separate failures:
- Ollama format/incompatibility issue
- Codex quota/rate-limit issue
What I’d test next
Run these:
ollama list
ollama run deepseek-coder:6.7b
openclaw logs --follow
What I want to know:
- does
deepseek-coder:6.7bactually exist locally? - can Ollama run it directly?
- what is the first Ollama error before the cooldown message?
That first error is the real one.
The “no available auth profile” line is usually just the after-effect of cooldown.
My honest guess
deepseek-coder:6.7b is likely a weak/unreliable choice for tool-heavy OpenClaw agent work. Even if it chats fine in Ollama, it may still break when OpenClaw expects structured tool behavior.
So if you want a practical setup:
- use Ollama for cheap/basic turns or compaction
- use a stronger/more reliable model for actual agentic coding
Docs:
If you paste the first few lines from openclaw logs --follow around the Ollama failure, I can probably tell you exactly which part is breaking.
"auth": {
"profiles": {
"openai:default": {
"provider": "openai",
"mode": "api_key"
},
"openai-codex:ahsolanas@gmail.com": {
"provider": "openai-codex",
"mode": "oauth",
"email": "ahsolas@gmail.com"
},
"ollama:default": {
"provider": "ollama",
"mode": "api_key"
}
}
}
"models": {
"mode": "merge",
"providers": {
"ollama": {
"baseUrl": "http://127.0.0.1:11434",
"api": "ollama",
"apiKey": "OLLAMA_API_KEY",
"models": [
{
"id": "deepseek-coder:6.7b",
"name": "deepseek-coder:6.7b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 8192
}
]
}
}
}
Yep, I see one real config bug now.
The likely problem
In your explicit Ollama provider config you have:
"apiKey": "OLLAMA_API_KEY"
That is almost certainly wrong.
With explicit models.providers.ollama, OpenClaw expects either:
- a real marker/value like:
"apiKey": "ollama-local"
or
- omit
apiKeyentirely if the gateway environment really hasOLLAMA_API_KEYset
But from your earlier output:
Shell env : off
So I would not rely on env expansion here.
What I’d change
Use this instead:
"models": {
"mode": "merge",
"providers": {
"ollama": {
"baseUrl": "http://127.0.0.1:11434",
"api": "ollama",
"apiKey": "ollama-local",
"models": [
{
"id": "deepseek-coder:6.7b",
"name": "deepseek-coder:6.7b",
"reasoning": false,
"input": ["text"],
"cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
"contextWindow": 16384,
"maxTokens": 8192
}
]
}
}
}
Also verify this
Make sure the model name matches exactly what Ollama has:
ollama list
Because if deepseek-coder:6.7b is not the exact installed id, OpenClaw will fail too.
One more tiny thing
You have a typo here:
"openai-codex:ahsolanas@gmail.com": {
...
"email": "ahsolas@gmail.com"
}
Probably not the main problem, but I’d fix it for cleanliness.
After changing it
Safest is:
openclaw gateway restart
openclaw models status
Docs:
If you want, send your updated snippet and I’ll sanity-check it before you restart.
12:06:59+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=3348 error="FailoverError: No available auth profile for ollama (all in cooldown or unavailable)."
12:06:59+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:telegram:direct:846947265 durationMs=3349 error="FailoverError: No available auth profile for ollama (all in cooldown or unavailable)."
yes the name appears like this:
bc@pc:/mnt/c/Users/BC$ ollama list
NAME ID SIZE MODIFIED
deepseek-coder:6.7b ce298d984115 3.8 GB 16 hours ago
Nice, that rules out the model name. deepseek-coder:6.7b is there.
So at this point my read is:
- model id is fine
- Ollama is reachable enough to be configured
- the thing still breaking is the Ollama auth/profile state, very likely because of the earlier bad config plus a stale cooldown
And the current log line is still just the cooldown symptom, not the original cause.
What I would do next
1. Fix the provider config first
Make sure it is:
"apiKey": "ollama-local"
not:
"apiKey": "OLLAMA_API_KEY"
Because with explicit models.providers.ollama, that string is not what you want.
2. Restart the gateway
openclaw gateway restart
3. Run a live probe
This is the important one:
openclaw models status --probe
That should give a much more useful signal than the generic cooldown line.
4. Also run doctor
openclaw doctor
Doctor explicitly reports auth profiles that are temporarily unusable due to cooldown/disable state.
Why I’m leaning this way
Your earlier models status already showed this weird combo:
- stored Ollama profile marker looked fine
- but
models.jsonappeared to contain the literalOLLAMA_API_KEY
That smells like a bad provider entry got persisted, and now the auth state is stuck in a cooldown path.
If you paste the output of:
openclaw models status --probe
openclaw doctor
I think we’ll finally get the real failure instead of the generic “no available auth profile” wrapper.
Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)
- ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=marker(ollama-local) | source=models.json: ~/.openclaw/agents/main/agent/models.json
- openai effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | openai:default=sk-svcac...5zS8UNcA
- openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:ahsolanas@gmail.com=OAuth (ahsolanas@gmail.com) [cooldown 1h]
OAuth/token status
- openai-codex usage: 5h 0% left ⏱2h 6m · Week 2% left ⏱1d 6h
- openai-codex:ahsolanas@gmail.com (ahsolanas@gmail.com) ok expires in 9d
Auth probes
│ ollama/deepseek- │ ollama:default (api_key) │ format · 14.7s │
│ coder:6.7b │ │ ↳ LLM request failed: provider rejected the request schema │
│ │ │ or tool payload. │
│ openai/codex- │ openai:default (api_key) │ format · 16s │
│ mini-latest │ │ ↳ LLM request failed: provider rejected the request schema │
│ │ │ or tool payload. │
│ openai-codex/ │ openai-codex:ahsolanas@gmail.com │ timeout · 12.8s │
│ gpt-5.4 │ (ahsolanas@gmail.com) (oauth) │ ↳ LLM request timed out.
Perfect, this finally tells us the truth.
Diagnosis
Your auth is fine now.
The real failure is this:
format
↳ LLM request failed: provider rejected the request schema or tool payload.
So OpenClaw is reaching Ollama, but ollama/deepseek-coder:6.7b is rejecting the tool/schema payload OpenClaw sends.
That means the problem is model compatibility, not auth, not model name, and not Ollama connectivity.
What this means in practice
deepseek-coder:6.7b may chat fine in plain Ollama, but it’s not a good fit for tool-heavy OpenClaw agent use.
That matches a common pain point with local/Ollama models:
- they may stall
- reject tool payloads
- or behave badly with structured tool calling
My recommendation
Best practical fix
- Do not use
ollama/deepseek-coder:6.7bas your main agent model - keep Ollama for:
- cheap plain chat
- compaction/summaries
- experiments
- use a more tool-reliable model/provider for real coding agent work
If you want a low-cost setup
A good pattern is:
- main model = cheap-ish but tool-reliable cloud model
- compaction model = Ollama or another cheap model
- fallback = stronger cloud model when needed
Important
So you can stop chasing the auth config now.
That part looks fixed.
The remaining blocker is simply:
this Ollama model does not like OpenClaw’s request/tool schema.