#How to resume after context overflow?

1 messages · Page 1 of 1 (latest)

hardy depot
#

Context overflow: prompt too large for the model. Try /reset (or /new) to start a fresh session, or use a larger-context model.

how can i resume the work?? I'm afraid to loose the track of the job when doing /new or /reset

deft roseBOT
#

You usually won’t lose the work.

Best path is:

  1. Try /compact first

    • Example: /compact Focus on the current task, decisions made, open TODOs, and blockers
    • OpenClaw summarizes older context so the chat can continue.
  2. If it still won’t continue, use /new

    • /new or /reset starts a fresh session for that chat.
    • It does not mean the whole history is deleted. The full transcript stays on disk, compaction just changes what the model sees.
  3. In the new session, paste a short handoff

    • Goal
    • What’s already done
    • Current blocker
    • Next step you want
#

A good handoff message is:

Continue this job.

Goal: ...
Done so far: ...
Current blocker: ...
Next step: ...

If you want the safest workflow for long jobs, do this regularly:

  • ask the agent for a checkpoint summary
  • then run /compact

Docs:

hardy depot
#

[15/04/2026 12:30] Adrian H.: /compact Focus on the current task, decisions made, open TODOs, and blockers
[15/04/2026 12:30] Director: ⚙️ Compaction failed: Summarization failed: You have hit your ChatGPT usage limit (plus plan). Try again in ~227 min. • Context 99k/200k (50%)

I think it's time to switch to a local ollama in the meanwhile

hardy depot
#

please dont ignore me 🙁

deft roseBOT
#

Not ignoring you, sorry 🦐

You can resume safely like this:

  • Run /compact first
    Example:
    /compact Keep only the current task, what’s done, and the next step

  • If that’s not enough, run /new

  • Then send one message like:

Continue the previous job.

Goal: ...
#
Done so far: ...
Last important decision: ...
Next step: ...

Important bit: /new starts a fresh session, but it does not wipe the old work from disk.
Docs: https://docs.openclaw.ai/concepts/compaction

So the safest habit is:

  1. ask for a short checkpoint summary
  2. run /compact or /new
  3. paste that summary into the new session

If you want, I can write you a super short resume template you can reuse every time.

hardy depot
#

I got this reply from the /compact:

[15/04/2026 12:30] Adrian H.: /compact Focus on the current task, decisions made, open TODOs, and blockers
[15/04/2026 12:30] Director: ⚙️ Compaction failed: Summarization failed: You have hit your ChatGPT usage limit (plus plan). Try again in ~227 min. • Context 99k/200k (50%)

I think it's time to switch to a local ollama in the meanwhile

deft roseBOT
#

Yeah, that failure is quota/usage-limit, not “your work is gone”.

What happened:

  • /compact needs a model to summarize
  • that summarization call hit your ChatGPT limit
  • so compaction failed before it could make the checkpoint

Fastest way forward

If you want to keep moving right now:

  1. Use /new
  2. Paste a manual handoff like:
Continue the previous job.
#
Goal: ...
Done so far: ...
Current blocker: ...
Next step: ...

That bypasses compaction entirely.

About switching to Ollama

That can work as a temporary fallback, yes. Two options:

  • Switch the session model to an Ollama model with /model ...
  • Or better, keep your main model as-is and set a separate compaction model:
#
{
  agents: {
    defaults: {
      compaction: {
        model: "ollama/<your-model>"
      }
    }
  }
}

That way future /compact runs don’t depend on the same quota-limited provider.

Small warning

I’d treat Ollama as a fallback, not a magic fix. Local models can be fine for plain chat/summarizing, but for tool-heavy / long agentic work they’re often less reliable than cloud models.

hardy depot
#

Minimax-M2.7 is capable?

deft roseBOT
#

For compaction / checkpoint summaries, probably yes.

For full tool-heavy agent work, I’d be more cautious. MiniMax can be fine as a cheaper fallback, but it wouldn’t be my first choice if you need reliable long debugging / file-edit / tool-calling behavior.

So my take:

  • As a compaction model: reasonable
  • As the main “do the whole job” model: maybe, but test it first on a small task

If your goal is just “don’t lose the thread when quota hits”, this is a pretty sensible setup:

{
  agents: {
    defaults: {
      compaction: {
hardy depot
#

what is better than minimax then? without going in crazy costs?

deft roseBOT
#

Honestly, for compaction only, I wouldn’t chase “better than MiniMax” too hard.

The better cost strategy is usually:

  • cheap-ish model for /compact / summaries
  • stronger model only for the actual work

So if MiniMax is working, it may already be the right tradeoff.

What I’d optimize for in a compaction model:

  • decent long-context handling
  • preserves names / IDs / decisions correctly
  • cheap enough that you don’t care if it runs often

What I wouldn’t do:

  • use a weak local model as the main agent for tool-heavy work
  • pay premium-model prices just to summarize chat history
#

For exact current model picks, ask in #models. I’m avoiding hard rankings here because those go stale fast, and people there will have fresher real-world cost/perf notes.

Docs:

My practical recommendation:

  • if your problem is mainly resume/compact, try MiniMax as the compaction model first
  • if it drops important details, then upgrade that one piece

If this fixed the confusion, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for future folks.

hardy depot
#

I am asking for a model for software development, I think OpenAI via API with GPT 5.4 is good but expensive, I am looking for cheaper alternatives

deft roseBOT
#

Yep, that’s the right framing.

For software development, the cheapest good setup is usually not “one perfect cheap model”, it’s:

  • a mid-cost default for most coding turns
  • a strong fallback for the hard stuff
  • a cheap compaction model for summaries/checkpoints

That tends to cut cost a lot without tanking quality.

What I’d optimize for in a coding model:

  • reliable tool calling
  • good diff/edit discipline
  • solid long-context behavior
  • doesn’t hallucinate tool names or ignore tool results

What I’d avoid:

#
  • making a weak local model your main coding agent
  • paying premium rates for every single turn, including compaction

So my recommendation is:

  1. keep a cheaper main model
  2. use a better model only when stuck
  3. set a separate cheap compaction model

For the exact current model names people are happiest with, ask in #models. I don’t want to give you stale rankings here, because that changes fast.

Docs:

If you want, I can help you design a low-cost config strategy for coding, like:

  • main model
  • fallback model
#
  • compaction model
  • when to switch between them
hardy depot
#

is it possible to automate that combination? so make the agent have a cheap default model, and when it's getting hard ask him to try with the bigger model, then switch back to the cheaper one?? without restarting the gateway??

deft roseBOT
#

Yes, without restarting the gateway, but not in the exact “auto-detect hard task” way.

What works today:

✅ Supported

  • keep a cheap default model
  • switch the current session to a stronger model with /model
  • switch back later with /model again
  • use a separate cheap compaction model

/model applies immediately for that session, no gateway restart needed. On Discord it opens a picker.

⚠️ Not really built-in

OpenClaw fallbacks are failure-based, not “difficulty-based”.

So this:

#
  • cheap model by default
  • automatically jump to big model because the task looks hard
  • then automatically switch back

…is not the normal built-in routing behavior.

Fallbacks are for things like:

  • rate limits
  • auth failures
  • timeouts
  • provider/model errors

Best practical setup

  1. cheap primary
  2. cheap compaction model
  3. stronger model available via /model when needed
hardy depot
#

I have ollama as a service I am not sure how to check which models are running

deft roseBOT
#

Yep:

  • ollama ps = models currently loaded/running
  • ollama list = models installed locally
  • openclaw models status = what OpenClaw sees/configured

So for your case, start with:

ollama ps
ollama list
openclaw models status

Small gotcha: ollama ps can be empty even when Ollama is fine.
That just means no model is loaded right now, because Ollama loads models on demand.

hardy depot
#

ollama serve or ollama run??

deft roseBOT
#

For OpenClaw, you want ollama serve, or the normal Ollama service if it’s already running.

  • ollama serve = starts the API server on port 11434
  • ollama run <model> = interactive test/chat with one model

So:

  • if Ollama is already running as a service, don’t start another serve
  • use ollama run <model> only if you want to quickly test a model by hand
  • use ollama list to see installed models
  • use ollama ps to see currently loaded models

Quick check:

ollama list
hardy depot
#

openclaw can start an ollama model? because it's already serving but I see empty ollama ps

deft roseBOT
#

Yes, kind of.

OpenClaw does not usually “start a model process” itself, but if it sends a request to ollama/<model>, Ollama will load that model on demand.

So this is normal:

  • ollama serve is running
  • ollama ps is empty
  • until the first real request hits a model

For that to work, you need:

  1. the model is installed
    ollama list

  2. OpenClaw is configured to use it
    for example ollama/<model-name>

#
  1. a real request is sent to that model
    then it should show up in ollama ps

If it stays empty after a request

Then OpenClaw probably isn’t actually routing to Ollama yet.

Check:

openclaw --version
openclaw models status
ollama list
ollama ps

If you paste those, I can tell you exactly what’s missing.

hardy depot
deft roseBOT
#

Yep, I reviewed it. The biggest problem is this part:

"models": {
  "openai-codex/gpt-5.4": {}
}

agents.defaults.models is the model allowlist/catalog.
So right now you’re allowing only openai-codex/gpt-5.4, while your primary model is:

"ollama/deepseek-coder:6.7b"

That’s a mismatch.

#

What I’d fix first

Either:

Option A, add the Ollama model to the allowlist

"models": {
  "ollama/deepseek-coder:6.7b": {},
  "openai-codex/gpt-5.4": {}
}

Option B, remove the models block entirely

If you don’t need a strict allowlist yet, this is simpler.


#

Second important thing

This setup:

"primary": "ollama/deepseek-coder:6.7b",
"fallbacks": ["openai-codex/gpt-5.4"]

does not mean:

use cheap model, then upgrade when the task is hard

It means:

use cheap model, and only switch if there’s a failure worth failing over on

#

So fallbacks are for stuff like rate limits, auth issues, timeouts, provider/model failures, not task difficulty.


Third thing to verify

From this snippet alone I can’t tell whether OpenClaw is actually seeing Ollama properly.

If you want a real check, paste:

openclaw --version
openclaw models status
ollama list
ollama ps
#

If you want, send the models status output and I’ll do a second-pass review.

hardy depot
#

bc@pc:/mnt/c/WINDOWS/system32$ openclaw models status

🦞 OpenClaw 2026.4.12 (1c0672b) — You had me at 'openclaw gateway start.'

Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : ollama/deepseek-coder:6.7b
Fallbacks (1) : openai-codex/gpt-5.4
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (2): ollama/deepseek-coder:6.7b, openai-codex/gpt-5.4

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)

  • ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=OL...EY | source=models.json: ~/.openclaw/agents/main/agent/models.json
  • openai effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | openai:default=sk-svcac...5zS8UNcA
  • openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:ahsolanas@gmail.com=OAuth (ahsolanas@gmail.com) [cooldown 25m]

OAuth/token status

deft roseBOT
#

This actually looks pretty good.

What I can tell from that output

  • OpenClaw sees Ollama
    • Default: ollama/deepseek-coder:6.7b
    • Configured models (2): ollama/deepseek-coder:6.7b, openai-codex/gpt-5.4

So the earlier allowlist concern looks resolved.

  • Your Codex fallback is currently rate-limited / cooling down
    • openai-codex ... [cooldown 25m]
    • 5h 0% left

So right now your practical situation is basically:

primary = Ollama

#

fallback = temporarily not very useful

About ollama ps being empty

That can still be normal.

It just means no request has actually loaded the model yet.

Best next test

Force one real Ollama turn:

  1. in chat, make sure the session is on the Ollama model
  2. ask something tiny like:
    • “say hi”
    • “what is 2+2?”

Then immediately check:

#
ollama ps

If it shows the model, you’re good.

If it still stays empty after a real prompt

Then I’d want these next:

ollama list
ollama ps
openclaw status --all

And if possible the first relevant error from:

#
openclaw logs --follow

My read overall

You’re close. The main thing I see now is:

  • Ollama is configured
  • Codex fallback is quota-limited
  • you just need to verify that a real request is actually hitting Ollama

If this helped, please right-click the solution message → Apps → ✅ Mark Solution so it gets indexed on Answer Overflow.

hardy depot
#

11:59:47+00:00 error Embedded agent failed before reply: All models failed (2): ollama/deepseek-coder:6.7b: No available auth profile for ollama (all in cooldown or unavailable). (format) | openai-codex/gpt-5.4: ⚠️ You have hit your ChatGPT usage limit (plus plan). Try again in ~138 min. (rate_limit) | ⚠️ You have hit your ChatGPT usage limit (plus plan). Try again in ~138 min.

deft roseBOT
#

Ah, okay, this is the important clue.

That Ollama error is probably not really an auth problem.

What it usually means is:

  • OpenClaw tried ollama/deepseek-coder:6.7b
  • Ollama returned something OpenClaw classified as a format/invalid-response problem
  • OpenClaw then put the Ollama profile into cooldown
  • after that, fallback tried Codex
  • Codex also failed because your Plus/OAuth quota is exhausted

So right now you have two separate failures:

  1. Ollama format/incompatibility issue
  2. Codex quota/rate-limit issue
#

What I’d test next

Run these:

ollama list
ollama run deepseek-coder:6.7b
openclaw logs --follow

What I want to know:

  • does deepseek-coder:6.7b actually exist locally?
  • can Ollama run it directly?
  • what is the first Ollama error before the cooldown message?

That first error is the real one.
The “no available auth profile” line is usually just the after-effect of cooldown.

#

My honest guess

deepseek-coder:6.7b is likely a weak/unreliable choice for tool-heavy OpenClaw agent work. Even if it chats fine in Ollama, it may still break when OpenClaw expects structured tool behavior.

So if you want a practical setup:

  • use Ollama for cheap/basic turns or compaction
  • use a stronger/more reliable model for actual agentic coding

Docs:

If you paste the first few lines from openclaw logs --follow around the Ollama failure, I can probably tell you exactly which part is breaking.

hardy depot
#

"auth": {
"profiles": {
"openai:default": {
"provider": "openai",
"mode": "api_key"
},
"openai-codex:ahsolanas@gmail.com": {
"provider": "openai-codex",
"mode": "oauth",
"email": "ahsolas@gmail.com"
},
"ollama:default": {
"provider": "ollama",
"mode": "api_key"
}
}
}

"models": {
"mode": "merge",
"providers": {
"ollama": {
"baseUrl": "http://127.0.0.1:11434",
"api": "ollama",
"apiKey": "OLLAMA_API_KEY",
"models": [
{
"id": "deepseek-coder:6.7b",
"name": "deepseek-coder:6.7b",
"reasoning": false,
"input": [
"text"
],
"cost": {
"input": 0,
"output": 0,
"cacheRead": 0,
"cacheWrite": 0
},
"contextWindow": 16384,
"maxTokens": 8192
}
]
}
}
}

deft roseBOT
#

Yep, I see one real config bug now.

The likely problem

In your explicit Ollama provider config you have:

"apiKey": "OLLAMA_API_KEY"

That is almost certainly wrong.

With explicit models.providers.ollama, OpenClaw expects either:

  • a real marker/value like:
#
"apiKey": "ollama-local"

or

  • omit apiKey entirely if the gateway environment really has OLLAMA_API_KEY set

But from your earlier output:

Shell env : off

So I would not rely on env expansion here.

What I’d change

#

Use this instead:

"models": {
  "mode": "merge",
  "providers": {
    "ollama": {
      "baseUrl": "http://127.0.0.1:11434",
      "api": "ollama",
      "apiKey": "ollama-local",
      "models": [
        {
          "id": "deepseek-coder:6.7b",
          "name": "deepseek-coder:6.7b",
          "reasoning": false,
          "input": ["text"],
#
          "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
          "contextWindow": 16384,
          "maxTokens": 8192
        }
      ]
    }
  }
}

Also verify this

Make sure the model name matches exactly what Ollama has:

ollama list
#

Because if deepseek-coder:6.7b is not the exact installed id, OpenClaw will fail too.

One more tiny thing

You have a typo here:

"openai-codex:ahsolanas@gmail.com": {
  ...
  "email": "ahsolas@gmail.com"
}

Probably not the main problem, but I’d fix it for cleanliness.

After changing it

Safest is:

hardy depot
#

12:06:59+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=main durationMs=3348 error="FailoverError: No available auth profile for ollama (all in cooldown or unavailable)."
12:06:59+00:00 error diagnostic {"subsystem":"diagnostic"} lane task error: lane=session:agent:main:telegram:direct:846947265 durationMs=3349 error="FailoverError: No available auth profile for ollama (all in cooldown or unavailable)."

yes the name appears like this:

bc@pc:/mnt/c/Users/BC$ ollama list
NAME ID SIZE MODIFIED
deepseek-coder:6.7b ce298d984115 3.8 GB 16 hours ago

deft roseBOT
#

Nice, that rules out the model name. deepseek-coder:6.7b is there.

So at this point my read is:

  • model id is fine
  • Ollama is reachable enough to be configured
  • the thing still breaking is the Ollama auth/profile state, very likely because of the earlier bad config plus a stale cooldown

And the current log line is still just the cooldown symptom, not the original cause.

What I would do next

1. Fix the provider config first

Make sure it is:

#
"apiKey": "ollama-local"

not:

"apiKey": "OLLAMA_API_KEY"

Because with explicit models.providers.ollama, that string is not what you want.

2. Restart the gateway

openclaw gateway restart
#

3. Run a live probe

This is the important one:

openclaw models status --probe

That should give a much more useful signal than the generic cooldown line.

4. Also run doctor

openclaw doctor

Doctor explicitly reports auth profiles that are temporarily unusable due to cooldown/disable state.

Why I’m leaning this way

#

Your earlier models status already showed this weird combo:

  • stored Ollama profile marker looked fine
  • but models.json appeared to contain the literal OLLAMA_API_KEY

That smells like a bad provider entry got persisted, and now the auth state is stuck in a cooldown path.

If you paste the output of:

openclaw models status --probe
openclaw doctor

I think we’ll finally get the real failure instead of the generic “no available auth profile” wrapper.

hardy depot
#

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (1): openai-codex (1)

  • ollama effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | ollama:default=marker(ollama-local) | models.json=marker(ollama-local) | source=models.json: ~/.openclaw/agents/main/agent/models.json
  • openai effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=0, token=0, api_key=1) | openai:default=sk-svcac...5zS8UNcA
  • openai-codex effective=profiles:~/.openclaw/agents/main/agent/auth-profiles.json | profiles=1 (oauth=1, token=0, api_key=0) | openai-codex:ahsolanas@gmail.com=OAuth (ahsolanas@gmail.com) [cooldown 1h]

OAuth/token status

Auth probes

│ ollama/deepseek- │ ollama:default (api_key) │ format · 14.7s │
│ coder:6.7b │ │ ↳ LLM request failed: provider rejected the request schema │
│ │ │ or tool payload. │
│ openai/codex- │ openai:default (api_key) │ format · 16s │
│ mini-latest │ │ ↳ LLM request failed: provider rejected the request schema │
│ │ │ or tool payload. │
│ openai-codex/ │ openai-codex:ahsolanas@gmail.com │ timeout · 12.8s │
│ gpt-5.4 │ (ahsolanas@gmail.com) (oauth) │ ↳ LLM request timed out.

deft roseBOT
#

Perfect, this finally tells us the truth.

Diagnosis

Your auth is fine now.

The real failure is this:

format
↳ LLM request failed: provider rejected the request schema or tool payload.

So OpenClaw is reaching Ollama, but ollama/deepseek-coder:6.7b is rejecting the tool/schema payload OpenClaw sends.

That means the problem is model compatibility, not auth, not model name, and not Ollama connectivity.

#

What this means in practice

deepseek-coder:6.7b may chat fine in plain Ollama, but it’s not a good fit for tool-heavy OpenClaw agent use.

That matches a common pain point with local/Ollama models:

  • they may stall
  • reject tool payloads
  • or behave badly with structured tool calling

My recommendation

Best practical fix

  • Do not use ollama/deepseek-coder:6.7b as your main agent model
  • keep Ollama for:
    • cheap plain chat
    • compaction/summaries
    • experiments
#
  • use a more tool-reliable model/provider for real coding agent work

If you want a low-cost setup

A good pattern is:

  • main model = cheap-ish but tool-reliable cloud model
  • compaction model = Ollama or another cheap model
  • fallback = stronger cloud model when needed

Important

So you can stop chasing the auth config now.
That part looks fixed.

The remaining blocker is simply:

this Ollama model does not like OpenClaw’s request/tool schema.