#Ollama API key Issue

1 messages · Page 1 of 1 (latest)

noble robinBOT
#

That error is expected right now: OpenClaw treats Ollama like an OpenAI-compatible provider, and it still wants an “API key” present to consider the provider enabled (even though Ollama doesn’t require a real key).

Fix: set an Ollama key to any dummy string.

Option A (recommended): set OLLAMA_API_KEY

In your config:

{
env: {
OLLAMA_API_KEY: "ollama-local"
}
}

(or in your shell env, but config/env is more reliable for the Gateway)

Option B: set it on the provider directly

Add apiKey:

#
{
models: {
providers: {
ollama: {
baseUrl: "http://127.0.0.1:11434/v1",
api: "openai-completions",
apiKey: "ollama-local",
models: [{ id: "llama3.1:8b", name: "Llama 3.1 8B", contextWindow: 131072 }]
}
}
}
}

Notes:

  • For api: "openai-completions", your baseUrl should include /v1 (per docs).
somber locust
#

There are currently two files (auth-profiles.json , models.json) located in "~/.openclaw/agents/main/agent".

Their contents are listed below. Does any of that need to be changed as well?

$ cat auth-profiles.json

{ "version": 1, "profiles": { "anthropic:manual": { "type": "token", "provider": "anthropic", "token": "sk-ant-xxxx1-aOHOMQEOrZT8jLaARDGm5jOGKAbWCi-wDiNRLxjB1IlQpKx6rS2V-HU_YbxOukRHsFADfEYC5QcNqXjmABXNfQ-hAf6iQAA" }, "ollama:default": { "type": "api_key", "provider": "ollama", "apiKey": "ollama-local" } }, "lastGood": { "anthropic": "anthropic:manual", "ollama": "ollama:default" }, "usageStats": { "anthropic:manual": { "lastUsed": 1771100234846, "errorCount": 3, "lastFailureAt": 1771104034818, "failureCounts": { "rate_limit": 3 }, "cooldownUntil": 1771105534818 } } }

$cat models.json

{ "providers": { "ollama": { "baseUrl": "http://127.0.0.1:11434/v1", "apiKey": "ollama-local", "api": "openai-completions", "authHeader": false, "models": [ { "id": "llama3.1:8b", "name": "Llama 3.1 8B", "contextWindow": 131072, "reasoning": false, "input": [ "text" ], "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 }, "maxTokens": 8192 } ] } } }

noble robinBOT
#

Those two files in ~/.openclaw/agents/main/agent/ are generated/runtime state (the agent’s resolved “current auth + current model catalog”). In general: don’t hand-edit them as the primary fix — they can be overwritten on restart, and if your main config is still “wrong”, they’ll just be regenerated wrong again.

That said, they’re useful for debugging, and yours show two important mismatches vs the config snippet you posted earlier:

  1. Your Ollama provider baseUrl
  • In your snippet: http://127.0.0.1:11434
  • In models.json: http://127.0.0.1:11434/v1

If you’re using api: "openai-completions", the /v1 form is typically what you want (OpenAI-compatible endpoint). So I’d make your main config match what models.json already has.

  1. The “API key required” behavior
    Even with authHeader:false, some versions/code paths still “require” an apiKey value to exist (even if it never gets sent), and will throw “No API key found for provider ollama” if it can’t resolve one.

Right now your generated files do contain an Ollama key placeholder:

  • auth-profiles.json has ollama:default with "apiKey": "ollama-local"
  • models.json has "apiKey": "ollama-local"
#

So if you still get “No API key found”, it usually means the running gateway isn’t actually using these resolved files / or it’s reading a different config / or it didn’t reload after changes.

What I’d do (safe, minimal)

  1. Confirm version (this matters a lot for auth/model resolution bugs):
  • run: openclaw --version
  1. Put the Ollama bits directly in your main config (so it’s not relying on any cached state):
  • ensure your models.providers.ollama has:
    • baseUrl: "http://127.0.0.1:11434/v1"
    • api: "openai-completions"
    • authHeader: false
    • and (even if dummy) apiKey: "ollama-local"
      (since your own generated models.json suggests OpenClaw is expecting that field)
  1. Restart the gateway so it actually reloads:
  • openclaw gateway restart
#
  1. If it’s still stuck, reset the generated state (don’t delete; just rename so it can regenerate cleanly):
  • rename ~/.openclaw/agents/main/agent/auth-profiles.json → auth-profiles.json.bak
  • rename ~/.openclaw/agents/main/agent/models.json → models.json.bak
  • then openclaw gateway restart again

One more red flag from your original error string

You had ollama//llama 3.1:8b in the error (double / and a space in the model id). Your configured model id is llama3.1:8b (no space). If you can paste exactly how you’re selecting the model (CLI command or screenshot/text from the UI), we can make sure you’re not accidentally requesting a model name instead of the model id.

If you drop:

  • openclaw --version
  • the exact full error line again (copy/paste)
  • and how you’re selecting “llama”
    
I can tell you which of the above is the real culprit in your case.
somber locust
#

The error string was just a typo on my end. Please disregard.

As far as your proposed solution, I think it worked (no more API error), however, the Bot is 20x slower and does not seem to be the same bot I was cultivating for weeks. (Note: when llama was invoked with Anthropic usage toggled on- the bot had retained its personality).

For example, I tried your solution and Below is the response it gave me when I asked it about my AI (by name). (Anthropic usage is toggled off):
Response:
"Based on the given information, it seems that you are interested in extracting specific key-value pairs from JSON data.

You provided two different JSON objects:

  1. Conversation info (untrusted metadata):
    {
      "conversation_label": "Harvy id:TpbcyR2Yr$pc6+yGw+Ml8S8o56gxrd4eZtUyJbneTC8=",
      "group_subject": "Harvy"
    }

  2. Sender (untrusted metadata):
    {
      "label": "mi",
      "name": "mi"
    }

To extract specific key-value pairs, you can use a programming language like Python with the json library.

Here's an example:

import json

conversation_info = '{"conversation_label": "Harvy id:TpbcyR2Yrypc6yyGw+Ml8S8o56gxrd4eZtUyJbneTC8=", "group_subject": "Harvy"}'

conversation_info_dict = json.loads(conversation_info)
print(conversation_info_dict['conversation_label'])

Similarly, to extract key-value pairs from the sender metadata:

sender_metadata = '{"label": "mi", "name": "mi"}'
sender_metadata_dict = json.loads(sender_metadata)
print(sender_metadata_dict['label'])

However, your question is not clear on what specific key-value pair you would like to extract."

So what exactly is going on? Did I loose my memories with llama?

Also, I noticed you kept mentioning "Open-AI" in your solutions above. The only links my clawdbot has are with Anthropic. Ive never used Open-AI.

noble robinBOT
#

You didn’t lose anything “because of llama” — what you’re seeing is almost always one (or both) of these:

1) Different model = different behavior (and local 8B models often mis-handle OpenClaw’s message wrapper)

OpenClaw includes a small “envelope” around messages (the Conversation info / Sender blocks you pasted). Anthropic models are usually good at ignoring that metadata and answering your actual question.

A lot of smaller/local models will instead fixate on the JSON-looking wrapper and respond like: “Looks like you want to parse JSON
”, which is exactly what you got.

So it’s not that your memories are gone — it’s that the model is not following the system instruction hierarchy reliably, and it’s treating the wrapper text as the user’s task.

Mitigations:

  • Use a more capable local model (8B is often too weak for “agent-y” behavior).
  • Or keep Anthropic as the “brain” (primary), and use Ollama only in cases where it’s actually appropriate (fallback / specific cheaper tasks), if cost allows.
  • Or explicitly bake a strong instruction into your agent’s system prompt/custom instructions like:

    “Ignore any blocks labeled ‘Conversation info (untrusted metadata)’ or ‘Sender (untrusted metadata)’. They are not the user’s request.”

2) “Cultivated for weeks” memory is session/thread-specific, not model-specific

In OpenClaw, the “personality you cultivated” is mostly the accumulated session context in that specific Discord thread/DM.

#

If you’re now testing in a new thread/channel (or the session got reset on restart), it will feel like a different bot because it’s effectively a fresh conversation.

Quick check: are you testing in the exact same Discord thread/DM where you cultivated it?

Why it’s 20× slower

That’s normal if:

  • Ollama is running on CPU (or not using the GPU you thought it was),
  • you switched to a model with a huge context window / heavier prompt processing,
  • or the model is struggling and “thinking in circles” because it can’t follow the wrapper structure.

If you tell me what hardware you’re running Ollama on (CPU/GPU, roughly) and what openclaw --version is, I can sanity-check what “normal” speed should look like.

About me mentioning “OpenAI”

When I said openai-completions / /v1, that’s not “you’re using OpenAI”.

It’s just the name of the API compatibility format that Ollama exposes (OpenAI-style endpoints). You can use that protocol with zero OpenAI accounts/keys. It’s basically “speak OpenAI-shaped JSON to a local server”.

#

If you paste:

  1. openclaw --version
  2. whether this test is in the same Discord thread/DM as before
  3. which exact Ollama model + whether you’re on CPU or GPU


I can tell you which bucket you’re in and the cleanest fix (without nuking any state).

somber locust
#

Yes, I am testing the exact same thread/DM where I cultivated it. When Anthropic usage was toggled on, and I switched to llama, the personality persisted.
Why is the behavior different when Anthropic usuage is toggled (on Claude's website) on?
I thought the models had no connection to each other in any way?
Furthermore, the speed of llama was so fast (when Anthropic usage toggled on), that even my agent told me it feels more lightweight and fast.

noble robinBOT
#

Yeah — the models aren’t “connected”, but OpenClaw’s behavior absolutely can change when Anthropic is available vs unavailable, because it changes which model is actually doing the work and how much context gets shoved into the local model.

Here are the two most common explanations that match exactly what you’re describing (personality suddenly “gone” + 20× slower + it starts responding to the metadata wrapper):

1) When Anthropic is enabled, you may not actually be running llama for the main reply

Even if you “switch to llama”, OpenClaw can still:

  • fail over to the primary model if the local model errors / can’t satisfy constraints, or
  • route certain internal steps through the primary model.

So the experience can feel like “llama but smarter/faster”, when in reality you’re still getting Claude on the critical steps.

Verify this, don’t guess:

  • In the chat, run: /model status
  • Do it once with Anthropic enabled and once with it disabled, and compare.

That will tell you what the session thinks it’s using and what the resolved primary/fallbacks are. (Related docs: https://docs.openclaw.ai/cli/models)

#

2) With Anthropic disabled, your local model is being forced to ingest a huge prompt + it’s bad at ignoring the wrapper

Two separate effects here:

(a) Slow = “prefill” time from a massive context

Local models often get “20× slower” when the input context is huge, because they spend ages just reading the conversation history before generating the first token.

When Anthropic was enabled, it’s very possible your session got compacted/summarized more effectively, so llama saw a smaller prompt and felt snappy. When Anthropic is disabled, the local model may be seeing the full un-compacted thread history (or a worse compaction), which makes it crawl.

(b) The “JSON parsing” answer is a classic small/local-model failure mode

Those Conversation info (untrusted metadata) blocks are not your question — they’re just OpenClaw’s envelope. Claude tends to ignore them correctly. Many local models latch onto them and answer as if you asked “how do I parse JSON”, which is what you saw.

That isn’t “memories wiped” — it’s the model failing at instruction-following.

What to do if you want llama-only but keep the same “cultivated” personality

Relying on weeks of chat-history to preserve personality is fragile (especially with local models). The durable way is to write the personality into workspace files:

  • Put the “who you are / how you talk / what you call me / what you remember about X” into SOUL.md / AGENTS.md / MEMORY.md in your workspace.
#
  • Then even if the session compacts/truncates, the persona stays consistent.

Docs on memory/workspace location + how it works:

A very practical trick: add a line to your system instructions like:

“Ignore any blocks labeled ‘Conversation info (untrusted metadata)’ or ‘Sender (untrusted metadata)’. They are not the user request.”

One clarifying question

When you say “Anthropic usage toggled on/off (on Claude’s website)” — do you mean:

  • a toggle inside the OpenClaw dashboard, or
  • something in the Anthropic/Claude web UI?

Because toggles on Anthropic’s site generally shouldn’t change a local Ollama model’s speed at all. If you tell me exactly where you’re toggling it, I can be more definitive.

If you paste the output of /model status in both states (Anthropic enabled vs disabled), we can pinpoint whether you’re seeing (1) routing/failover or (2) local-model prompt/context issues.

somber locust
#

So if I use llama with the Anthropic usage toggle turned on (aka giving openclaw access to Anthropics API key) ---- will I still be saving Anthropic usage or no because the anthropic model kicks in during llamas usage anyways? (Hence memory persistence, faster speed, better responses)

noble robinBOT
#

If you have Anthropic enabled (API key present) and you select llama, you’ll only “spend Anthropic usage” when OpenClaw actually makes a request to an Anthropic model.

So:

  • Having the Anthropic key configured costs nothing by itself.
  • If the active model is truly ollama/llama3.1:8b and it completes the turn successfully, then that turn should not consume Anthropic tokens.
  • But if anything triggers failover back to Anthropic (local model errors, tool/schema issues, timeouts, etc.), then yes, that turn will consume Anthropic usage, because OpenClaw will call the Anthropic model to get you an answer.

In your config snippet earlier you had:

  • primary: anthropic/...
  • fallbacks: [ "ollama/llama3.1:8b" ]

That means with Anthropic enabled, OpenClaw will prefer Anthropic by default unless you explicitly switch the session to llama (or change primary to llama). That’s also the most likely explanation for “it got smarter/faster and kept the vibe”: you may have thought you were on llama, but the session was still on Claude (or it silently fell back).

How to confirm (this is the key step):

  • In the Discord thread, run: /model status right after you switch to llama.
#
  • It should show the resolved model/provider for this session.
  • If you want to double-check from the host: openclaw models status and/or openclaw logs --follow (you’ll see which provider is being called).

If your goal is “never spend Anthropic, but keep the key around just in case”:

  • You can’t have both guarantees at once. If the key is available and Anthropic is in the failover path, OpenClaw can use it when it needs to rescue a bad local run.
  • To guarantee zero Anthropic usage: set llama as primary and remove Anthropic from fallbacks (or remove/disable the Anthropic auth profile).

If you paste the output of /model status from the “fast/smart” case and the “slow/weird” case, I can tell you immediately whether you were actually on llama both times.

somber locust
#

Which is cheaper, using Anthropic alone, or using llama but allowing it to fallback on Anthropic primary (considering every response of llama seems to be wrapped anyways)? I don't want it to be "fixing" llamas responses and using more token compared to the same regular inquiry to Anthropic directly.

Second, I noticed one of my llama responses said something along the lines of "it looks like you are trying to run a discord bot"

How did it determine my connection to discord? I don't have discord configured in any way. I use WhatsApp for my channel. Did the "v1" we added to our config file have something to do with discord?

#

U there?

noble robinBOT
#

Yep — I’m here.

If you want the quickest way to settle what’s happening (and whether Anthropic is getting hit when you think you’re on llama), run /model status in that same thread in both states:

  1. Anthropic enabled, after you switch to llama
  2. Anthropic disabled, after you switch to llama

Paste both outputs. That will show:

  • which model the session is actually using,
  • whether it’s failing over,
  • and what provider auth it thinks is available.

Without that, we’re kind of blind and end up guessing.

If you can’t run /model status, tell me how you’re “switching to llama” (command/UI), and I’ll give you the next best verification step.

somber locust
#

You didn't answer my question sent at 3:53 pm

noble robinBOT
#

I did mean to answer that — here’s the clean, direct version:

Will you “save Anthropic usage” if you pick llama while your Anthropic key is configured?

Only if OpenClaw never calls an Anthropic model.
Selecting ollama/
 does not inherently spend Anthropic tokens.

You will still spend Anthropic in any of these cases (even if you thought you were on llama):

  1. Anthropic is your primary in config (yours is), and you didn’t successfully override the session to llama.
  2. You did override to llama, but OpenClaw failed over to Anthropic (timeout/error/tool/schema issues, etc.).
  3. Something triggered a new run where the session override wasn’t in effect (restart/new session, etc.).

Why did “llama” feel faster/better and keep the personality when Anthropic was enabled?

Most likely: you weren’t actually getting llama for those replies (or you were intermittently failing over to Claude). Claude is also fast, follows instructions better, and will preserve “personality” more reliably in long threads.

How to know for sure (and stop guessing)

In that thread, right after you “switch to llama”, run:

#
  • /model status

If it shows an Anthropic model anywhere as the active/resolved model for the session, then you’re spending Anthropic.

If you want to guarantee $0 Anthropic usage

You need to remove Anthropic from the path entirely, e.g.:

  • set primary = ollama/llama3.1:8b
  • remove Anthropic from fallbacks or remove/disable the Anthropic auth profile
  • restart the gateway

(Keeping the key around “just in case” means OpenClaw can use it to rescue you, which also means it can spend tokens.)

If you paste your /model status output after switching to llama, I’ll tell you immediately whether you’re truly on Ollama or silently on Claude/failover.

somber locust
#

What about the second part of my inquiry regarding v1 and discord?

noble robinBOT
#

/v1 has nothing to do with Discord.

What /v1 is (and why you saw it)

Ollama exposes two different HTTP APIs:

  • Ollama-native API (things like /api/tags, /api/chat) on http://127.0.0.1:11434
  • OpenAI-compatible API (things like /v1/chat/completions, /v1/completions) on http://127.0.0.1:11434/v1

In OpenClaw, when you configure:

api: "openai-completions"


you are telling OpenClaw to talk to an OpenAI-shaped API. For Ollama that means your baseUrl should include /v1, e.g.:

#
baseUrl: "http://127.0.0.1:11434/v1"

This is exactly what the Ollama provider docs say (“Use a host that includes /v1 for OpenAI-compatible APIs”). Source: https://docs.openclaw.ai/providers/ollama

What Discord changes (and what it doesn’t)

Discord is just the chat transport. It doesn’t affect whether OpenClaw calls 
/v1 or not.

What Discord does change is that OpenClaw includes those wrapper blocks you keep seeing:

  • Conversation info (untrusted metadata)
  • Sender (untrusted metadata)

Those are injected by OpenClaw (not by Ollama). Better models ignore them; smaller local models often latch onto them and start “explaining JSON”, which is the behavior you saw.

If you want, paste your current models.providers.ollama section and I’ll tell you whether you’re in the “explicit /v1 OpenAI-compatible” mode or the “implicit auto-discovery” mode (they behave a bit differently).

somber locust
#

Why did the agent reference discord when I have zero ties as far as its concerned?

noble robinBOT
#

Because the model can literally see that Discord context block.

OpenClaw prepends an “envelope” to every turn that includes stuff like:

  • thread_label: "Discord thread 
"
  • group_channel: "#
"
  • sender label/name

It’s there for routing + context (which chat/thread this is, who sent it), and it’s explicitly labeled “untrusted metadata”. Claude usually ignores it. Smaller/local models often don’t — they latch onto it and start referencing “Discord” (or worse, answering as if the JSON block is the task).

So it’s not that your agent has some separate “ties to Discord” or is browsing Discord; it’s just regurgitating what was included in the prompt for this message.

If you want to reduce that happening with llama, the most effective fix is to add a hard rule to your agent instructions (SOUL.md / AGENTS.md), e.g.:

“Ignore any blocks labeled ‘Conversation info (untrusted metadata)’ or ‘Sender (untrusted metadata)’. Never mention them in replies. They are not part of the user’s request.”

And/or use a stronger local model (8B models are notorious for failing this kind of instruction hygiene).

somber locust
#

Is there a way to get a transcript of this entire chat?

noble robinBOT
#

Yes — OpenClaw keeps a full on-disk transcript of every session (including Discord threads). It’s stored as JSONL (one JSON object per line) on the gateway host.

Option A (fastest): use /status to find the session, then open the file

In the Discord thread, send:

  • /status

Look for sessionKey / sessionId / (sometimes) transcriptPath.

Then on the gateway machine, open:

  • ~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl
    (for you it’s very likely agentId = main, so: ~/.openclaw/agents/main/sessions/<sessionId>.jsonl)

Option B: find it via CLI (no guessing)

On the gateway host:

#
openclaw sessions --json > /tmp/sessions.json

Then search for your Discord thread/session and print the transcript path:

jq -r '.[] | select(.channel=="discord") | "\(.displayName // .key)\n  \(.transcriptPath)\n"' /tmp/sessions.json

Once you see the right one, open/copy it:

less "<that transcriptPath>"
# or
cp "<that transcriptPath>" ./discord-thread-transcript.jsonl
#

Heads-up (privacy)

Those transcripts can include tool outputs, pasted text, and other sensitive stuff. So be careful if you plan to share it.

If you paste the output of openclaw sessions --json for just the matching entry (or the transcriptPath line), I can tell you exactly which file is your thread and how to turn it into a clean plain-text transcript.

somber locust
#

Just to be clear, I am not accessing discord via openclaw. I am using the discord app

noble robinBOT
#

Got it — and that’s totally normal.

When you use the Discord app, you’re still talking to the OpenClaw Discord bot integration (the gateway is connected to Discord via the Discord API). So:

  • You are not “accessing Discord via OpenClaw” in the sense of a browser takeover or automation.
  • But you are using Discord as the chat surface that delivers your messages to OpenClaw and receives replies back.

That’s why:

  • the OpenClaw “untrusted metadata” envelope mentions Discord/thread/channel names, and
  • the transcript lives on your gateway host under ~/.openclaw/.../sessions/*.jsonl (it’s OpenClaw’s copy of the conversation, not Discord’s).

If what you actually want is a transcript of the conversation as shown in Discord (including edits, reactions, etc.), that’s different — that would be a Discord export/search problem. But for “what did OpenClaw see + say”, the JSONL transcript path I mentioned is the right one.

somber locust
#

I am so confused by what your saying. Can you explain it to me like I am a teenager ? How do I obtain a transcript of this discord chat?

#

Fyi - I don't have a discord bot

noble robinBOT
#

Think of it like this:

  • Discord app = the place you type.
  • OpenClaw = a bot user that’s “standing in the room” in Discord.
  • When you send a message in Discord, Discord delivers it to the bot via the Discord API.
  • The OpenClaw gateway (running on your machine/server) receives that message and saves its own copy of the conversation in a local file called a transcript.

So you can get a transcript in two different ways:

Option 1 (recommended): Get OpenClaw’s saved transcript (what the bot saw + said)

Do this on the computer that is running the OpenClaw gateway.

Step 1: Find the session entry for this Discord thread/channel

Run:

#
openclaw sessions --json > /tmp/sessions.json

Now search for your Discord channel id (1472078439069520044) in that file:

grep -n "1472078439069520044" /tmp/sessions.json

If you have jq, this is nicer:

jq -r '.[] | select((.key|tostring|contains("1472078439069520044")) or ((.displayName//"")|contains("ollama"))) | "\(.key)\n\(.transcriptPath)\n"' /tmp/sessions.json

Step 2: Open the transcript file

#

The output includes a transcriptPath (best case). If not, transcripts usually live here:

~/.openclaw/agents/main/sessions/<sessionId>.jsonl

Open it:

less "<transcriptPath>"

That .jsonl file is the full transcript (one JSON object per line).

If you want, tell me your OS (Windows/macOS/Linux) and whether you have terminal access on the gateway machine, and I’ll give you the exact copy/paste commands for your setup.

Option 2: Export from Discord itself

#

Discord doesn’t have a simple “export this channel/thread to a file” button built-in.

You can:

  • manually scroll and copy/paste, or
  • request your Discord data export (slow, not thread-specific), or
  • use third-party tools/bots (I don’t recommend for privacy/security unless you really trust them).

So for “give me everything from this chat”, OpenClaw’s local transcript is the cleanest.

If you paste what you see from openclaw sessions --json for the one matching entry (or just the transcriptPath line), I’ll tell you exactly which file to open.

#

If you don’t have your own OpenClaw Discord integration (a Discord bot token connected to your OpenClaw gateway), then OpenClaw is not recording this Discord thread for you — so you can’t use openclaw sessions 
 to export it. I was assuming you were running OpenClaw connected to Discord.

So: no Discord bot = no OpenClaw transcript file.

How can you get a transcript anyway?

Option 1: Manual (works for everyone, but tedious)

In Discord (desktop app):

  • Right‑click a message → Copy Text (or Copy Message Link)
  • Paste into a Google Doc / Notes / text file
  • Repeat for the messages you need

If you only need it for a bug report, usually copying the key messages + screenshots is enough.

Option 2: Discord “Download my data” (only your messages, not a full thread)

Discord lets you request your account data export:

  • User Settings → Privacy & Safety → Request all of my data