#http 404 file not found


stuck cradleBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

spring totemBOT
#

A 404 here usually means OpenClaw is hitting an endpoint that your llama.cpp server isn’t actually serving (wrong base URL prefix, wrong path like /chat/completions, or the model id doesn’t match what the server exposes).

A couple quick checks to pinpoint it:

  1. What OpenClaw version?
openclaw --version
  2. Does your llama.cpp server actually have the OpenAI-style routes at that URL?
    Run these on the same machine where the OpenClaw gateway runs:
curl -i http://127.0.0.1:8003/v1/models
  • If that is 404, your baseUrl is wrong for how llama.cpp was started (some setups serve OpenAI compat at / not /v1, or behind a different prefix/port).
  3. Does the chat completions route exist?
curl -i http://127.0.0.1:8003/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"PUT_MODEL_ID_FROM_/v1/models_HERE","messages":[{"role":"user","content":"hi"}]}'
  • If this is 404 but /v1/completions works, you’re on a llama.cpp build/config that doesn’t serve chat/completions (OpenClaw’s openai-completions API mode expects the chat-style endpoint).
  4. Model id mismatch is common with llama.cpp
    Whatever /v1/models returns as the model id must match your config id, or you need to start llama.cpp with an alias so it matches qwen3.5-35b.
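
The checks above amount to a small decision table. A minimal Python sketch, assuming you have already noted the HTTP status codes from the curl probes; the helper name `diagnose` and its messages are illustrative only and are not part of OpenClaw or llama.cpp:

```python
from typing import Optional


def diagnose(models_status: int, chat_status: int,
             completions_status: Optional[int] = None) -> str:
    """Map probe status codes to the likely cause described in the checks above.

    Hypothetical helper. The arguments are the status codes returned by
    GET /v1/models, POST /v1/chat/completions and (optionally)
    POST /v1/completions.
    """
    if models_status == 404:
        return "baseUrl prefix or port is wrong for this server"
    if chat_status == 404:
        if completions_status == 200:
            return "server has /v1/completions but no chat route"
        return "chat route missing; check server build and flags"
    if chat_status == 200:
        return "routes exist; compare the configured model id with /v1/models"
    return f"unexpected status {chat_status}; inspect the response body"


print(diagnose(200, 404, 200))
```

Anything the function does not recognise falls through to the last branch, which is a reminder to read the response body rather than guess.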

Two config notes from your snippet

  • For llama.cpp, I’d set this model to text only unless you know your server supports vision:
    • change input: ["text","image"] → input: ["text"]
    • and don’t point agents.defaults.imageModel at it (browser/tooling flows often require a real vision-capable model)
  • Your baseUrl may need to be either:
    • http://127.0.0.1:8003/v1 or
    • http://127.0.0.1:8003
      depending on how llama.cpp is serving the OpenAI routes (the curl tests above tell you which).

Docs that match this setup: https://docs.openclaw.ai/gateway/local-models and the custom provider section in the config reference: https://docs.openclaw.ai/gateway/configuration-reference

If you paste:

  • openclaw --version
  • the output of curl -i .../v1/models
  • and the exact 404 line from openclaw logs --tail 100


I can tell you exactly which URL OpenClaw is calling and what to change.

hollow rapids
#

curl -i http://127.0.0.1:8003/v1/models

Keep-Alive: timeout=5, max=10

Content-Type: application/json; charset=utf-8

Server:

Llama Content

Access-Control-Allow-Origin:

L{"name": "Qwe ':"Qwen3.5-27B-Q4_K_M.gguf _at":"","size"-"","digest"-"", "type":"model", ities": ["completion"], "parameters":"# "details": {"parer quantization_level":}], "object": "List", "data": [f"id": "Qwen3.5-278-04 K ,"created":1772164446, "owned_by: "llamacpp", "meta": ("vocab_type":2, "n vocab": 248320 ctx wen 6 0d K.gguf, "object*: "model 5120,"n_params":26 , "size":16729819136curl -i http://127.0.0.1:8003/v1/models'C

spring totemBOT
#

/v1/models working means your baseUrl is reachable. The two big red flags in what you pasted are:

  1. Your llama.cpp server is advertising “completion” capability only
    Your /v1/models output shows something like capabilities: ["completion"]. In llama.cpp that often means it serves /v1/completions but not /v1/chat/completions.

OpenClaw’s api: "openai-completions" expects the chat endpoint (POST /v1/chat/completions). If that route isn’t there, you’ll get a 404.

Can you run this and tell me the status code?

curl -i http://127.0.0.1:8003/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model":"whatever","messages":[{"role":"user","content":"hi"}]}'
  • If 404 → that’s the issue: llama.cpp isn’t running in chat mode / doesn’t expose chat-completions. The fix is to enable chat in llama.cpp (chat template/jinja, newer build), or to use a proxy or alternative server (LiteLLM, LM Studio, vLLM) that provides /v1/chat/completions.
  2. Your model id doesn’t match what llama.cpp reports
    You configured:
{id: "qwen3.5-35b"}

But /v1/models is returning an id like Qwen3.5-27B-Q4_K_M... (gguf-ish name). OpenClaw sends the configured id to the server; if the two don’t match, the request fails (usually 400/404 depending on the server).

Run:

curl -s http://127.0.0.1:8003/v1/models | jq


and copy the exact .data[0].id, then set your OpenClaw model id to that exact string.


Also: don’t set this as imageModel

That gguf is not a vision model. Set it to text only and don’t point imageModel at it:

models: {
  mode: "merge",
  providers: {
    llamacpp: {
      baseUrl: "http://127.0.0.1:8003/v1",
      apiKey: "llamacpp",
      api: "openai-completions",
      models: [
        {
          id: "<EXACT_ID_FROM_/v1/models>",
          name: "Qwen (llama.cpp)",
          reasoning: false,
          input: ["text"],
          cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
          contextWindow: 8192,
          maxTokens: 4096
        }
      ]
    }
  }
},
agents: {
  defaults: {
    model: { primary: "llamacpp/<EXACT_ID_FROM_/v1/models>" }
    // remove imageModel for now
  }
}

Docs reference: https://docs.openclaw.ai/gateway/local-models (custom OpenAI-compatible endpoints)

#

If you paste the exact 404 line from openclaw logs --tail 100 (it usually includes the URL path), I can tell you immediately whether it’s failing on /v1/chat/completions vs “unknown model id”.

hollow rapids
#

curl to v1/chat/completions works

spring totemBOT
#

If /v1/chat/completions works via curl, OpenClaw can talk to that server — so the 404 is almost certainly one of these:

1) Model id mismatch (most common)

Your config uses:

"id": "qwen3.5-35b"


but your llama.cpp /v1/models is advertising an id based on the GGUF name (looks like Qwen3.5-27B-Q4_K_M... in what you pasted).

OpenClaw sends exactly the configured id as the model field. If the server doesn’t know that id, you’ll get an error (often 404/400 depending on server).

Fix: grab the exact id and use it in OpenClaw:

curl -s http://127.0.0.1:8003/v1/models | jq -r '.data[].id'

Then set:

  • models.providers.llamacpp.models[0].id = "<EXACT_ID>"
  • and agents.defaults.model.primary = "llamacpp/<EXACT_ID>"
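
The id comparison is exact and case-sensitive, which is easy to get wrong by eye. A minimal Python sketch of the check, assuming the OpenAI-style response shape `{"data": [{"id": ...}]}`; the helper name `find_model_id` and the trimmed sample response are illustrative:

```python
import json


def find_model_id(models_response: dict, configured_id: str):
    """Return (exact_match, candidates) for a /v1/models-style response.

    Assumes the OpenAI-style shape {"data": [{"id": ...}, ...]}.
    """
    ids = [entry["id"] for entry in models_response.get("data", []) if "id" in entry]
    return configured_id in ids, ids


# Sample response trimmed to the fields used here (illustrative values).
sample = json.loads('{"object": "list", "data": [{"id": "Qwen3.5-27B-Q4_K_M.gguf"}]}')

print(find_model_id(sample, "qwen3.5-35b"))
print(find_model_id(sample, "Qwen3.5-27B-Q4_K_M.gguf"))
```

When the first element is False, the second element lists the strings the server would accept as the model id.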

2) You’re claiming vision support (but your llama.cpp model probably doesn’t have it)

You set:

input: ["text","image"]
imageModel.primary: "llamacpp/qwen3.5-35b"

If OpenClaw ever tries to do image understanding (attachments / image tool), it will send image payloads to llama.cpp, which typically breaks.

Fix for now:

  • change input to ["text"]
  • remove agents.defaults.imageModel (or point it to a real vision model provider)

3) Sanity check: make sure OpenClaw isn’t calling /v1/responses

If api accidentally ends up as openai-responses, OpenClaw will hit /v1/responses and llama.cpp will 404.

#

Confirm what OpenClaw is actually using:

openclaw config get models.providers.llamacpp.api
openclaw config get models.providers.llamacpp.baseUrl

After any change: restart the gateway (openclaw gateway restart).

If it still 404s after fixing the model id, paste the single log line that contains the 404 URL from:

openclaw logs --tail 200

That will tell us exactly which path is missing.

hollow rapids
#

base url is for dashboard in llamacpp

spring totemBOT
#

Yep — that’s the key distinction:

  • the llama.cpp dashboard URL is just the web UI
  • OpenClaw’s models.providers.llamacpp.baseUrl must point at the OpenAI-compatible API base, because OpenClaw will call:
    • GET <baseUrl>/models
    • POST <baseUrl>/chat/completions

So if your curl to http://127.0.0.1:8003/v1/chat/completions works, then the baseUrl format is fine.

At that point the usual remaining cause of a “404 when OpenClaw calls it” is:

A) OpenClaw is sending a model id the server doesn’t know

You configured id: "qwen3.5-35b", but llama.cpp is likely exposing a different data[].id (often the GGUF filename).

Fix: set your OpenClaw model id to match llama.cpp exactly:

curl -s http://127.0.0.1:8003/v1/models | jq -r '.data[].id'

Then update:

  • models.providers.llamacpp.models[0].id = "<THAT EXACT STRING>"
  • agents.defaults.model.primary = "llamacpp/<THAT EXACT STRING>"

(Alternative: configure llama.cpp to expose an alias that equals qwen3.5-35b, so you don’t have to use the long GGUF id.)

B) Don’t use llama.cpp as imageModel (for now)

Unless you’re sure it supports image inputs, remove:

agents.defaults.imageModel.primary

and set the model’s input to ["text"]. Otherwise attachment/image flows can blow up in weird ways.

#

C) Make sure the URL is reachable from the gateway host

If OpenClaw is running on a different machine/container, 127.0.0.1 points to that machine, not your llama.cpp box. In that case use your LAN/Tailscale IP in baseUrl.
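
A quick way to spot this case is to check whether the configured baseUrl points at a loopback address. A minimal Python sketch using only the standard library; the helper name `points_at_loopback` is illustrative, and it does not resolve DNS names:

```python
import ipaddress
from urllib.parse import urlparse


def points_at_loopback(base_url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP literal."""
    host = urlparse(base_url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a DNS name); cannot tell without resolving it.
        return False


print(points_at_loopback("http://127.0.0.1:8003/v1"))
```

If this returns True and the gateway runs in a container or on another host, the gateway is talking to itself rather than to the llama.cpp machine.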

If you paste the single OpenClaw log line that contains the 404 URL (from openclaw logs --tail 200), I can tell you exactly whether it’s failing on “unknown model id” vs hitting the wrong host/path.

hollow rapids
#

04:24:49 debug agent/embedded {"subsystem":"agent/embedded"} embedded run start: runId=65197cb5-1396-4728-a9ef-f5e36a426f21 sessionId=bef82e57-c621-4224-9861-558cd125bc90 provider=llamacpp model=Qwen3.5-27B-Q4_K_M.gguf thinking=off messageChannel=webchat
04:24:49 debug diagnostic {"subsystem":"diagnostic"} run registered: sessionId=bef82e57-c621-4224-9861-558cd125bc90 totalActive=1
04:24:49 debug agent/embedded {"subsystem":"agent/embedded"} embedded run prompt start: runId=65197cb5-1396-4728-a9ef-f5e36a426f21 sessionId=bef82e57-c621-4224-9861-558cd125bc90
04:24:49 debug agent/embedded {"subsystem":"agent/embedded"} embedded run agent start: runId=65197cb5-1396-4728-a9ef-f5e36a426f21
04:24:49 warn agent/embedded {"subsystem":"agent/embedded"} embedded run agent end: runId=65197cb5-1396-4728-a9ef-f5e36a426f21 isError=true error=HTTP 404: File Not Found
04:24:49 debug agent/embedded {"subsystem":"agent/embedded"} embedded run prompt end: runId=65197cb5-1396-4728-a9ef-f5e36a426f21 sessionId=bef82e57-c621-4224-9861-558cd125bc90 durationMs=8
04:24:49 debug diagnostic {"subsystem":"diagnostic"} run cleared: sessionId=bef82e57-c621-4224-9861-558cd125bc90 totalActive=0
04:24:49 debug agent/embedded {"subsystem":"agent/embedded"} embedded run done: runId=65197cb5-1396-4728-a9ef-f5e36a426f21 sessionId=bef82e57-c621-4224-9861-558cd125bc90 durationMs=39 aborted=false
04:24:49 debug diagnostic {"subsystem":"diagnostic"} lane task done: lane=main durationMs=40 active=0 queued=0
04:24:49 debug diagnostic {"subsystem":"diagnostic"} lane task done: lane=session:agent:main:main durationMs=42 active=0 queued=0

spring totemBOT
#

That log line is super useful — you’re past the “wrong model id” problem (OpenClaw is already using model=Qwen3.5-27B-Q4_K_M.gguf).

At this point, a very common cause of llama.cpp returning HTTP 404: File Not Found (while your manual curl to /v1/chat/completions works) is URL joining: OpenClaw/pi-ai may be hitting:

  • http://127.0.0.1:8003/chat/completions ← missing /v1 → llama.cpp returns 404

You can confirm by trying:

curl -i http://127.0.0.1:8003/chat/completions

(if that returns the same 404, we found it)

Fix (workaround): add trailing slash to baseUrl

Change:

baseUrl: "http://127.0.0.1:8003/v1"

to:

baseUrl: "http://127.0.0.1:8003/v1/"

Then restart the gateway:

openclaw gateway restart
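
For background on why a trailing slash can matter: standard relative-URL resolution replaces the last path segment of a base that has no trailing slash. The Python sketch below shows that behaviour with `urllib.parse.urljoin`. Whether OpenClaw or pi-ai builds its request URL this way is not confirmed anywhere in this thread, so treat it as an illustration of the mechanism only:

```python
from urllib.parse import urljoin

# Without a trailing slash, "v1" is treated as the last segment and is replaced.
without_slash = urljoin("http://127.0.0.1:8003/v1", "chat/completions")

# With a trailing slash, "v1/" is kept as a directory and the path is appended.
with_slash = urljoin("http://127.0.0.1:8003/v1/", "chat/completions")

print(without_slash)  # http://127.0.0.1:8003/chat/completions
print(with_slash)     # http://127.0.0.1:8003/v1/chat/completions
```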

Also: don’t set llama.cpp as imageModel yet

Unless you’re running an actual vision model behind that endpoint, remove:

agents.defaults.imageModel

and set the model input to ["text"] (otherwise image/attachment flows can break later).

#

If after adding the trailing slash it still 404s, paste the output of:

openclaw models status

(it prints the resolved provider baseUrl + api mode OpenClaw thinks it’s using).

hollow rapids
#

Config : ~/.openclaw/openclaw.json
Agent dir : ~/.openclaw/agents/main/agent
Default : llamacpp/Qwen3.5-27B-Q4_K_M.gguf
Fallbacks (0) : -
Image model : -
Image fallbacks (0): -
Aliases (0) : -
Configured models (0): all

Auth overview
Auth store : ~/.openclaw/agents/main/agent/auth-profiles.json
Shell env : off
Providers w/ OAuth/tokens (0): -

  • llamacpp effective=models.json:ll...al | models.json=ll...al | source=models.json: ~/.openclaw/agents/main/agent/models.json
  • ollama effective=models.json:ol...al | models.json=ol...al | source=models.json: ~/.openclaw/agents/main/agent/models.json

OAuth/token status

  • none

    "llamacpp": {
      "baseUrl": "http://127.0.0.1:8003/v1/",
      "apiKey": "__OPENCLAW_REDACTED__",
      "api": "openai-completions",
      "models": [
        {
          "id": "Qwen3.5-27B-Q4_K_M.gguf",
          "name": "Qwen 3.5 27B (llama.cpp)",
          "api": "openai-completions",
          "reasoning": false,
          "input": [
            "text"
          ],
          "cost": {
            "input": 0,
            "output": 0,
            "cacheRead": 0,
            "cacheWrite": 0
          },
          "contextWindow": 50176,
          "maxTokens": 32768
        }
      ]
    }
    

    }
    },
    "agents": {
    "defaults": {
    "model": {
    "primary": "llamacpp/Qwen3.5-27B-Q4_K_M.gguf"
    },
    "workspace": "/home/techdev/.openclaw/workspace",
    "subagents": {
    "maxConcurrent": 8
    }
    },

still 404 not found

spring totemBOT
#

That openclaw models status output looks sane now (right model id, right baseUrl, no imageModel). So the remaining “404: File Not Found” is almost certainly llama.cpp rejecting something in the OpenAI SDK payload, not the URL/model selection.

OpenClaw → pi-ai uses the official openai Node client, and it sends a few OpenAI-ish fields by default that some llama.cpp builds don’t like:

  • stream_options: { include_usage: true }
  • max_completion_tokens (instead of max_tokens)
  • tool schema includes strict: false

1) Quick isolate with curl (matches what OpenClaw sends)

Try these exactly and see which one triggers 404:

A. streaming + stream_options

curl -N http://127.0.0.1:8003/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model":"Qwen3.5-27B-Q4_K_M.gguf",
    "messages":[{"role":"user","content":"hi"}],
    "stream":true,
    "stream_options":{"include_usage":true}
  }'

B. tool schema (strict field)

curl -N http://127.0.0.1:8003/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model":"Qwen3.5-27B-Q4_K_M.gguf",
    "messages":[{"role":"user","content":"hi"}],
    "stream":true,
    "tools":[{"type":"function","function":{"name":"ping","description":"x","parameters":{"type":"object","properties":{}},"strict":false}}]
  }'

2) Workaround in OpenClaw: add compat to make payload more “basic”

In your model definition, add:

compat: {
  supportsUsageInStreaming: false,  // removes stream_options
  maxTokensField: "max_tokens",     // avoids max_completion_tokens
  supportsStrictMode: false,        // removes strict:false from tool schema
  supportsStore: false              // avoids store=false
}
#

Example (just the model part):

models: {
  providers: {
    llamacpp: {
      baseUrl: "http://127.0.0.1:8003/v1",
      apiKey: "__REDACTED__",
      api: "openai-completions",
      models: [{
        id: "Qwen3.5-27B-Q4_K_M.gguf",
        name: "Qwen 3.5 27B (llama.cpp)",
        reasoning: false,
        input: ["text"],
        cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
        contextWindow: 50176,
        maxTokens: 32768,
        compat: {
          supportsUsageInStreaming: false,
          maxTokensField: "max_tokens",
          supportsStrictMode: false,
          supportsStore: false
        }
      }]
    }
  }
}

Then openclaw gateway restart and retry.
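
To see what a "more basic" payload looks like, here is a minimal Python sketch that applies the same four simplifications to a request body. It is not OpenClaw's implementation; the function name `simplify_payload` is hypothetical, and it only mirrors the fields named above:

```python
import copy


def simplify_payload(payload: dict) -> dict:
    """Return a copy of a chat-completions payload with the fields
    listed above removed or renamed. The input is left untouched."""
    out = copy.deepcopy(payload)
    out.pop("stream_options", None)   # drop streaming usage options
    out.pop("store", None)            # drop the store flag
    if "max_completion_tokens" in out:
        # rename to the older max_tokens field
        out["max_tokens"] = out.pop("max_completion_tokens")
    for tool in out.get("tools", []):
        # strict sits inside the "function" object of each tool
        tool.get("function", {}).pop("strict", None)
    return out


payload = {
    "model": "Qwen3.5-27B-Q4_K_M.gguf",
    "messages": [{"role": "user", "content": "hi"}],
    "stream": True,
    "stream_options": {"include_usage": True},
    "max_completion_tokens": 16,
    "store": False,
    "tools": [{"type": "function",
               "function": {"name": "ping", "description": "x",
                            "parameters": {"type": "object", "properties": {}},
                            "strict": False}}],
}
print(sorted(simplify_payload(payload)))
```

Posting the simplified body with curl, next to the original one, is a quick way to tell whether any of these fields is what the server objects to.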

3) If it still 404s: it’s probably tool-calling overall