#llm-task timeout

lyric apex
#

I am getting a timeout error in the logs

12:26:34
error
[tools] lobster failed: workflow command failed (1): sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt' raw_params={"action":"run","pipeline":"/home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster","timeoutMs":600000,"maxStdoutBytes":512000}
12:26:34
error
[tools] lobster failed: workflow command failed (1): sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt' raw_params={"action":"run","pipeline":"/home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster","timeoutMs":600000,"maxStdoutBytes":512000}

inner chasmBOT
#

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

cinder basaltBOT
# lyric apex I am getting a timeout error in the logs 12:26:34 error [tools] lobster failed:...

Rex Horizon, first thing: can you share openclaw --version? Version mismatches matter a lot for tool/plugin behavior.

From that log, the outer Lobster call already has timeoutMs:600000 (10 minutes), so there are two likely places to check:

  1. Look at the redirected stderr file:
sed -n '1,200p' /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt
  2. Tail the gateway logs around that timestamp:
openclaw logs --follow

If the actual error says lobster timed out, increase the Lobster tool timeoutMs or split the workflow. If the stderr points at llm-task, that tool has its own timeoutMs too, and the default shown in docs is much shorter (30s in config examples), so pass a larger timeoutMs in the openclaw.invoke --tool llm-task ... --args-json payload or set the plugin config.

Docs:

lyric apex
#

The code that is causing it is this, but as you can see it has a different and larger timeout https://cl1p.net/4560

cinder basaltBOT
# lyric apex The code that is causing it is this, but as you can see it has a different and l...

I can read that cl1p. Important bit: that wrapper script itself does not set a timeout on the HTTP call. It just POSTs whatever came in via stdin as args to /tools/invoke.

So the effective stack is:

  • Lobster outer workflow timeout: the raw_params.timeoutMs: 600000 in your log
  • llm-task timeout: args.timeoutMs if present, otherwise plugins.entries["llm-task"].config.timeoutMs, otherwise 30s
  • Provider/local server behavior underneath that
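
The llm-task precedence in that list can be sketched as a small function. This is only an illustration of the order described above: resolveTimeoutMs and DEFAULT_TIMEOUT_MS are hypothetical names, not OpenClaw's actual code, and the 30s fallback is assumed from the config examples.

```javascript
// Hypothetical illustration of the llm-task timeout precedence; not OpenClaw source.
const DEFAULT_TIMEOUT_MS = 30000; // assumed fallback (30s, per the config examples)

function resolveTimeoutMs(args = {}, pluginConfig = {}) {
  // The first positive finite number wins: call args, then plugin config.
  for (const candidate of [args.timeoutMs, pluginConfig.timeoutMs]) {
    if (Number.isFinite(candidate) && candidate > 0) return candidate;
  }
  return DEFAULT_TIMEOUT_MS;
}

console.log(resolveTimeoutMs({ timeoutMs: 600000 }, { timeoutMs: 120000 })); // 600000
console.log(resolveTimeoutMs({}, { timeoutMs: 120000 })); // 120000
console.log(resolveTimeoutMs({}, {})); // 30000
```

So a timeoutMs in the call args should win over anything in the plugin config, which is why checking what the wrapper actually sends matters more than checking openclaw.json.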

If your .lobster step is passing a larger timeoutMs inside the JSON sent to this script, then llm-task should receive it. I’d verify that quickly by adding one temporary stderr line before the fetch:

console.error("llm-task timeoutMs:", args.timeoutMs);

Then rerun and check debug-draft-email-stderr.txt.

Also, this script will only show the generic HTTP response from /tools/invoke; the real underlying error is likely in the gateway log as something like tools-invoke: tool execution failed: .... Please check:

lyric apex
#

Can you show me the code in relation to the fetch

cinder basaltBOT
# lyric apex Can you show me the code in relation to the fetch

Yep. In your script, this is the relevant section around fetch. I’d add the debug line right before the request:

console.error("llm-task args:", JSON.stringify({
  provider: args.provider,
  model: args.model,
  timeoutMs: args.timeoutMs,
  maxTokens: args.maxTokens,
  thinking: args.thinking
}, null, 2));

const res = await fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(payload)
});
const text = await res.text();

If you also want the wrapper’s own HTTP request to have a timeout, make it explicit like this:

const requestTimeoutMs = Number(process.env.INVOKE_HTTP_TIMEOUT_MS || args.timeoutMs || 600000);

const res = await fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(payload),
  signal: AbortSignal.timeout(requestTimeoutMs)
});
lyric apex
#

It added debug, it still failed

cinder basaltBOT
# lyric apex It added debug, it still failed

Good, that narrows it, but the debug output is the key part now. Please paste the relevant lines from:

sed -n '1,240p' /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt

and the gateway log line at the same timestamp:

openclaw logs --follow

What I’m looking for:

  • Did the debug print show timeoutMs inside args?
  • Did /tools/invoke return an HTTP error body?
  • Did the gateway log say tool execution failed, timed out, LLM returned invalid JSON, or a provider/LM Studio error?

Also still need openclaw --version if you haven’t shared it yet. Without the stderr/log text, “still failed” could be the wrapper timeout, the inner llm-task timeout, invalid JSON from the model, model/provider auth, or LM Studio hanging under the request.

lyric apex
#

I'm using 2025.5.12. I don't think the debug file was updated, it has an old timestamp

cinder basaltBOT
# lyric apex I'm using 2025.5.12. I don't think the debug file was updated, it has an old tim...

That old timestamp is useful: it means the code path you edited probably is not the one being executed, or the failing run is not reaching the script far enough to write stderr.

First verify the exact version string, because 2025.5.12 looks like it might be a typo for 2026.5.12:

openclaw --version

Then check whether the file in the log is the file you edited:

ls -l /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js
grep -n "llm-task args\|INVOKE_HTTP_TIMEOUT_MS\|fetch(url" /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js
ls -l /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt

I’d also make the debug impossible to miss by putting this as the very first line inside the process.stdin.on("end", async () => { ... }) handler, before JSON.parse:

console.error("INVOKE_LLM_TASK_MARKER", new Date().toISOString(), "bytes", s.length);

If that marker still does not appear and the stderr timestamp does not change, the workflow is not running that script/path. At that point search the .lobster file for the actual command being run:

grep -n "invoke-llm-task\|debug-draft-email-stderr\|llm-task" /home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster

One more thing: if the failure happens before the shell redirection runs, it should show in gateway logs instead of that debug file, so keep openclaw logs --follow open during the next run.

lyric apex
#

-rw-rw-r-- 1 admin admin 2098 May 15 12:42 /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js
29: console.error("llm-task args:", JSON.stringify({
37: const res = await fetch(url, {
-rw-rw-r-- 1 admin admin 194 May 15 12:26 /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt

cinder basaltBOT
# lyric apex -rw-rw-r-- 1 admin admin 2098 May 15 12:42 /home/admin/.openclaw/workspace-dba-s...

That confirms the debug line is in the script, but the stderr file was not touched. With 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt, the shell should update/truncate that file as soon as the command starts. So this specific command probably did not run after your edit.
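
The same comparison can be scripted instead of read off ls -l. A minimal sketch, assuming plain Node with no extra packages; stderrNewerThanScript is a hypothetical helper name, and the demo uses throwaway temp files with arbitrary timestamps rather than your real paths:

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// True if the stderr file was written at or after the script's last edit.
function stderrNewerThanScript(scriptPath, stderrPath) {
  return fs.statSync(stderrPath).mtimeMs >= fs.statSync(scriptPath).mtimeMs;
}

// Demo on temp files: the stderr file is 16 minutes older than the script.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "mtime-check-"));
const script = path.join(dir, "invoke-llm-task.js");
const stderrFile = path.join(dir, "debug-draft-email-stderr.txt");
fs.writeFileSync(script, "");
fs.writeFileSync(stderrFile, "");
const earlier = new Date("2000-01-01T12:26:00Z"); // arbitrary demo times
const later = new Date("2000-01-01T12:42:00Z");
fs.utimesSync(stderrFile, earlier, earlier);
fs.utimesSync(script, later, later);

console.log(stderrNewerThanScript(script, stderrFile)); // false: no run since the edit
```

Pointed at the real script and debug file, a false result means the redirect has not fired since the last edit.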

Next check the workflow path and whether a different/generated copy is being used:

grep -n "invoke-llm-task\|debug-draft-email-stderr\|llm-task" /home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster
find /home/admin/.openclaw -path '*invoke-llm-task.js' -ls
find /home/admin/.openclaw -path '*debug-draft-email-stderr.txt' -ls

Also re-run the workflow while tailing logs:

openclaw logs --follow

If the same error appears but the stderr file mtime still doesn’t move, then Lobster is failing before launching the node ... invoke-llm-task.js 2> ... command, or a different pipeline/workspace is actually being invoked.

lyric apex
#

16: command: sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt'
9056047 4 -rw-rw-r-- 1 admin admin 2098 May 15 12:42 /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js
8913678 4 -rw-rw-r-- 1 admin admin 194 May 15 12:26 /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt

cinder basaltBOT
# lyric apex 16: command: sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-sc...

That line matches, so now I’d test whether the shell redirection + script are actually runnable, without calling the gateway.

Run this exact probe. It feeds invalid JSON, so it should fail before fetch and only write to the debug file:

printf 'not json' | sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt'
echo exit:$?
cat /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt

Expected result: exit 1, and stderr should include invoke-llm-task.js failed before/during request plus a JSON parse error. If that works, the script and redirection are fine, and the workflow simply has not re-run that step since your edit.

If it doesn’t update the file, then we’re looking at a host/container/path mismatch or permissions/runtime issue.

Also please run these so the timestamps stop lying to us:

lyric apex
#

exit:1
invoke-llm-task.js failed before/during request
SyntaxError: Unexpected token 'o', "not json" is not valid JSON
at JSON.parse (<anonymous>)
at Socket.<anonymous> (/home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js:9:23)
at Socket.emit (node:events:531:35)
at endReadableNT (node:internal/streams/readable:1698:12)
at process.processTicksAndRejections (node:internal/process/task_queues:89:21)

cinder basaltBOT
# lyric apex exit:1 invoke-llm-task.js failed before/during request SyntaxError: Unexpected t...

Great, that proves the script path and 2> redirection are fine. So the earlier stale file was because the real workflow did not actually run that command after your edit, or it failed before reaching step 16.

Now run the actual Lobster workflow again and watch two things at once:

# terminal 1
openclaw logs --follow
# terminal 2, before and after triggering the workflow
stat /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt

Since your manual probe updated the file, the next real run should overwrite it immediately if step 16 starts. If it does not, the failure is upstream of this script/step.

If step 16 does start, the debug file should now show your llm-task args line. Then we can see whether timeoutMs is actually being passed into llm-task.

lyric apex
#

llm-task args: {
"provider": "lmstudio",
"model": "lmstudio/qwen/qwen3.6-35b-a3b",
"timeoutMs": 600000,
"maxTokens": 8000,
"thinking": "low"
}
Gateway HTTP request failed
URL: http://127.0.0.1:18789/tools/invoke
Status: 500 Internal Server Error
Raw response:
{"ok":false,"error":{"type":"tool_error","message":"tool execution failed"}}

cinder basaltBOT
# lyric apex llm-task args: { "provider": "lmstudio", "model": "lmstudio/qwen/qwen3.6-35b...

That confirms the timeout is being passed correctly: timeoutMs: 600000 is reaching llm-task. So this is no longer a missing-timeout problem; /tools/invoke is hiding the real exception behind the generic 500.

Two things jump out:

  1. Your model probably has the provider prefix duplicated. Since you’re already passing provider: "lmstudio", try:
{
  "provider": "lmstudio",
  "model": "qwen/qwen3.6-35b-a3b",
  "timeoutMs": 600000,
  "maxTokens": 8000,
  "thinking": "low"
}

not:

"model": "lmstudio/qwen/qwen3.6-35b-a3b"

The docs/examples use LM Studio as provider plus model id separately, e.g. /models add lmstudio qwen/qwen3.5-9b.

  2. Check the gateway log for the real reason:
openclaw logs --follow

You should see something near tools-invoke: tool execution failed: .... That line matters; the HTTP response intentionally only says tool execution failed.
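
On the duplicated-prefix point, the wrapper can normalize the model id rather than rely on the workflow input being right. A minimal sketch; stripProviderPrefix is a hypothetical helper, and whether the prefix really is the cause here is still unconfirmed:

```javascript
// Drop a leading "<provider>/" from the model id if it is present.
function stripProviderPrefix(provider, model) {
  const prefix = `${provider}/`;
  return model.startsWith(prefix) ? model.slice(prefix.length) : model;
}

console.log(stripProviderPrefix("lmstudio", "lmstudio/qwen/qwen3.6-35b-a3b")); // qwen/qwen3.6-35b-a3b
console.log(stripProviderPrefix("lmstudio", "qwen/qwen3.6-35b-a3b")); // qwen/qwen3.6-35b-a3b
```

An id without the prefix passes through unchanged, so it is safe to apply on every run.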

Quick isolation test:

openclaw infer model run --model lmstudio/qwen/qwen3.6-35b-a3b --prompt "Return {\"ok\":true} as JSON" --json

If that fails too, it’s LM Studio/model config. If that works, try removing thinking: "low" from the llm-task args next, because local/OpenAI-compatible backends often don’t support reasoning controls cleanly.

Relevant local-model runbook: https://docs.openclaw.ai/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail

lyric apex
#

13:24:50
warn
tools-invoke
tool execution failed: Error: LLM returned invalid JSON
13:24:50
error
[tools] lobster failed: workflow command failed (1): sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt' raw_params={"action":"run","pipeline":"/home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster","timeoutMs":600000,"maxStdoutBytes":512000}
13:24:50
error
[tools] lobster failed: workflow command failed (1): sh -c 'node /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js 2> /home/admin/.openclaw/workspace-dba-scout/debug-draft-email-stderr.txt' raw_params={"action":"run","pipeline":"/home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster","timeoutMs":600000,"maxStdoutBytes":512000}

cinder basaltBOT
# lyric apex 13:24:50 warn tools-invoke tool execution failed: Error: LLM returned invalid JS...

That’s the real error: this is not a timeout anymore. llm-task got a model response, but the response was not parseable as plain JSON.

llm-task is strict here: it strips simple ```json fences, then does JSON.parse(raw). If the model emits anything like <think> tags, prose, markdown, or Here is the JSON:, it fails with LLM returned invalid JSON.
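
To see why that strictness bites, here is a rough sketch of fence-strip-then-parse behavior. It is an assumption-laden illustration, not llm-task's real implementation: parseStrictJson and the regex are made up for the example.

```javascript
// Illustrative only: approximates "strip a simple json fence, then JSON.parse".
const FENCE = "`".repeat(3); // built at runtime so the example holds no literal fence
const FENCED = new RegExp("^" + FENCE + "(?:json)?\\s*\\n([\\s\\S]*?)\\n?" + FENCE + "$");

function parseStrictJson(raw) {
  const trimmed = raw.trim();
  const match = trimmed.match(FENCED);
  const candidate = match ? match[1] : trimmed;
  try {
    return { ok: true, value: JSON.parse(candidate) };
  } catch (err) {
    return { ok: false, error: "LLM returned invalid JSON" };
  }
}

const fenced = FENCE + "json\n{\"ok\":true}\n" + FENCE;
console.log(parseStrictJson(fenced).ok); // true
console.log(parseStrictJson("Here is the JSON: {\"ok\":true}").ok); // false
console.log(parseStrictJson("<think>hmm</think>{\"ok\":true}").ok); // false
console.log(parseStrictJson("{\"subject\":\"Hello\",\"body\":\"Dear").ok); // false (cut off)
```

The last case is the one to keep in mind: output that is cut off mid-object fails in exactly the same way as prose or stray tags.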

For Qwen/LM Studio, I’d try these in order:

  1. Remove thinking: "low" from the llm-task args.
  2. Add temperature: 0.
  3. Put /no_think at the very start of the prompt if that Qwen build supports it.
  4. Make sure you pass a schema with additionalProperties: false where possible.
  5. Use the model id without duplicating the provider: provider: "lmstudio", model: "qwen/qwen3.6-35b-a3b".

Example args shape:

{
  "provider": "lmstudio",
  "model": "qwen/qwen3.6-35b-a3b",
  "timeoutMs": 600000,
  "maxTokens": 8000,
  "temperature": 0,
  "prompt": "/no_think\nReturn only a single valid JSON object. No markdown, no prose, no <think> tags."
}

If it still fails, the local model is probably not reliable enough for strict JSON mode through llm-task. Quick sanity check:

openclaw infer model run --model lmstudio/qwen/qwen3.6-35b-a3b --prompt '/no_think Return exactly {"ok":true} and nothing else.' --json
lyric apex
#

ah, I was watching lmstudio, it stopped at 8000 tokens

cinder basaltBOT
# lyric apex ah, I was watching lmstudio, it stopped at 8000 tokens

Yep, that explains it. If LM Studio stopped at exactly 8000, the model hit your maxTokens output cap and the JSON was probably truncated before it closed. llm-task then sees partial/non-JSON text and throws LLM returned invalid JSON.

I’d fix it from the output-shaping side first, not by just raising the cap:

  • remove thinking: "low"
  • add temperature: 0
  • start the prompt with /no_think if that Qwen model honors it
  • make the requested JSON much smaller, especially for draft/email fields
  • pass a tight schema so the model has fewer degrees of freedom

Then set maxTokens to something realistic for the expected JSON, e.g. 1200-2500 for a short draft. If it still tries to spend thousands of tokens, that model is not behaving well for strict JSON automation and I’d use a more JSON-reliable model for just this llm-task step.

Raising maxTokens can work as a temporary test, but it mostly hides the symptom and makes the workflow slower/more expensive.

lyric apex
#

I can't see anything in openclaw.json that corresponds to 8000, apart from this

  "contextPruning": {
    "mode": "cache-ttl",
    "ttl": "10m",
    "keepLastAssistants": 4,
    "softTrimRatio": 0.5,
    "hardClearRatio": 0.75,
    "minPrunableToolChars": 8000,
    "softTrim": {
      "maxChars": 4000,
      "headChars": 1500,
      "tailChars": 1500
    },
    "hardClear": {
      "enabled": true,
      "placeholder": "[Old tool result content cleared]"
    }
  },
cinder basaltBOT
# lyric apex I can't see anything in openclaw.json that corresponds to 8000, apart from this...

That contextPruning.minPrunableToolChars: 8000 is unrelated. It is a character threshold for trimming old tool results from agent context, not the model output cap.

The 8000 we saw came from your own debug line:

"maxTokens": 8000

That means it is being passed in the JSON args to invoke-llm-task.js, most likely from the .lobster workflow or whatever generated the workflow input. Search the workspace, not just openclaw.json:

grep -RIn 'maxTokens\|8000' /home/admin/.openclaw/workspace-dba-scout

Likely places:

  • /home/admin/.openclaw/workspace-dba-scout/sql-leads-draft.lobster
  • a JSON/heredoc block piped into invoke-llm-task.js
  • a script that generates the .lobster pipeline

For this specific failure, maxTokens: 8000 is not necessarily bad by itself; it just tells us LM Studio stopped exactly at the requested output limit. The fix is to make the model produce shorter valid JSON, or temporarily raise/remove maxTokens to confirm truncation.

I’d search for it, then try lowering the requested output size in the prompt/schema and removing thinking first.

lyric apex
#

I can't find it, it must be a default, how do I override it and make it larger?

cinder basaltBOT
# lyric apex I can't find it, it must be a default, how do I override it and make it larger?

Since your debug output showed "maxTokens": 8000, the strongest override is at the llm-task call args. llm-task resolves it like this:

args.maxTokens -> plugins.entries["llm-task"].config.maxTokens -> provider/default behavior

So if args.maxTokens is already being passed as 8000, changing openclaw.json may not win. Override it right before building payload in invoke-llm-task.js:

args.maxTokens = Number(process.env.LLM_TASK_MAX_TOKENS || 16000);
args.timeoutMs = Number(process.env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);

Put that after your provider/model defaults and before:

const payload = ...

Or set the defaults in the plugin config in openclaw.json:

{
  "plugins": {
    "entries": {
      "llm-task": {
        "enabled": true,
        "config": {
          "maxTokens": 16000,
          "timeoutMs": 600000
        }
      }
    }
  }
}

But again: config only helps when the request args don’t already contain maxTokens. In your case I’d patch the wrapper/script so the run is deterministic.

One warning: if the model needs more than 8000 output tokens to produce one JSON object, it may be dumping reasoning or runaway text. Larger cap can confirm truncation, but I’d still remove thinking / add /no_think so it returns compact JSON.

lyric apex
#

llm-task args: {
"provider": "lmstudio",
"model": "lmstudio/qwen/qwen3.6-35b-a3b",
"timeoutMs": 600000,
"maxTokens": 8000,
"thinking": "low"
}
Gateway HTTP request failed
URL: http://127.0.0.1:18789/tools/invoke
Status: 500 Internal Server Error
Raw response:
{"ok":false,"error":{"type":"tool_error","message":"tool execution failed"}}

cinder basaltBOT
# lyric apex llm-task args: { "provider": "lmstudio", "model": "lmstudio/qwen/qwen3.6-35b...

That output is unchanged, so the override did not take effect. The wrapper is still sending:

"maxTokens": 8000,
"thinking": "low",
"model": "lmstudio/qwen/qwen3.6-35b-a3b"

Put the override before your debug console.error(...), otherwise the debug will keep showing the old args:

args.provider = args.provider || "lmstudio";
args.model = "qwen/qwen3.6-35b-a3b"; // no lmstudio/ prefix here
args.maxTokens = Number(process.env.LLM_TASK_MAX_TOKENS || 16000);
args.timeoutMs = Number(process.env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);
delete args.thinking;
lyric apex
#

This is what I have

cinder basaltBOT
# lyric apex This is what I have

I don’t see the code in the message, just “This is what I have.” Can you paste the numbered output from this?

nl -ba /home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js | sed -n '1,80p'

The line order matters here. The override has to happen after JSON.parse(s) and before both the debug console.error(...) and const payload = ....

lyric apex
#

This is what it looks like https://cl1p.net/4560

cinder basaltBOT
# lyric apex This is what it looks like https://cl1p.net/4560

I can see the issue in that version.

This line does not fix the duplicated provider prefix if the incoming args already contain a model:

args.model = args.model || "lmstudio/qwen/qwen3.6-35b-a3b";

Because your incoming args already have lmstudio/qwen/..., the || keeps it. Change that block to force the values you want:

args.provider = "lmstudio";
args.model = "qwen/qwen3.6-35b-a3b"; // no lmstudio/ prefix here
args.maxTokens = Number(process.env.LLM_TASK_MAX_TOKENS || 16000);
args.timeoutMs = Number(process.env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);
delete args.thinking;
args.temperature = 0;
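
The difference between a || default and a forced assignment is easy to check in isolation. A small sketch using the same values as your debug output (the variable names are only for the demo):

```javascript
// Incoming args as the workflow sends them, prefix included.
const incoming = { provider: "lmstudio", model: "lmstudio/qwen/qwen3.6-35b-a3b" };

// "||" only fills in a value when the field is missing or falsy.
const withDefault = { ...incoming };
withDefault.model = withDefault.model || "qwen/qwen3.6-35b-a3b";

// A plain assignment always wins, whatever came in.
const forced = { ...incoming };
forced.model = "qwen/qwen3.6-35b-a3b";

console.log(withDefault.model); // lmstudio/qwen/qwen3.6-35b-a3b (unchanged)
console.log(forced.model); // qwen/qwen3.6-35b-a3b
```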

Also, with the code you pasted, your debug output should show maxTokens: 16000. If it still shows 8000, then the current workflow run is not using this edited file, or you’re looking at output from an older run. The script you pasted cannot print maxTokens: 8000 unless LLM_TASK_MAX_TOKENS=8000 is set in the environment.

Check that quickly with:

env | grep LLM_TASK_MAX_TOKENS

Then rerun and confirm the debug line changes before chasing the model again.

lyric apex
#

I changed it to 16000, and now I get a different error

llm-task args: {
"provider": "lmstudio",
"model": "lmstudio/qwen/qwen3.6-35b-a3b",
"timeoutMs": 600000,
"maxTokens": 16000,
"thinking": "low"
}
invoke-llm-task.js failed before/during request
TypeError: fetch failed
at node:internal/deps/undici/undici:14976:13
at async Socket.<anonymous> (/home/admin/.openclaw/workspace-dba-scout/lobster-scripts/invoke-llm-task.js:39:17)

cinder basaltBOT
# lyric apex I changed it to 16000, and now I get a different error llm-task args: { "pro...

fetch failed is different: your wrapper did not get an HTTP response from the gateway at all. That usually means the gateway connection was closed/reset, the gateway restarted/crashed, or 127.0.0.1:18789 was temporarily unreachable during the long call.

First, your args still show two things I’d fix before retesting:

"model": "lmstudio/qwen/qwen3.6-35b-a3b",
"thinking": "low"

Force these in the wrapper, not with ||:

args.provider = "lmstudio";
args.model = "qwen/qwen3.6-35b-a3b";
delete args.thinking;
args.temperature = 0;
lyric apex
#

What is the fetch timeout? I could see the invoke query running in lmstudio, but I could also see a second parallel one, which I guess was probing the fetch?

cinder basaltBOT
# lyric apex What is the fetch timeout, because I could see the invoke query running in lmstu...

In the wrapper code you pasted, the fetch has no explicit timeout:

const res = await fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(payload)
});

So the timeout you are controlling is mainly the inner llm-task timeout (args.timeoutMs: 600000). Node/undici can still fail the HTTP transport underneath with TypeError: fetch failed if the socket is closed/reset, the gateway restarts, or an internal connection/header timeout is hit. Logging err.cause in the wrapper's catch block will usually tell us which one.

If you want to make the wrapper-side HTTP wait explicit, add:

const requestTimeoutMs = Number(process.env.INVOKE_HTTP_TIMEOUT_MS || 900000);

const res = await fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(payload),
  signal: AbortSignal.timeout(requestTimeoutMs)
});

But the second LM Studio request you saw is probably not a “fetch probe.” fetch only calls OpenClaw’s /tools/invoke; it does not probe LM Studio directly. If LM Studio shows two model requests, that second one is coming from OpenClaw/model runtime behavior, a retry/fallback, another agent/session, or a separate health/test command.

To identify it, compare timestamps in:

openclaw logs --follow
lyric apex
#

Thinking is off for the model by default

cinder basaltBOT
# lyric apex Thinking is off for the model by default

Maybe at the LM Studio/model default level, but your llm-task request is explicitly sending:

"thinking": "low"

So OpenClaw will pass a thinking/reasoning setting into the embedded model run if that provider path accepts it. For this JSON-only workflow I’d still remove it from the request entirely so there’s no ambiguity:

delete args.thinking;

Then confirm the debug output no longer includes thinking. If it still appears, the wrapper edit is not taking effect or the workflow is injecting it after your edit.

Separately, the current hard error is fetch failed, so the most useful next data is still the err.cause from Node and the gateway log at the same timestamp. That will tell us whether this is a socket reset/gateway restart/timeout rather than an LLM JSON issue.
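
For the err.cause part, something along these lines in the wrapper's catch block would surface the underlying reason. A minimal sketch: describeErrorChain is a hypothetical helper, and the error below is simulated for the demo, not taken from your logs.

```javascript
// Walk the cause chain of an error and render each level on one line.
function describeErrorChain(err) {
  const parts = [];
  for (let e = err, depth = 0; e && depth < 5; e = e.cause, depth++) {
    const code = e.code ? ` [${e.code}]` : "";
    parts.push(`${e.name}: ${e.message}${code}`);
  }
  return parts.join(" <- caused by: ");
}

// Simulated failure shaped like what Node's fetch throws on a headers timeout.
const simulated = new TypeError("fetch failed", {
  cause: Object.assign(new Error("Headers Timeout Error"), { code: "UND_ERR_HEADERS_TIMEOUT" })
});

console.error(describeErrorChain(simulated));
```

In the real script that would be console.error(describeErrorChain(err)) inside the existing catch. A code such as ECONNRESET or ECONNREFUSED would point at the gateway connection, while a headers-timeout code would point at the HTTP wait itself.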

lyric apex
#

is this relevant

14:22:23
warn
diagnostic
stalled session: sessionId=llm-task-1778851024950 sessionKey=agent:main:explicit:llm-task-1778851024950 state=processing age=317s queueDepth=0 reason=active_work_without_progress classification=stalled_agent_run activeWorkKind=model_call lastProgress=model_call:started lastProgressAge=317s recovery=none
14:22:53
warn
diagnostic
stalled session: sessionId=llm-task-1778851024950 sessionKey=agent:main:explicit:llm-task-1778851024950 state=processing age=347s queueDepth=0 reason=active_work_without_progress classification=stalled_agent_run activeWorkKind=model_call lastProgress=model_call:started lastProgressAge=347s recovery=none

cinder basaltBOT
# lyric apex is this relevant 14:22:23 warn diagnostic stalled session: sessionId=llm-task-1...

Yes, very relevant. That says the embedded llm-task agent is stuck inside the model call, with no progress for 317-347s. So the gateway did launch the task and is waiting on LM Studio/model generation; it is not stuck in Lobster or the wrapper.

Interpretation:

  • activeWorkKind=model_call = waiting on the model backend
  • lastProgress=model_call:started = request started, no streamed progress observed since
  • recovery=none = diagnostic warning only, not auto-killing it
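
If you want to pull those fields out of the log programmatically, the key=value pairs are simple to parse. A minimal sketch run against the line you pasted; parseDiagnosticFields is a hypothetical helper and the format is assumed only from this one sample:

```javascript
// Collect every key=value token from a diagnostic log line.
function parseDiagnosticFields(line) {
  const fields = {};
  for (const [, key, value] of line.matchAll(/(\w+)=(\S+)/g)) fields[key] = value;
  return fields;
}

const line = "stalled session: sessionId=llm-task-1778851024950 " +
  "sessionKey=agent:main:explicit:llm-task-1778851024950 state=processing age=317s " +
  "queueDepth=0 reason=active_work_without_progress classification=stalled_agent_run " +
  "activeWorkKind=model_call lastProgress=model_call:started lastProgressAge=317s recovery=none";

const fields = parseDiagnosticFields(line);
const stalledOnModel = fields.activeWorkKind === "model_call" && fields.recovery === "none";

console.log(fields.activeWorkKind, parseInt(fields.lastProgressAge, 10), stalledOnModel);
// prints: model_call 317 true
```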

So the real bottleneck is the LM Studio request. The second request you saw may be another run/old run still active, which can make this worse if the backend handles requests in parallel poorly.

I’d stop all current runs/restart the gateway or LM Studio to clear stale model calls, then retry with a smaller controlled request:

args.provider = "lmstudio";
args.model = "qwen/qwen3.6-35b-a3b";
delete args.thinking;
args.temperature = 0;
args.maxTokens = 2000;
args.timeoutMs = 600000;

If that works, increase maxTokens gradually. If it stalls even at 2000, it’s an LM Studio/model/backend issue, not a Lobster timeout.

Also check whether LM Studio is streaming tokens for that request. If LM Studio is generating but OpenClaw sees no progress, that points at the local OpenAI-compatible transport/streaming behavior. The local-model troubleshooting doc has the right probes: https://docs.openclaw.ai/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail

lyric apex
#

So although I have not specified a fetch timeout, is there a fetch timeout value somewhere that I can specify and increase?

cinder basaltBOT
# lyric apex so although I have not specified a fetch timeout, is there a fetch timeout value ...

Yes. There are two different timeout layers here:

  1. args.timeoutMs controls the inner llm-task model run.
  2. Your wrapper’s fetch() waits for /tools/invoke to return HTTP response headers.

Because /tools/invoke does not send a response until the tool finishes, Node’s fetch/undici can hit its own headers timeout on long calls. Your stalled-session log at ~317s lines up suspiciously well with a ~300s HTTP headers timeout.

AbortSignal.timeout(...) is useful, but by itself it may not override undici’s headers timeout. For a long blocking /tools/invoke call, use an undici dispatcher with larger headersTimeout/bodyTimeout:

import { Agent } from "undici";

const requestTimeoutMs = Number(process.env.INVOKE_HTTP_TIMEOUT_MS || 900000);
const dispatcher = new Agent({
  headersTimeout: requestTimeoutMs,
  bodyTimeout: requestTimeoutMs
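});

// Assumes the undici package is installed; Node's fetch accepts a dispatcher option.
const res = await fetch(url, {
  method: "POST",
  headers,
  body: JSON.stringify(payload),
  dispatcher
});
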
lyric apex
#

so what would be the script in full

cinder basaltBOT
# lyric apex so what would be the script in full

Here’s a full drop-in replacement that avoids fetch() entirely and uses http.request() with an explicit request timeout. That sidesteps the undici/fetch headers-timeout problem on long /tools/invoke calls.

const http = require("node:http");
const https = require("node:https");

let s = "";

function postJson(urlString, payload, headers, timeoutMs) {
  return new Promise((resolve, reject) => {
    const url = new URL(urlString);
    const body = JSON.stringify(payload);
    const client = url.protocol === "https:" ? https : http;

    const req = client.request({
      protocol: url.protocol,
      hostname: url.hostname,
      port: url.port,
      path: `${url.pathname}${url.search}`,
      method: "POST",
      headers: {
        ...headers,
        "content-length": Buffer.byteLength(body)
      }
    }, (res) => {
      let text = "";
      res.setEncoding("utf8");
      res.on("data", chunk => { text += chunk; });
      res.on("end", () => {
        resolve({
          ok: res.statusCode >= 200 && res.statusCode < 300,
          status: res.statusCode,
          statusText: res.statusMessage || "",
          text
        });
      });
    });

    req.setTimeout(timeoutMs, () => {
      req.destroy(new Error(`HTTP request timed out after ${timeoutMs}ms`));
    });

    req.on("error", reject);
    req.write(body);
    req.end();
  });
}

process.stdin.on("data", c => {
  s += c;
});

process.stdin.on("end", async () => {
  try {
    const args = JSON.parse(s);

    args.provider = process.env.LLM_TASK_PROVIDER || "lmstudio";
    args.model = process.env.LLM_TASK_MODEL || args.model || "qwen/qwen3.6-35b-a3b";
    if (args.provider === "lmstudio" && args.model.startsWith("lmstudio/")) {
      args.model = args.model.slice("lmstudio/".length);
    }

    args.maxTokens = Number(process.env.LLM_TASK_MAX_TOKENS || 16000);
    args.timeoutMs = Number(process.env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);
    args.temperature = Number(process.env.LLM_TASK_TEMPERATURE || 0);

    // JSON-only local model runs are usually more reliable without explicit thinking controls.
    delete args.thinking;

    const requestTimeoutMs = Number(process.env.INVOKE_HTTP_TIMEOUT_MS || args.timeoutMs + 300000);

    const port = process.env.OPENCLAW_GATEWAY_PORT || "18789";
    const baseUrl = process.env.OPENCLAW_GATEWAY_URL || `http://127.0.0.1:${port}`;
    const token = process.env.OPENCLAW_GATEWAY_TOKEN || process.env.SHARED_GATEWAY_TOKEN || "";
    const url = `${baseUrl.replace(/\/$/, "")}/tools/invoke`;
    const headers = { "content-type": "application/json" };
    if (token) headers.authorization = `Bearer ${token}`;

    const payload = {
      tool: "llm-task",
      action: "json",
      sessionKey: process.env.OPENCLAW_SESSION_KEY || "agent:sql:main",
      args
    };

    console.error("llm-task args:", JSON.stringify({
      provider: args.provider,
      model: args.model,
      timeoutMs: args.timeoutMs,
      maxTokens: args.maxTokens,
      temperature: args.temperature,
      requestTimeoutMs
    }, null, 2));

    const res = await postJson(url, payload, headers, requestTimeoutMs);
    const text = res.text;

    if (!res.ok) {
      console.error("Gateway HTTP request failed");
      console.error("URL:", url);
      console.error("Status:", res.status, res.statusText);
      console.error("Raw response:");
      console.error(text);
      process.exit(1);
    }

    let parsed;
    try {
      parsed = JSON.parse(text);
    } catch (err) {
      console.error("Gateway returned non-JSON response");
      console.error("Raw response:");
      console.error(text);
      process.exit(1);
    }

    if (parsed && parsed.ok === false) {
      console.error("llm-task tool returned ok=false");
      console.error(JSON.stringify(parsed, null, 2));
      process.exit(1);
    }

    process.stdout.write(text);
  } catch (err) {
    console.error("invoke-llm-task.js failed before/during request");
    console.error(err && err.stack ? err.stack : String(err));
    if (err && err.cause) console.error("cause:", err.cause);
    process.exit(1);
  }
});

Then rerun. The debug output should show model: "qwen/qwen3.6-35b-a3b", no thinking, maxTokens: 16000, and requestTimeoutMs around 900000 unless you override it.
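If you want to sanity-check the numbers before a long run, here is a minimal sketch of how the wrapper derives the two timeouts. `resolveTimeouts` is a hypothetical helper, not part of OpenClaw; it only mirrors the two `Number(...)` lines in the script above, with the environment passed in as a plain object.

```javascript
// Hypothetical helper (not an OpenClaw API): mirrors the wrapper's timeout logic
// as a pure function so the fallback order is easy to check.
function resolveTimeouts(env, args) {
  // Tool-level timeout: env override, then incoming args, then 10 minutes.
  const timeoutMs = Number(env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);
  // HTTP-level timeout: env override, else the tool timeout plus a 5 minute margin.
  const requestTimeoutMs = Number(env.INVOKE_HTTP_TIMEOUT_MS || timeoutMs + 300000);
  return { timeoutMs, requestTimeoutMs };
}

// No overrides: the defaults from the script apply.
console.log(resolveTimeouts({}, {}));
// Env override wins over the value already present in args.
console.log(resolveTimeouts({ LLM_TASK_TIMEOUT_MS: "120000" }, { timeoutMs: 30000 }));
```

The point of the margin is that the HTTP request should outlive the tool call, so the gateway gets a chance to report its own timeout instead of the socket being closed first.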

lyric apex
#

Failed with this 17:55:24
error
diagnostic
lane task error: lane=session:agent:main:explicit:llm-task-1778864124361 durationMs=628 error="FailoverError: Unknown model: qwen/qwen3.6-35b-a3b"

cinder basaltBOT
# lyric apex Failed with this 17:55:24 error diagnostic lane task error: lane=session:agent:m...

Good catch. That means my earlier “strip lmstudio/ from the model id” advice does not match how your model is registered locally. In your config/catalog, OpenClaw knows the model as:

lmstudio/qwen/qwen3.6-35b-a3b

so keep that exact model string.

Change the wrapper to this instead:

args.provider = process.env.LLM_TASK_PROVIDER || "lmstudio";
args.model = process.env.LLM_TASK_MODEL || "lmstudio/qwen/qwen3.6-35b-a3b";
args.maxTokens = Number(process.env.LLM_TASK_MAX_TOKENS || 16000);
args.timeoutMs = Number(process.env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);
args.temperature = Number(process.env.LLM_TASK_TEMPERATURE || 0);
lyric apex
#

ok, so worked, no fetch timeout this time

cinder basaltBOT
# lyric apex ok, so worked, no fetch timeout this time

Nice, that confirms the blocking HTTP fetch/headers-timeout path was the problem, and the explicit request timeout wrapper fixed it.

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow indexes the thread for the next person who hits long-running llm-task / /tools/invoke calls.

lyric apex
#

where are LLM_TASK_MAX_TOKENS and LLM_TASK_TIMEOUT_MS usually configured

cinder basaltBOT
# lyric apex where are LLM_TASK_MAX_TOKENS and LLM_TASK_TIMEOUT_MS usually configured

LLM_TASK_MAX_TOKENS and LLM_TASK_TIMEOUT_MS are not built-in OpenClaw config keys. Those are wrapper-specific environment variables from the script we added, so they only exist if you choose to use them.

Usual places to configure this, in order of clarity:

  1. Directly in the llm-task args from the .lobster workflow:
{
  "maxTokens": 16000,
  "timeoutMs": 600000
}
  2. In the llm-task plugin config in openclaw.json, as defaults when args don’t override them:
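A rough sketch of what that fragment could look like, assuming the defaults live under `plugins.entries["llm-task"].config`. The values are only illustrative, and any other keys your openclaw.json needs are left out. It is written as a small Node snippet so you can print the JSON and merge it by hand.

```javascript
// Assumed shape: llm-task defaults under plugins.entries["llm-task"].config.
// Values are illustrative; other keys in a real openclaw.json are omitted here.
const pluginDefaults = {
  plugins: {
    entries: {
      "llm-task": {
        config: { maxTokens: 16000, timeoutMs: 600000 }
      }
    }
  }
};

// Print the fragment so it can be merged into openclaw.json manually.
console.log(JSON.stringify(pluginDefaults, null, 2));
```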
lyric apex
#

actually putting it in the plugins section has no effect

cinder basaltBOT
# lyric apex actually putting it in the plugins section has no effect

That makes sense in your case. Plugin config is only a default; it loses whenever the request args already include the same field.

In llm-task, the effective order is:

args.maxTokens -> plugins.entries["llm-task"].config.maxTokens -> undefined/provider default
args.timeoutMs  -> plugins.entries["llm-task"].config.timeoutMs  -> 30000

Your debug output showed the wrapper/request is already sending maxTokens, so the plugin config will have no effect unless you remove args.maxTokens before calling /tools/invoke.
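To make that order concrete, here is a small sketch of the lookup. `effectiveSetting` is a hypothetical function for illustration, not the plugin's actual source, and it assumes a field counts as "set" whenever it is not `undefined`.

```javascript
// Illustration only: args first, then plugin config, then the built-in fallback.
// Assumes "set" means "not undefined"; the real plugin may check differently.
function effectiveSetting(key, args, pluginConfig, fallback) {
  if (args[key] !== undefined) return args[key];
  if (pluginConfig[key] !== undefined) return pluginConfig[key];
  return fallback;
}

const pluginConfig = { maxTokens: 8000, timeoutMs: 120000 };

// The wrapper sends maxTokens, so the plugin value is never consulted.
console.log(effectiveSetting("maxTokens", { maxTokens: 16000 }, pluginConfig, undefined));
// With the field removed from args, the plugin config takes over.
console.log(effectiveSetting("maxTokens", {}, pluginConfig, undefined));
// With neither set, timeoutMs falls back to 30000.
console.log(effectiveSetting("timeoutMs", {}, {}, 30000));
```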

Use one of these approaches:

A. Keep it controlled in the wrapper

args.maxTokens = Number(process.env.LLM_TASK_MAX_TOKENS || 16000);
args.timeoutMs = Number(process.env.LLM_TASK_TIMEOUT_MS || args.timeoutMs || 600000);

B. Let plugin config control it

delete args.maxTokens;
delete args.timeoutMs;

Then restart the gateway after changing openclaw.json, because plugin config is read by the gateway/plugin runtime:

openclaw gateway restart
cinder basaltBOT