#Cant get it to work with codex

tawny stump
#

After Claude kicked us out, I switched to Codex and troubleshot for 4h, and it still says "LLM request failed: DNS lookup for the provider endpoint failed.

LLM request failed: DNS lookup for the provider endpoint failed.

āš ļø Cron job "Evening Sweep" failed: LLM request failed: DNS lookup for the provider endpoint failed."

tulip oasisBOT
#

Oof, that one is usually not a Codex/model problem. It means the gateway host itself failed DNS resolution for the provider hostname, so OpenClaw never reached the LLM endpoint.

Please paste these, all from the same machine where openclaw gateway is running:

openclaw --version
openclaw status --all
openclaw models status --probe

Then reproduce it once and grab the log lines around the failure:

openclaw logs --follow

And also check DNS on that host:

getent hosts api.openai.com || nslookup api.openai.com
getent hosts chatgpt.com || nslookup chatgpt.com

The usual culprits are:

  1. Broken DNS on the gateway host / VPS / container
  2. Bad custom baseUrl for the provider
  3. Proxy / VPN / firewall / egress blocking name resolution

openclaw models status --probe is especially useful here because it also shows the configured provider endpoint when relevant.

Paste those outputs in one message and I’ll help you narrow it down fast.

tawny stump
#

Here’s a condensed version for the helper agent:

Host DNS does not look like the main issue.

Verified on the live Mac:

openclaw --version → OpenClaw 2026.4.14 (323493f)
openclaw status --all → gateway reachable, Telegram healthy, WhatsApp disabled
nslookup api.openai.com and nslookup chatgpt.com both resolve normally
openclaw models status --probe --json shows short live probes succeeding for:
openai-codex/gpt-5.2
google/gemini-2.0-flash
But real failing runs still show:

LLM request failed: DNS lookup for the provider endpoint failed
with rawError=<html> ...
and nearby runs also show ⚠️ API rate limit reached
Conclusion:

Pure local DNS failure is unlikely
OpenClaw is probably misclassifying intermittent openai-codex upstream edge/challenge/rate-limit responses as DNS failures
Telegram is not the root problem
Gateway health is not the root problem

tulip oasisBOT
#

Yeah, with that data I would stop blaming host DNS.

Short probes succeeding + nslookup working + failures carrying rawError=<html>... + nearby API rate limit reached all point much more toward an upstream HTML/challenge/rate-limit response being rewritten badly.

I checked the source, and the current sanitizer really does have a broad DNS fallback, so your misclassification theory is plausible.
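
To make the distinction concrete, here is a minimal sketch of how you could triage a single log line yourself. It is not OpenClaw's sanitizer, just an illustration; the function name classify_llm_error and the labels it prints are made up for this example, and it only assumes the log shapes quoted in this thread (an HTML body after the DNS message, or "API rate limit reached").

```sh
#!/bin/sh
# Hypothetical helper, not part of OpenClaw: triage one error line.
# An HTML body is checked first because the thread shows the "DNS lookup"
# label appearing on lines that actually carry an HTML page.
classify_llm_error() {
  case "$1" in
    *"<html"*)
      echo "html-body" ;;      # upstream returned a web page, not a DNS error
    *"API rate limit reached"*)
      echo "rate-limit" ;;
    *"DNS lookup for the provider endpoint failed"*)
      echo "dns" ;;            # DNS label with no HTML evidence attached
    *)
      echo "other" ;;
  esac
}
```

If a line comes back as html-body, the DNS wording is most likely just the label, and checking the host resolver again will not tell you anything new.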

What I’d do next:

  1. Treat this as a bug
    File it here: https://github.com/openclaw/openclaw/issues

    Include:

    • openclaw --version (2026.4.14)
    • one successful models status --probe --json
    • one failed real run log excerpt
    • the first redacted part of rawError=<html>...
    • whether the failing model is openai-codex/*
    • whether transport is auto, sse, or websocket

  2. Use a workaround for now
    If the cron is important, point it at a provider/model path that is already probing cleanly for you, instead of openai-codex/*, until this is fixed.

  3. Try forcing transport
    Docs support per-model transport override for openai-codex/*, so try sse first if you’re on auto:

    {
      agents: {
        defaults: {
          models: {
            "openai-codex/<your-model>": {
              params: { transport: "sse" }
            }
          }
        }
      }
    }

tawny stump
#

I filed the bug report here:
https://github.com/openclaw/openclaw/issues/67712

I also applied the live workaround on your Mac:

forced openai-codex/gpt-5.2 to use transport: "sse"
restarted the gateway
Current live model config now shows:

{
"openai-codex/gpt-5.2": {
"params": {
"transport": "sse"
}
}
}
One note: while testing, a stale session lock cropped up again on the active Telegram session, so I didn’t treat one more immediate live run as trustworthy signal yet. The config change itself is in place and the gateway has been restarted cleanly.

Next best move is simple:

send one fresh Telegram test message
if it still fails, we now have:
the filed bug
the sse workaround already applied
stronger evidence that the remaining problem is not plain host DNS
Other files present: untracked Peekaboo/, untouched.

tulip oasisBOT
#

Nice, that’s the right move.

At this point I’d treat the old failures as noisy and do one fresh Telegram test from a fresh session. If the stale lock shows up again, use /new first so you get a clean signal.

Then:

The important part is you’ve already narrowed it down well:

  • bug filed
  • SSE workaround applied
  • gateway restarted
  • “plain host DNS is broken” looks very unlikely now

If you paste the next fresh failure snippet here, I can help read whether it’s still the same HTML/misclassification bug or the stale-session path instead.

tawny stump
#

What the newest logs show:

Telegram is still healthy and polling.
There are no active session lock files now, so this was not the stale-lock problem this time.
A fresh run started and repeatedly hit openai-codex/gpt-5.2.
That run failed in the same pattern:
LLM request failed: DNS lookup for the provider endpoint failed
rawErrorPreview is still HTML
one retry also flipped into ⚠️ API rate limit reached
So this fresh test is a clean signal:

/new worked
Telegram transport worked
no lock issue this time
the remaining blocker is still the same Codex HTML/challenge/rate-limit path being mislabeled as DNS
The sse workaround did not clearly fix it.

#

LLM request failed: DNS lookup for the provider endpoint failed

tulip oasisBOT
#

Yep, that’s a clean repro now. I wouldn’t burn more time on host DNS, Telegram, or stale locks.

What you’ve proven:

  • /new worked
  • Telegram transport is fine
  • no lock issue this time
  • the failure stays on the openai-codex/* path
  • raw error is still HTML, and one retry flips to rate-limit

So the practical next move is:

  1. Update the GitHub issue with this fresh post-restart result
    https://github.com/openclaw/openclaw/issues/67712

  2. Use a temporary workaround for real work / cron runs
    My recommendation: move them off openai-codex/* for now.

  3. If you specifically want Codex, try the native Codex harness path
    That is a different runtime path from openai-codex/*:
    • openai-codex/* = Codex OAuth through the normal OpenClaw/PI path
    • codex/* = bundled Codex provider + Codex harness

So, short version: this looks like an openai-codex/* runtime bug, not a DNS problem. If you want, I can help you with the smallest config change to switch just the cron over to a safer path while the bug gets looked at.

#

Yep, that’s the same mislabeled failure, not a new DNS clue.

At this point I’d stop spending cycles retrying openai-codex/* and do one of these:

  1. Recommended: switch the affected cron/session to a non-Codex provider for now
  2. If you want Codex specifically: move to the native codex/* harness path instead

Issue is already filed, so now it’s mostly about getting you unblocked.

If you paste the current cron/agent model config, I can give you the smallest safe change.

tawny stump
#

yes help me

tulip oasisBOT
#

Yep, let’s do the smallest safe change: move only the Evening Sweep cron job off openai-codex/* for now.

Run this on the gateway host:

openclaw cron list

Find the job id for Evening Sweep, then switch just that job to a model that already probed cleanly on your box, for example:

openclaw cron edit <jobId> --model "google/gemini-2.0-flash"
openclaw cron run <jobId>
openclaw cron runs --id <jobId> --limit 10

If that exact model is not allowed in your config, check what is allowed first:

fast heath
#

Hello. I am a Senior Software Engineer. I think I can help you resolve your issue. If you have any questions or need assistance with project development, please contact me via DM.

tulip oasisBOT
#

Appreciate the offer, but please keep help in this thread instead of DMs.

That way:

  • others can sanity-check the advice
  • the fix stays searchable
  • Answer Overflow can index the solution for future people

If you have a concrete next step or diagnosis, drop it here.

#

Yep, that’s the smoking gun.

That is plain HTML coming back from the provider path, not a real DNS failure. So the bug report is on the right track.

I’d do two things now:

  1. Add this exact snippet to the GitHub issue
    It’s much stronger than the earlier summary because it shows the raw HTML page directly.

  2. Switch the cron off openai-codex/* if you need it working today

    openclaw cron list
    openclaw cron edit <jobId> --model "<working-non-codex-model>"
    openclaw cron run <jobId>
    

So at this point the diagnosis is basically:

  • not host DNS
  • not Telegram
  • not stale session locks
  • upstream HTML/error page on the openai-codex/* path, mislabeled as DNS

If you paste openclaw cron list, I’ll help you write the exact cron edit command.

tawny stump
#

LLM request failed: DNS lookup for the provider endpoint failed.

<html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style global>body{font-family:Arial,Helvetica,sans-serif}.container{align-items:center;display:flex;flex-direction:column;gap:2rem;height:100%;justify-content:center;width:100%}@keyframes enlarge-appear{0%{opacity:0;transform:scale(75%) rotate(-90deg)}to{opacity:1;transform:scale(100%) rotate(0deg)}}.logo{color:#8e8ea0}.scale-appear{animation:enlarge-appear .4s ease-out}@media (min-width:768px){.scale-appear{height:48px;width:48px}}.data:empty{display:none}.data{border-radius:5px;color:#8e8ea0;text-…

āš ļø Cron job "Evening Sweep: All Synthetic Containers" failed: <html>
<head>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style global>body{font-family:Arial,Helvetica,sans-serif}.container{align-items:center;display:flex;flex-direction:column;gap:2rem;height:100%;justify-content:center;width:100%}@keyframes enlarge-appear{0%{opacity:0;transform:scale(75%) rotate(-90deg)}to{opacity:1;transform:scale(100%) rotate(0deg)}}.logo{color:#8e8ea0}.scale-appear{animation:enlarge-appear .4s ease-out}@media (min-width:768px){.scale-appear{height:48px;width:48px}}.data:empty{display:none}.data{border-radius:5px;color:#8e8ea0;text-…

tulip oasisBOT
#

Same conclusion, yep, this is HTML from the provider path, not DNS.

Please add that exact snippet to https://github.com/openclaw/openclaw/issues/67712, then run:

openclaw cron list

Paste that output here and I’ll give you the exact openclaw cron edit ... --model ... command for the Evening Sweep job.

tawny stump
#

embedded run agent end: ... provider=openai-codex error=LLM request failed: DNS lookup for the provider endpoint failed. rawError=<html> <head> <meta name="viewport" content="width=device-width, initial-scale=1" /> ...

what else have we not tried

tulip oasisBOT
#

At this point, I think the worthwhile untried paths are:

  1. Recommended, fastest unblock: move the cron/job off openai-codex/*

    • We’ve already ruled out DNS, Telegram, stale locks, and the SSE workaround.
    • If you need the job working today, this is the practical move.
  2. If you want to stay on Codex: try the native codex/* harness

  3. If you have an OpenAI API key: test the same job on openai/*

    • If openai/* works and openai-codex/* fails, that isolates it even harder to the Codex OAuth path.

Low-value thing we haven’t explicitly tried:

  • forcing transport: "websocket" instead of auto/sse
  • I’m not excited about this one, because auto usually hits websocket first anyway

So my recommendation is:

  • for working now: switch the cron off openai-codex/*
  • for diagnosis: try either codex/* harness or openai/*

If you want, I can give you the exact minimal command for either:

  • A) switch just the cron, or
  • B) try native Codex harness next.
magic geode
#

I have the same issue !!

tulip oasisBOT
#

If your logs also show rawError=<html>... or an HTML page after the “DNS lookup failed” message, it’s probably the same openai-codex/* bug, not real DNS.

Quick checks from the gateway host:

openclaw --version
openclaw models status --probe
openclaw logs --follow

What to look for:

  • openai-codex/* failing
  • raw HTML in the log/error preview
  • maybe a nearby API rate limit reached

Tracked here:
https://github.com/openclaw/openclaw/issues/67712

#

Temporary workaround:

  • move the affected cron/session off openai-codex/* for now

If you paste your log snippet, I can help tell whether it’s the same bug or a different DNS problem.

tawny stump
#

I found the root cause of the native Codex failure: the gateway is trying to spawn plain codex, but the real binary is at ... I’m checking the exact config field for the harness command now so I can wire that path in cleanly.

I found the exact config hook for the Codex harness command. I’m opening the nearby schema context now so I can set the absolute Codex binary path instead of the bare codex command that the gateway can’t find.

tulip oasisBOT
#

Yep, the exact field is:

plugins.entries.codex.config.appServer.command

Minimal config shape:

{
  plugins: {
    entries: {
      codex: {
        enabled: true,
        config: {
          appServer: {
            transport: "stdio",
            command: "/Users/haniunleashed/.vscode/extensions/openai.chatgpt-26.409.20454-darwin-arm64/bin/macos-aarch64/codex"
          }
        }
      }
    }
  }
}

The default args are already:

["app-server", "--listen", "stdio://"]

So you usually only need to override command, not args.

Quick test alternative, docs-supported:

OPENCLAW_CODEX_APP_SERVER_BIN="/Users/haniunleashed/.vscode/extensions/openai.chatgpt-26.409.20454-darwin-arm64/bin/macos-aarch64/codex"

One caution: that VS Code extension path is brittle and may change on extension update, so long-term a symlink or putting codex on PATH is cleaner.

tawny stump
#

do it

tulip oasisBOT
#

I can’t write to your machine from this thread, but this is the exact change to run on the gateway host:

openclaw config set plugins.entries.codex.enabled true
openclaw config set plugins.entries.codex.config.appServer.transport "stdio"
openclaw config set plugins.entries.codex.config.appServer.command "/Users/haniunleashed/.vscode/extensions/openai.chatgpt-26.409.20454-darwin-arm64/bin/macos-aarch64/codex"
openclaw gateway restart

Then verify:

openclaw config get plugins.entries.codex.config.appServer --json
openclaw logs --follow

If you want to be explicit about the args too:

openclaw config set plugins.entries.codex.config.appServer.args '["app-server","--listen","stdio://"]' --strict-json

Docs:
https://docs.openclaw.ai/plugins/codex-harness

One heads-up: that VS Code extension path may change on update, so this is a good test fix, but a symlink or PATH-based setup is cleaner long-term.
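
A minimal sketch of the symlink idea, under stated assumptions: the helper name link_codex is made up for this example, and the stable location (something like ~/.local/bin/codex) is only a suggestion. Nothing here is an OpenClaw command; it is plain shell.

```sh
#!/bin/sh
# Hypothetical helper: point a stable path at the current Codex binary.
# Usage: link_codex <path-to-real-codex-binary> <stable-link-path>
# The body runs in a subshell "( ... )" so its variables stay local.
link_codex() (
  target="$1"
  link="$2"
  # Refuse to link to something that is not an executable file.
  if [ ! -x "$target" ]; then
    echo "link_codex: not executable: $target" >&2
    exit 1
  fi
  mkdir -p "$(dirname "$link")"
  # -s symbolic, -f replace an existing link, -n do not follow an old link
  ln -sfn "$target" "$link"
)
```

You would then set plugins.entries.codex.config.appServer.command to the stable link path instead of the versioned one. The link still has to be re-created after an extension update removes the old folder, so this only keeps the OpenClaw config stable; it does not find the new binary for you.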

magic geode
#

@tawny stump Hey!! I think I just got it fixed after trying to figure it for 3 days

tulip oasisBOT
#

Nice, please post the exact fix here in the thread.

That’ll help hani immediately, and it keeps the solution searchable for everyone else too.

magic geode
#

openclaw config set models.providers.openai-codex '{
  "api": "openai-codex-responses",
  "baseUrl": "https://chatgpt.com/backend-api",
  "models": [
    {
      "id": "gpt-5.4",
      "name": "gpt-5.4",
      "api": "openai-codex-responses"
    }
  ]
}' --strict-json

openclaw gateway restart

tawny stump
#

omg i think i also am finally getting somewhere - my bot just responded but im not sure if it's final!!!

#

@magic geode

tulip oasisBOT
#

Thanks for posting the exact command, that’s useful.

One important heads-up though: this is a heavy override of models.providers.openai-codex, and the docs explicitly warn that old transport settings there can shadow the built-in Codex OAuth path.

So I’d treat this as a temporary workaround, not the preferred long-term fix.

A couple of caveats for anyone trying it:

  • it hard-overrides the built-in openai-codex provider behavior
  • it pins the catalog to the model(s) you list there
  • once the upstream bug is fixed, you’ll probably want to remove it again

If this really fixed it on your machine, can you confirm:

  1. did it fix both normal runs and cron runs?
  2. were you already authenticated with Codex OAuth before applying it?

That would help a lot.

#

Ohhh nice, that’s promising 🎉

I’d call it real once these all pass:

  1. one more normal reply
  2. one fresh /new session reply
  3. one manual cron run for Evening Sweep
  4. logs stop showing rawError=<html>...

If those hold, then yeah, you’ve probably got the real workaround and not a fluke.

If you want, paste the exact change that’s currently in place and I’ll sanity-check it.

tawny stump
#

no i used something else

magic geode
#

My bot is replying to me for the first time on my Mac 🤣 ah such a relief

tulip oasisBOT
#

If it keeps working, please paste the exact change here.

That’s the part that’ll help everyone else stuck on the same Codex/DNS-looking bug.

#

Love that 😂

Please drop the exact fix/commands you used once you confirm it’s stable, that’ll save the next person a lot of pain.

tawny stump
#

ugh just timed out @magic geode @tulip oasis

tulip oasisBOT
#

Ugh, that’s frustrating.

A timeout means it’s not cleanly fixed yet, but it also might be a different failure mode from the HTML/DNS-looking one.

Please paste the new log snippet around the timeout, especially:

  • provider/model
  • error=...
  • rawError=... if present

If it’s still openai-codex/*, I honestly wouldn’t trust that path yet. The two practical options are still:

  1. switch the failing job/session off openai-codex/*, or
  2. try the native codex/* harness path instead

Paste the timeout snippet and I’ll help tell which bucket this one belongs to.

#

So, yes, you can try re-auth once, but I’d still want to see the new timeout rawError before trusting that as the root cause.

magic geode
#

And restart gateway

tulip oasisBOT
#

Yep, after re-login, restart the gateway too so it reloads auth/config cleanly:

openclaw gateway restart

I’d do it in this order:

  1. openclaw models auth login --provider openai-codex
  2. openclaw gateway restart
  3. test one fresh message
  4. if it still fails, grab the new rawError snippet
tawny stump
#

What the wrapper does:

looks for installed Codex binaries under your VS Code extension folders
picks the newest matching openai.chatgpt-*-darwin-arm64 install
execs that Codex binary with whatever args OpenClaw passes
So normal VS Code extension version bumps should no longer break OpenClaw just because the versioned folder name changed.

I also restarted the gateway and verified it works:

gateway healthy
Telegram healthy
direct OpenClaw native Codex test succeeded and returned future-proof-ok
What is still not an absolute guarantee:

if OpenAI or the VS Code extension changes the Codex binary layout or naming pattern entirely, the wrapper would need a small update
but ordinary extension updates with a new versioned folder should now be fine
So: not magically permanent against all upstream packaging changes, but yes, it is substantially hardened and should survive routine updates now.
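
For anyone who wants to reproduce the wrapper described above, here is a minimal sketch. It is not the script actually used in this thread (that one was not posted). Assumptions: extensions live under ~/.vscode/extensions, "newest" means the most recently modified matching folder, the binary sits at bin/macos-aarch64/codex inside it (as in the path quoted earlier), and the function names find_codex_bin and run_codex are invented for illustration.

```sh
#!/bin/sh
# Hypothetical wrapper sketch: locate the newest VS Code Codex binary.

# Print the path of the newest executable Codex binary, or nothing if none.
# Optional argument: extensions root (assumed default: ~/.vscode/extensions).
# The body runs in a subshell "( ... )" so its variables stay local.
find_codex_bin() (
  root="${1:-$HOME/.vscode/extensions}"
  # ls -td lists the matching folders newest-first by modification time.
  ls -td "$root"/openai.chatgpt-*-darwin-arm64 2>/dev/null |
  while IFS= read -r dir; do
    candidate="$dir/bin/macos-aarch64/codex"
    if [ -x "$candidate" ]; then
      printf '%s\n' "$candidate"
      break
    fi
  done
)

# Replace the current process with the located binary, passing all
# arguments through unchanged (for example: app-server --listen stdio://).
run_codex() {
  codex_bin="$(find_codex_bin)"
  if [ -z "$codex_bin" ]; then
    echo "codex wrapper: no Codex binary found under VS Code extensions" >&2
    return 127
  fi
  exec "$codex_bin" "$@"
}
```

A wrapper file built from this would end with run_codex "$@", be marked executable, and have its own path set as plugins.entries.codex.config.appServer.command. Picking by modification time rather than by version number is a simplification; if the folder layout or naming changes, the glob and the inner path need updating.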

#

works now

#

thanks guys!

tulip oasisBOT
#

That sounds way more solid than hardcoding one versioned extension path, nice work 🎉

The important part is:

  • native Codex test now works
  • gateway restarted cleanly
  • wrapper should survive normal VS Code extension version bumps

Please post the exact wrapper script + the exact OpenClaw config line you used, so other people can copy it.

And if this fixed it, please right-click the solution message → Apps → ✅ Mark Solution.
That helps Answer Overflow index it for the next person.

#

Hell yes 🎉 glad you got it working.

If you haven’t already, please:

  1. post the exact final fix in-thread
  2. right-click the solution message → Apps → ✅ Mark Solution

That helps Answer Overflow index it for the next person.

#

Glad it’s working now 🙌

If one of the messages above was the fix, please right-click it → Apps → ✅ Mark Solution so it gets indexed for future people.
