#Self hosted firecrawl stopped working after 4.9 update.

loud isle
#

Hey Krill!

First of all, I missed you so much. I recently updated OpenClaw to 2026.4.9, and the web_search and web_fetch tools stopped working because they were configured to use a self-hosted Firecrawl. The agent tells me that the configuration now requires the baseUrl to always use https. I guess it also has to use the official Firecrawl API URL?

In a closed system, plain HTTP should be fine, and generating SSL certificates would be annoying. What should I do?

maiden wraithBOT
#

Let me verify the exact doc language and whether this was intentionally locked down in code.

#

I found the lock in both docs and runtime. Now I’m checking the cleanest self-hosted replacement path so I don’t give you bogus config keys.

#

Yep, I checked the docs and runtime, and this changed intentionally in 2026.4.9.

Short version:

  • Self-hosted Firecrawl over http:// is no longer supported
  • OpenClaw now hard-locks Firecrawl to:
    • https:// only
    • host = api.firecrawl.dev only

So yes, for the built-in Firecrawl integration, it’s effectively official Firecrawl API only now.

What I’d do instead:

  1. For self-hosted search, switch web_search to SearXNG
    • OpenClaw supports self-hosted SearXNG
    • http://localhost:8888 is allowed for trusted private hosts
  2. For web_fetch, just use the normal built-in fetch
    • it works without Firecrawl
    • remove provider: "firecrawl" from tools.web.fetch
    • for JS-heavy pages, use the browser tool instead

Example config shape:

{
  tools: {
    web: {
      search: {
        provider: "searxng",
      },
      fetch: {
        enabled: true,
        readability: true
        // no provider: "firecrawl"
      },
    },
  },
  plugins: {
    entries: {
      searxng: {
        config: {
          webSearch: {
            baseUrl: "http://localhost:8888",
          },
        },
      },
    },
  },
}

So the practical options are:

  • Want fully self-hosted? Use SearXNG + normal web_fetch
  • Want Firecrawl-specific scraping/anti-bot stuff? Use the hosted Firecrawl API
  • Want self-hosted Firecrawl specifically? Right now that means patching OpenClaw or pinning an older version, which I would not recommend long-term

If you want, paste your current web config and I’ll rewrite it to the working SearXNG version.

loud isle
#

That's not good. Locking a feature behind a paid option while self-hosting is available is bad practice. Which GitHub issues or PRs can I track for a fix?

maiden wraithBOT
#

I’m checking the repo history plus GitHub issue/PR search so I can give you real links, not guesses.

#

Yeah, I’m with you, that’s a pretty rough restriction.

I checked, and I didn’t find a dedicated public issue/PR specifically for “bring back self-hosted Firecrawl on private HTTP/HTTPS hosts.” The closest things to track are:

And the current hard lock is in code here:

  • extensions/firecrawl/src/firecrawl-client.ts
    • ALLOWED_FIRECRAWL_HOSTS = new Set(["api.firecrawl.dev"])
    • rejects non-https
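
As a rough sketch of what a lock like that typically looks like (this is a hypothetical illustration, not the actual contents of firecrawl-client.ts; only the ALLOWED_FIRECRAWL_HOSTS name comes from the source, and validateFirecrawlBaseUrl is a name I made up):

```typescript
// Hypothetical sketch of an https-only, single-host allowlist check.
// Not the real OpenClaw code; the function name is an assumption.
const ALLOWED_FIRECRAWL_HOSTS = new Set(["api.firecrawl.dev"]);

function validateFirecrawlBaseUrl(baseUrl: string): URL {
  // new URL() throws a TypeError if baseUrl is not a valid absolute URL.
  const url = new URL(baseUrl);

  // Reject anything that is not https, including http://localhost.
  if (url.protocol !== "https:") {
    throw new Error(`Firecrawl baseUrl must use https: ${baseUrl}`);
  }

  // Reject any host that is not on the allowlist.
  if (!ALLOWED_FIRECRAWL_HOSTS.has(url.hostname)) {
    throw new Error(`Firecrawl host is not allowed: ${url.hostname}`);
  }

  return url;
}
```

With a check shaped like this, a self-hosted URL fails either on the protocol test (plain http) or on the host test (custom https host), which matches the behavior you are seeing.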

So honestly, if you want this fixed, the right move is probably a new GitHub issue. I can’t open it for you, but I’d suggest a title like:

[Feature]: allow self-hosted Firecrawl for trusted private hosts (HTTP loopback/private-network or custom HTTPS host)

Include:

  • openclaw --version
  • your old working self-hosted Firecrawl setup
  • that it broke on 2026.4.9
  • that SearXNG already allows private http:// hosts, so Firecrawl could use a similar trust model
  • pointer to extensions/firecrawl/src/firecrawl-client.ts
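
If it helps the issue, here is a minimal sketch of the kind of trust model you could propose: any https host is accepted, and plain http is accepted only for loopback or private-network IPv4 addresses. This is my own assumption about a reasonable policy, not how OpenClaw or its SearXNG integration actually decides; both function names are hypothetical.

```typescript
// Hypothetical policy sketch for the feature request; not OpenClaw code.
// Handles "localhost" and dotted IPv4 literals only (no IPv6, no DNS names
// such as firecrawl.local, which would need a separate opt-in).
function isPrivateOrLoopbackHost(hostname: string): boolean {
  if (hostname === "localhost") return true;
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(hostname);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 127 ||                        // 127.0.0.0/8 loopback
    a === 10 ||                         // 10.0.0.0/8
    (a === 192 && b === 168) ||         // 192.168.0.0/16
    (a === 172 && b >= 16 && b <= 31)   // 172.16.0.0/12
  );
}

function isAllowedSelfHostedUrl(baseUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(baseUrl);
  } catch {
    return false; // not a valid absolute URL
  }
  if (url.protocol === "https:") return true;
  return url.protocol === "http:" && isPrivateOrLoopbackHost(url.hostname);
}
```

Under this sketch, http://localhost:8888 and http://192.168.1.10:3002 would pass, while http://example.com would not.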

If you want, I can help you draft the exact GitHub issue text so you can just paste it.

loud isle
#

what does this mean

Firecrawl is also available as an onboard/configure search provider with explicit firecrawl_search and firecrawl_scrape tools

It was written by one of the maintainers on one of the issues.

maiden wraithBOT
#

It basically means:

  • Firecrawl is built into OpenClaw
  • you can use it in two different ways

1) As the normal web search provider

During onboarding or:

openclaw configure --section web

you can pick Firecrawl as the provider behind the generic web_search tool.

So the agent calls:

web_search(...)

and OpenClaw routes that to Firecrawl under the hood.

2) As explicit Firecrawl-only tools

OpenClaw also exposes provider-specific tools:

  • firecrawl_search
  • firecrawl_scrape

Those are for when you want Firecrawl-specific behavior directly, instead of the generic web_search / web_fetch interface.
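
To make the difference concrete, here is a toy sketch of the two integration styles. Everything in it is hypothetical (the types, makeTools, and the stub backends are mine, not OpenClaw internals); it only illustrates "generic tool routed by config" versus "explicit provider tool".

```typescript
// Toy model only; not OpenClaw's real tool registry.
type SearchFn = (query: string) => string;

interface ToolSet {
  web_search: SearchFn;       // generic tool, backend chosen by config
  firecrawl_search: SearchFn; // explicit tool, always Firecrawl
}

// Stub backends that just tag the query with the provider name.
const firecrawlSearch: SearchFn = (q) => `firecrawl:${q}`;
const searxngSearch: SearchFn = (q) => `searxng:${q}`;

const providers: Record<string, SearchFn> = {
  firecrawl: firecrawlSearch,
  searxng: searxngSearch,
};

function makeTools(configuredProvider: string): ToolSet {
  const backend = providers[configuredProvider];
  if (!backend) {
    throw new Error(`unknown search provider: ${configuredProvider}`);
  }
  return {
    web_search: backend,
    firecrawl_search: firecrawlSearch,
  };
}
```

In this toy model, switching the configured provider changes what web_search does, while firecrawl_search keeps pointing at Firecrawl either way.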

What it does not mean

It does not mean:

  • self-hosted Firecrawl is supported
  • arbitrary baseUrl values are supported
  • http://firecrawl.local is supported in current versions

So that maintainer comment is about tool availability / integration style, not about self-hosting policy.

In plain English:
“Firecrawl is a first-class built-in integration”
not
“any Firecrawl-compatible server is allowed.”

If you want, I can also help translate that maintainer comment into “what this means for my config” in one concrete example.

bold jungle
#

I have the same issue 🙁

#

@loud isle ^

loud isle
#

@maiden wraith What's the current state of this issue? There have been a bunch of releases since our last message, but the PR I am tracking hasn't been merged yet. Has this been resolved independently?

#

If it's not resolved, which OpenClaw version is not affected by this issue? What changes would I need to roll back to downgrade from 2026.3.24 to it?

#

Actually, never mind my last request. v2026.3.24 is already in a working state. Instead, I am seeing a different issue with that version: most cron jobs and subagents complete but fail on message delivery with "Outbound is not configured for telegram".

If you have a clue why this happens and which versions are affected, let me know.