#memory_search provider and fallback(s)


faint prairie
#

Is there a configuration to allow multiple fallbacks and in a specific order? Right now it prioritizes Gemini which doesn't really have a free tier anymore (or I can't find it), and I don't want it using my paid key that I have set for a fallback agent model provider.

faint prairie
#

Version 2026.4.15

amber anvilBOT
#

First, what does openclaw --version show?

On current builds, no, memory search only supports a single memorySearch.fallback string, not multiple ordered fallbacks.

Also, provider: "auto" uses a fixed internal order, not a configurable one:

local → github-copilot → openai → gemini → voyage → mistral → bedrock

(ollama is supported, but not auto-detected.)

So if you do not want Gemini or any other paid key to get picked up, the safest fix is to stop using auto and pin the provider explicitly, for example:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local",   // or "ollama", "openai", etc.
        fallback: "none",
      },
    },
  },
}

If you want exactly one fallback, that works too, for example provider: "local" plus fallback: "openai", but there is no multi-fallback chain.

Why Gemini is showing up: memory search reuses provider auth resolution, so if GEMINI_API_KEY or models.providers.google.apiKey exists, Gemini counts as available to auto-selection.
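
As a rough mental model of that selection (a sketch only, not OpenClaw's actual code; `pick_auto_provider` and the `available` set are hypothetical names), auto walks the fixed order and takes the first provider that looks configured:

```python
# Hypothetical sketch of fixed-order auto-selection; not OpenClaw's real code.
AUTO_ORDER = ["local", "github-copilot", "openai", "gemini", "voyage", "mistral", "bedrock"]

def pick_auto_provider(available):
    """Return the first provider in AUTO_ORDER that appears configured, else None."""
    for name in AUTO_ORDER:
        if name in available:
            return name
    return None

# A configured Gemini key makes "gemini" available, so it is picked ahead of "voyage".
print(pick_auto_provider({"gemini", "voyage"}))
```

The point is that the order is fixed, so the only way to skip Gemini under this model is to remove its key or stop using auto.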

Docs: https://docs.openclaw.ai/reference/memory-config

If you paste your current memorySearch block plus openclaw --version, I can suggest the exact safest config.

#

Nice, thanks, 2026.4.15 matches what I checked, so the same answer applies on your version.

There is no multi-fallback list for memory search, and the auto order is fixed. If you want to prevent Gemini from being used, set an explicit provider and disable fallback, for example:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama", // or "local", "openai", etc.
        fallback: "none",
      },
    },
  },
}
#

If you want one fallback only, you can do:

memorySearch: {
  provider: "local",
  fallback: "openai",
}

But there is no ["local", "openai", "voyage"] style ordered chain.

Also, auto will treat a configured Gemini key as available, so if models.providers.google.apiKey or GEMINI_API_KEY is present, it can pick Gemini.

Useful reference: https://docs.openclaw.ai/reference/memory-config

If you want, paste your current memorySearch block and I’ll rewrite it into the exact safe config for your setup.

faint prairie
#

which of the non-local providers have free tiers, and which ones do people typically use?

amber anvilBOT
#

Quick snapshot from the current docs I checked, this stuff changes fast:

  • Gemini: yes, Google still shows a free tier for embeddings (gemini-embedding-001), plus paid tiers.
    https://ai.google.dev/gemini-api/docs/pricing
  • Voyage: yes, and honestly this looks like the nicest non-local free tier right now. Their pricing page says the first 200M tokens are free on the main voyage-4* models, and 50M on some others.
    https://docs.voyageai.com/docs/pricing
  • Mistral: yes, they do have a free API tier, but their docs frame it as evaluation/prototyping.
    https://docs.mistral.ai/deployment/ai-studio/tier
  • OpenAI: I would treat it as paid, not something to rely on for a stable free-tier memory setup.
  • Bedrock: effectively paid/AWS, not a free-tier play.
  • GitHub Copilot: Copilot Free exists, but I would not call it the simplest/most predictable choice for OpenClaw memory search unless you already use Copilot heavily.

For “what do people typically use?”, my rough impression is:

  1. OpenAI, if they just want the boring reliable default
  2. Voyage, if they care specifically about embedding quality/cost
  3. Gemini, if they already have Google set up
  4. Bedrock, mostly AWS-heavy setups
  5. Mistral/Copilot, more niche for this use case

So if your goal is cheap non-local memory search, I’d personally look at Voyage first, or Gemini pinned explicitly if you’re okay with Google. I would not leave it on auto.

Also, since you asked about what people actually use in practice, it’s worth asking in #users-helping-users too, that’ll get you real hands-on reports from other users.

If you want, I can give you 3 ready-to-paste configs:

  • cheapest decent
  • best free-ish option
  • most reliable paid setup
faint prairie
#

I also noticed the default fallback chain doesn't work when the failure reason is rate limit. i.e. my gemini has no funds on it

amber anvilBOT
#

Yep, that lines up with how it’s implemented right now.

provider: "auto" is not a general failover chain. It’s more like provider discovery/selection. For Gemini specifically, auto-selection only keeps going on setup-style errors like missing auth, not on quota/rate-limit style failures.

So in practice:

  • missing key / unavailable setup → auto may continue to the next provider
  • 429 / quota exhausted / no funds → it usually does not continue down the auto order

That’s why “Gemini has no funds” doesn’t behave like “Gemini is unavailable”.

Also, the explicit memorySearch.fallback is only one fallback, not a multi-hop chain.
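
To make the difference concrete, here is a minimal sketch of that behavior as I understand it. The exception classes and `select_with_auto` are hypothetical names for illustration, not OpenClaw internals:

```python
# Hypothetical model of the behavior described above; names are illustrative only.
class SetupError(Exception):
    """Missing key or provider not configured."""

class QuotaError(Exception):
    """429, quota exhausted, or no funds."""

def select_with_auto(order, try_provider):
    """Walk the auto order, skipping only providers that fail with SetupError."""
    for name in order:
        try:
            try_provider(name)
            return name
        except SetupError:
            continue  # setup-style failure: move on to the next provider
        # QuotaError is deliberately not caught, so it propagates and selection stops
    return None
```

Under this model a Gemini key with no funds raises the quota-style error, so selection never reaches the next provider.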

So the safe advice today is:

  1. Don’t rely on auto for quota/budget failover
  2. Pin the primary provider explicitly
  3. Optionally set one explicit fallback

Example:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "voyage",
        fallback: "openai", // or "none"
      },
    },
  },
}
#

If you want OpenClaw to treat quota/rate-limit errors as “try the next provider”, that sounds like a legit feature request / bug report for GitHub: https://github.com/openclaw/openclaw/issues

If you want, I can suggest a couple of concrete provider pairings depending on whether you want cheapest, best quality, or most predictable behavior.

faint prairie
#

Most straight forward way to install ollama as the fallback?

amber anvilBOT
#

Easiest path is local-only Ollama as the explicit fallback, not auto.

I’d do this:

  1. Install Ollama
    https://ollama.com/download

  2. Make sure it’s running
    Usually the app does this, otherwise:

    ollama serve
    
  3. Set the placeholder key OpenClaw expects for local Ollama

    export OLLAMA_API_KEY="ollama-local"
    
  4. Set memory search to use one real primary + Ollama fallback
    {
      agents: {
        defaults: {
          memorySearch: {
            provider: "voyage",   // or openai/gemini/mistral
            fallback: "ollama",
          },
        },
      },
    }
    

That’s enough for the common localhost case. OpenClaw’s Ollama memory provider uses nomic-embed-text by default, and the docs say it will auto-pull the embedding model if it isn’t present yet.

Only if Ollama is on another host

#

Add an explicit provider entry, using the native Ollama URL, not /v1:

{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434", // replace 127.0.0.1 with the remote Ollama host
      },
    },
  },
}

Important: use http://host:11434, no /v1.

Verify

faint prairie
#

what happens if my main provider and fallback both fail

amber anvilBOT
#

Then it stops there, there isn’t a third hop.

For an explicit setup like:

memorySearch: {
  provider: "voyage",
  fallback: "ollama",
}

OpenClaw will:

  1. try the main provider
  2. try the single fallback
  3. if the fallback also fails, throw a combined error like:
<primary failure>

Fallback to ollama failed: <fallback failure>

So, practically:

  • no extra fallback chain
  • the memory/search operation for that attempt fails
  • you get both failure reasons in the error

That behavior is coming from the memory embedding fallback code path directly.
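
If it helps to see the shape of it, here is a minimal sketch of “one primary, one fallback, then a combined error”. `embed_with_fallback` and its parameters are hypothetical, not the real code path; only the error format follows the message shown above:

```python
# Hypothetical sketch of "one primary, one fallback"; not OpenClaw's real code.
def embed_with_fallback(text, primary, fallback=None, fallback_name="ollama"):
    """Try primary, then the single fallback; raise a combined error if both fail."""
    try:
        return primary(text)
    except Exception as primary_err:
        if fallback is None:
            raise  # fallback: "none" means the primary failure is final
        try:
            return fallback(text)
        except Exception as fallback_err:
            # No third hop: surface both failure reasons in one error.
            raise RuntimeError(
                f"{primary_err}\n\nFallback to {fallback_name} failed: {fallback_err}"
            ) from fallback_err
```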

If this helped, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for future users.

faint prairie
#

so if both fail, memory_search is not usable

#

is VOYAGE_API_KEY the right key name

amber anvilBOT
#

For the semantic/vector part, yes, basically.

If your configured primary and single fallback both fail, OpenClaw does not have another provider to continue with, so the embedding-based memory path is effectively down until one of them works again.

Small nuance, there is an FTS-only degraded mode in the memory code, so plain keyword/text lookup can still exist in some cases, but that is not the same thing as working semantic memory search.

So the practical answer is:

  • semantic memory search: no, not usable
  • possible degraded keyword-only behavior: maybe, depending on state/indexing
  • automatic third fallback: no

If you want robust behavior today, I’d treat it as:
“one primary, one fallback, after that memory search is degraded/unavailable.”

#

Yep, for Voyage memory embeddings the env var is VOYAGE_API_KEY.

The config-file equivalent is:

models: {
  providers: {
    voyage: {
      apiKey: "VOYAGE_API_KEY",
    },
  },
}

And if you want Voyage as the memory provider:

faint prairie
#

do i need a placeholder for ollama key even though there isn't one?

amber anvilBOT
#

Not a real key, no, local Ollama itself doesn’t need one.

But for OpenClaw, I’d still set the placeholder:

export OLLAMA_API_KEY="ollama-local"

That’s the recommended low-friction setup in the Ollama docs for local-only / host-backed use. It helps OpenClaw’s provider setup/discovery behave predictably, even though Ollama isn’t actually checking the token.

So the short version is:

  • real secret required? no
  • placeholder recommended for OpenClaw? yes, ollama-local

If you fully configure Ollama explicitly, you might get away without it, but I’d still use the placeholder to avoid weird edge cases.

faint prairie
#

once i have those in my .env i don't need any other openclaw.json changes other than memorysearch.provider and .fallback?

amber anvilBOT
#

Yep, assuming:

  • VOYAGE_API_KEY is in your env
  • OLLAMA_API_KEY=ollama-local is in your env
  • Ollama is local on the default URL (http://127.0.0.1:11434)

then usually the only config you need is under agents.defaults.memorySearch, for example:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "voyage",
        fallback: "ollama",
      },
    },
  },
}

You only need extra config if one of these is true:

  • Ollama is on another host/port
    then set models.providers.ollama.baseUrl
  • you want a non-default Ollama embedding model
    then set memorySearch.model
  • you want custom base URLs / headers / explicit keys
    then use the provider config or memorySearch.remote.*

After changing .env, I’d restart the gateway and verify with:

openclaw memory status --deep

faint prairie
#

can the agent test memory_search with a specific provider so we can make sure ollama is working as expected?

amber anvilBOT
#

Yes, but there are really 2 different tests:

1. Raw Ollama embeddings smoke test

This checks that Ollama can actually create embeddings:

openclaw infer embedding create \
  --model ollama/nomic-embed-text \
  --text "memory search smoke test" \
  --json

If that works, Ollama embeddings themselves are fine.

2. Real memorySearch test with Ollama

I don’t see a one-off memory status --provider ollama override, so the clean way is to temporarily pin memorySearch to Ollama:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama",
        fallback: "none",
      },
    },
  },
}

Then restart the gateway and run:

openclaw memory status --deep

That should show Ollama as the requested/active provider, and whether embeddings are ready.

So my recommendation is:

  • use infer embedding create for a quick provider smoke test
  • use temporary memorySearch.provider = "ollama" + openclaw memory status --deep for the real end-to-end memory test

If you want, I can give you a tiny “test Ollama, then switch back to Voyage→Ollama fallback” workflow.

faint prairie
#

running into issues with voyage, what are some good test commands i can run

amber anvilBOT
#

Yep, I’d test Voyage in this order:

1. Basic health

openclaw status
openclaw models status

2. Check embedding providers directly

openclaw infer embedding providers --json

3. Raw Voyage embedding smoke test

openclaw infer embedding create \
  --provider voyage \
  --text "memory search smoke test" \
  --json

That’s the cleanest “can Voyage make embeddings at all?” test.

4. Memory-search specific readiness

If you want to test memory search itself, temporarily isolate Voyage so it can’t silently fall through to Ollama:

memorySearch: {
  provider: "voyage",
  fallback: "none",
}
#

Then run:

openclaw memory status --deep

5. If it still fails, tail logs

openclaw logs --follow

My recommendation

If you’re debugging Voyage specifically, do both:

  • infer embedding create --provider voyage ...
  • memory status --deep

That tells you whether the problem is:

  • Voyage auth/provider setup
  • or the memorySearch wiring on top of it

If you paste the output from these 3, I can probably pinpoint it fast:

openclaw models status
openclaw infer embedding create --provider voyage --text "memory search smoke test" --json
openclaw memory status --deep
faint prairie
#

inferring voyage gave me a response in about 30 seconds and completed in about 2 minutes, with an "ok: true"
but my claw was saying that memory_search was timing out

amber anvilBOT
#

Yeah, that actually makes sense.

openclaw infer embedding create --provider voyage ... only proves Voyage can return an embedding eventually.

memory_search is tougher because it may also need to:

  • sync/reindex files first
  • chunk content into many embedding calls
  • do query embeddings under a stricter timeout

In the current memory code, the built-in remote timeouts are roughly:

  • query embedding: 60s
  • batch embedding: 120s

So if Voyage is taking ~30s to first response and ~2 minutes end-to-end, it’s very plausible that the live memory path hits those timeouts.

#

Good next tests

1. See whether it is query timeout or indexing timeout

openclaw memory status --deep --index --verbose

2. Force indexing outside the live chat path

openclaw memory index --force

Then try memory search again after the index is clean.

3. If indexing is the slow part, enable Voyage batch mode

Voyage supports memory batch embedding:

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "voyage",
        remote: {
          batch: {
            enabled: true,
            timeoutMinutes: 10,
          },
        },
      },
    },
  },
}
#

4. If search is getting stuck trying to sync on every search

Temporarily disable that:

memorySearch: {
  sync: {
    onSearch: false,
  },
}

My guess

The likely issue is memory indexing/sync latency, not “Voyage is broken”.

If you paste the output of:

openclaw memory status --deep --index --verbose

I can help tell whether it’s:

  • query timeout
  • batch timeout
  • sync/reindex doing too much work on search
faint prairie
#
  1. shows several "embeddings: batch start" and several "embeddings rate limited; retrying in x", and finalized with "memory index failed: fetch failed | other side closed"
amber anvilBOT
#

Yep, that’s a really useful signal.

What it says to me is:

  • Voyage auth is probably fine
  • the failure is happening during bulk indexing
  • you’re hitting rate limits, then the upstream connection is getting dropped (fetch failed | other side closed)

So this looks much more like throughput/backfill pressure than “Voyage is broken”.

What I’d try next

1. Stop doing sync work during live searches

memorySearch: {
  sync: {
    onSearch: false,
  },
}

2. Turn on Voyage batch mode, and make it gentler

memorySearch: {
  provider: "voyage",
  fallback: "ollama",
  remote: {
    batch: {
      enabled: true,
      concurrency: 1,
      timeoutMinutes: 10,
    },
  },
  sync: {
    onSearch: false,
  },
}

Why:

  • enabled: true uses the batch path Voyage supports
  • concurrency: 1 reduces burst/rate-limit pressure
  • timeoutMinutes: 10 gives the batch job room to finish

3. Rebuild outside chat

openclaw gateway restart
openclaw memory index --force --verbose
openclaw memory status --deep

My read

Your issue is probably:

Voyage works for one-off embeddings, but memory indexing is generating enough requests that Voyage rate-limits it, then the connection gets closed.

If that config still ends with other side closed, I’d seriously consider either:

  • using Ollama as the actual memory provider
  • or filing a GitHub issue, because at that point it starts looking like a provider/runtime robustness bug rather than just config

If you want, I can turn that into one clean ready-to-paste memorySearch block for Voyage primary + Ollama fallback + safer indexing.

faint prairie
#

index --force is at 1%... eta is like a week hahahaha

#

oh wait no 300 minutes not hours, silly me

amber anvilBOT
#

Hah, yeah, that’s the moment where I’d stop and not try to brute-force it 😅

A week ETA usually means Voyage is the wrong provider for your current memory corpus / rate limits, at least for the initial backfill.

I’d recommend one of these:

1. Best practical option: use Ollama for memory embeddings

If you’re okay with local embeddings, this is probably the sane path.

memorySearch: {
  provider: "ollama",
  fallback: "none",
  sync: {
    onSearch: false,
  },
}

2. Keep Voyage, but shrink the scope a lot

Only index the minimum:

memorySearch: {
  provider: "voyage",
  fallback: "ollama",
  sources: ["memory"],   // no sessions
  sync: {
    onSearch: false,
  },
  // keep extraPaths very small or remove them for now
}

3. Don’t backfill everything

If you’ve got a huge workspace / lots of session history, a remote embedding provider with rate limits is just going to be miserable here.

Important caveat

If you build the index with Ollama and later switch to Voyage, OpenClaw will need to reindex again anyway, because provider/model changes change the embedding space.

So I’d pick the provider you actually want to keep.

My honest recommendation: for memory search, Ollama is probably the better fit unless you have a pretty small corpus and decent Voyage limits.

If you want, I can help you trim the memory scope so it’s usable instead of “see you next week” 😄

#

Hah, okay, 300 minutes is way less cursed 😄

Still pretty slow, but that’s at least “overnight-ish” instead of “next geological era”.

A couple of useful takeaways:

  • first full index is the worst case
  • once it’s built, incremental updates should be much smaller
  • if you’re still seeing rate limits / other side closed, I’d still be a bit wary of Voyage for large reindexes

So my updated read is:

  • if this is a one-time initial backfill and you don’t mind waiting, it might be acceptable
  • if it keeps failing partway through, I’d still lean toward Ollama or a much smaller indexed scope

If you let it finish, I’d test again after that before changing too much. That’ll tell you whether the real problem was just the initial bulk index.

faint prairie
#

it "timed out after 120s" but it also still got rate limited and switched to ollama half way through anyways

#

ohhh wait i'm reading it more, batch failed so it tried to not batch

amber anvilBOT
#

Ah, that’s actually very telling, and honestly kind of useful.

What happened is basically:

  1. Voyage batch hit the built-in 120s timeout
  2. it was also getting rate limited
  3. OpenClaw treated that as an embedding/batch failure
  4. then it activated your explicit fallback and switched to Ollama

So yes, that lines up with the current fallback behavior.

The good news is, it shouldn’t leave you with a mixed embedding space. After fallback activation, memory does a forced safe reindex with the fallback provider.

Practical takeaway

If Voyage is consistently doing:

  • rate limit
  • 120s timeout
  • then fallback to Ollama anyway

…then I probably wouldn’t keep Voyage as the primary for memory search, because you’re basically paying a 120-second failure tax before ending up on the provider that works.

At that point I’d either:

Option A, simplest

Just use Ollama directly:

memorySearch: {
  provider: "ollama",
  fallback: "none",
}
#

Option B, if you still want to keep testing Voyage

Try the explicit batch config first:

memorySearch: {
  provider: "voyage",
  fallback: "ollama",
  remote: {
    batch: {
      enabled: true,
      concurrency: 1,
      timeoutMinutes: 10,
    },
  },
}
#

But if it still ends up on Ollama, I’d just make Ollama primary and call it a day.

#

Yep, good catch, you’re reading that right.

There are really 2 different fallback layers happening:

  1. batch embeddings fail
    → OpenClaw falls back to non-batch embeddings for the same provider

  2. if embeddings still keep failing after that
    → then it can fall back to your configured provider fallback (like Ollama)

So the flow is more like:

Voyage batch → Voyage non-batch → Ollama (if needed)
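
Sketched as code (purely illustrative; `index_chunks` and the three callables are hypothetical stand-ins, not OpenClaw functions), the two layers look roughly like this:

```python
# Hypothetical sketch of the two fallback layers; function names are illustrative.
def index_chunks(chunks, voyage_batch, voyage_single, ollama_single):
    """Voyage batch -> Voyage non-batch -> Ollama; returns (path_taken, vectors)."""
    try:
        return "voyage-batch", voyage_batch(chunks)
    except Exception:
        pass  # layer 1: stay on the same provider, drop to non-batch
    try:
        return "voyage", [voyage_single(c) for c in chunks]
    except Exception:
        pass  # layer 2: switch to the configured provider fallback
    # The fallback re-embeds every chunk, so vectors from two providers are not mixed.
    return "ollama", [ollama_single(c) for c in chunks]
```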

Also, batch gets effectively “demoted” after repeated failures, it isn’t just blindly retried forever.

That means the real question is now:

  • does Voyage non-batch work well enough for your corpus?
  • or does it still rate limit / timeout badly enough that you end up on Ollama anyway?

If Voyage non-batch is stable, this may be less bad than it first looked. If it still collapses into Ollama regularly, I’d still just make Ollama primary for memory search.

Nice catch, that changes the interpretation a lot.

faint prairie
#

so from top to bottom, it tries to batch, fails, tries non-batch, rate limits, tries ollama, times out at 120s (and would take hours regardless).
can we figure out why voyage batch is failing and go from there?

amber anvilBOT
#

Yeah, and I think there’s a naming trap here.

The log line you mentioned earlier, memory embeddings: batch start, is probably not Voyage’s dedicated Batch API yet. That is the normal “send multiple texts in one embeddings request” path.

The actual Voyage Batch API path should log things more like:

  • memory embeddings: voyage batch submit
  • memory embeddings: voyage batch created
  • batch status / waiting messages with a batch id

So first I’d verify which batch path you’re actually on.

Clean way to test real Voyage Batch API

Temporarily isolate it:

memorySearch: {
  provider: "voyage",
  fallback: "none",
  sync: {
    onSearch: false,
  },
  remote: {
    batch: {
      enabled: true,
      concurrency: 1,
      wait: true,
      pollIntervalMs: 2000,
      timeoutMinutes: 10,
    },
  },
}

Then restart and run:

openclaw memory status --deep --index --verbose
openclaw logs --follow

What to look for

If you see:

  • voyage batch created
  • a batch id
  • status polling
#

Then you’re on the real Voyage Batch API path.

If you only see:

  • memory embeddings: batch start
  • rate limited
  • timed out after 120s

Then you’re still on the regular embedBatch path, not the dedicated Voyage Batch API.

Why this matters

Those are different failure modes:

  • regular embedBatch path failing means the normal /embeddings route is too slow / rate-limited
  • Voyage Batch API failing would point to file upload, batch creation, status polling, or output retrieval

So yes, we can absolutely figure it out, but step 1 is making sure we’re testing the actual Voyage Batch API and not just the generic batched embeddings path.

#

If you want, once you run that, paste the first few voyage batch ... lines (or confirm they never appear), and I can tell you exactly which path is breaking.

faint prairie
#

ah... gotta add a payment method to the account to unlock the true "free tier"

#

recommended usage limit so i don't get charged and maintain the free tier?

amber anvilBOT
#

Yep, that’s the classic “free tier” gotcha 😅

That would explain a lot of the weirdness, especially if Voyage is keeping you on very constrained limits until billing is set up.

So at this point the choice is basically:

  • don’t add a payment method → use Ollama for memory embeddings
  • add a payment method → retest Voyage with real limits, ideally with the actual batch API enabled

Honestly, if your goal is just “make memory search work reliably,” Ollama is still the lower-drama option.

#

I don’t think there’s really a “set this limit and keep the free tier” knob.

From Voyage’s docs, the model is more like:

  • add a payment method to unlock the higher tier / normal limits
  • your free tokens still apply first
  • after that, usage is billed

So the safest answer is:

  • if you want guaranteed $0 spend, stay under the free token allotment
  • I do not currently see a documented hard spend cap / billing limit in their docs
  • there’s even a docs forum request asking for that feature, which is… not very comforting 😅

For the models we were talking about, their pricing docs currently say:

  • 200M free tokens on the main voyage-4* family
  • 50M free tokens on some other models

So if you want the conservative answer: do not plan to cross the free-token threshold at all.

If you want the practical answer: I’d personally either:

  1. use Voyage only for small tests and keep a close eye on usage, or
  2. use Ollama if the goal is “absolutely no surprise bill”

Docs:

faint prairie
#

okay it's working now, i'm watching it batch, and carefully watching my free tier usage so i don't get lambasted

#

and after the index is batched it should be light usage from there?

#

is seeing voyage batch batch-xxxxx in_progress waiting 2000ms over and over correct?

amber anvilBOT
#

Yep, that part is normal.

That log line is just the poll loop for Voyage’s Batch API:

  • batch created
  • status = in_progress
  • wait 2000ms
  • check again
  • repeat until it becomes completed (or hits the timeout)

So:

  • seeing it a few/many times: normal
  • seeing it forever until timeout: not normal, that means the batch is just taking too long / stuck / queued on Voyage’s side

The key question is whether it eventually changes from in_progress to completed.
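
For intuition, a poll loop like that can be sketched as below. This is a generic illustration, not the actual implementation; `wait_for_batch` and `get_status` are hypothetical, and the 2-second interval mirrors the pollIntervalMs: 2000 from the earlier config:

```python
import time

# Hypothetical poll loop; get_status stands in for a batch status lookup.
def wait_for_batch(get_status, poll_interval_s=2.0, timeout_s=600.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the batch reports 'completed', or raise TimeoutError."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status == "completed":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"batch still {status} after {timeout_s}s")
        sleep(poll_interval_s)  # each pass here is one "waiting 2000ms" log line
```

So a long run of identical in_progress lines is just the loop doing its job; only the timeout branch is the bad outcome.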

#

If it never does and then dies on timeout, I’d increase timeoutMinutes and keep concurrency: 1, or just accept that Voyage batch is too slow for this corpus.

faint prairie
#

i see "finalizing" and "validating" now

amber anvilBOT
#

Nice, that’s actually a good sign.

validating and finalizing mean the batch moved past the plain in_progress stage and Voyage is wrapping it up.

So the rough progression is:

  • in_progress
  • validating
  • finalizing
  • completed

If it reaches completed, then the Batch API path itself is working, and the issue shifts from “batch is broken” to “is it fast enough / practical enough for your index size”.

That’s progress 🙂

faint prairie
#

it only needs to do this once though, right? unless at some point i need to re-index

amber anvilBOT
#

Yeah, mostly once for the big initial backfill.

After that, normal file/session changes should usually be incremental, not “do the whole universe again”.

You’ll typically need a full reindex again if you change things like:

  • provider (voyage → ollama, etc.)
  • model
  • outputDimensionality
  • indexed scope (sources, extraPaths, multimodal settings)
  • chunking settings
  • some index/storage settings like the FTS tokenizer
  • or if you manually run openclaw memory index --force

So the short version is:

  • initial setup: big expensive one-time job
  • normal use: incremental updates
  • config/provider/model changes: full reindex again

So if this Voyage batch finishes successfully, you should not have to sit through that exact full backfill again unless you change something important.
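
One way to picture why those settings force a rebuild is a fingerprint over the index-relevant config. This is only a sketch of the general idea; how OpenClaw actually detects the change is not something I’m asserting, and `REINDEX_KEYS`, `index_fingerprint`, and `needs_full_reindex` are hypothetical:

```python
import hashlib
import json

# Hypothetical: the exact set of keys is an assumption based on the list above.
REINDEX_KEYS = ("provider", "model", "outputDimensionality", "sources", "extraPaths")

def index_fingerprint(config):
    """Hash only the settings that change the embedding space or indexed scope."""
    relevant = {key: config.get(key) for key in REINDEX_KEYS}
    blob = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def needs_full_reindex(old_config, new_config):
    """True when any index-relevant setting differs between the two configs."""
    return index_fingerprint(old_config) != index_fingerprint(new_config)

print(needs_full_reindex({"provider": "voyage"}, {"provider": "ollama"}))
```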

faint prairie
#

and if it is less than 200 million tokens (unlikely) for the re-index then it should be good for usage, unless they change their free policy

amber anvilBOT
#

Yep, in principle that’s the idea.

If you’re on one of the Voyage models with the 200M free-token allotment, and your total embedding usage stays under that, then based on the current pricing docs you should be okay.

Main caveats:

  • the free allotment is model-specific
  • it’s total usage, not just this one reindex
  • ongoing updates / future reindexes also count
  • and yeah, Voyage could always change the policy later

So I’d phrase it as:

“If my memory indexing + ongoing embedding usage stays below the current free-token allotment for this model, I should avoid charges.”

That’s the cautious version 🙂

faint prairie
#

ohhh right because the 200 million doesn't reset per month or year, it's forever

amber anvilBOT
#

Yep, that’s how I read their wording too.

“The first 200 million tokens are free for every account” reads like a one-time cumulative allotment, not a monthly/yearly reset.

So unless their docs say otherwise somewhere else, I would assume:

  • 200M total free once
  • after that, paid

I definitely would not plan around it resetting.

faint prairie
#

batching is back to "in_progress" again, is that normal?

amber anvilBOT
#

If it’s a new batch id, yep, that’s normal.

OpenClaw can split the work into multiple Voyage batch groups, so you might see:

  • batch A → in_progress → validating → finalizing → done
  • then batch B → back to in_progress

That part is fine.

If it’s the same exact batch id going from finalizing back to in_progress, that would be weirder, and I’d want to look closer at the logs.

So the key question is: same batch id, or a new one?

faint prairie
#

new one i think

amber anvilBOT
#

Then yeah, that’s normal 👍

That just means the index job got split into another Voyage batch group, so you’re watching the next chunk of work start.

So in that case:

  • old batch finished / moved on
  • new batch started
  • status goes back to in_progress

Annoying to watch, but not a bad sign by itself 😄

faint prairie
#

sooo how long should i expect this to go for about a month of chatting and debugging openclaw stuff?

amber anvilBOT
#

For just about a month of chat/debugging, my gut check is:

  • tens of minutes to a few hours is believable
  • days would feel wrong
  • half a day+ usually means you’re indexing more than just chats, or Voyage limits are the bottleneck

What really changes it:

  • whether you’re indexing only session transcripts or also files / extraPaths
  • how large the chats are
  • how much code/log paste is in them
  • Voyage’s current queue / rate limits

So if this is truly just “a month of OpenClaw support/debug chats”, I’d expect hours, not days.

The good news is: once this initial backfill is done, future updates should be way smaller.

faint prairie
#

my only extra path is openclaw docs

amber anvilBOT
#

Ah, okay, that explains it a lot more.

The OpenClaw docs tree is not tiny, so:

  • one month of chats/debug logs
  • plus the docs directory
  • chunked into lots of embedding inputs

…can absolutely turn the first backfill into an hours-scale job on Voyage.

So that sounds annoying but plausible, not obviously broken.

If your main goal is conversation recall, I’d honestly consider:

  • first indexing just sessions / memory
  • then adding the docs path later if you still want it
#

Because the docs are probably a big chunk of what you’re paying for in time here.

So yeah, with extraPaths = [openclaw docs], I’m much less surprised by the runtime.

faint prairie
#

but surely it's not 50 million or 200 million tokens

amber anvilBOT
#

Yeah, I’d be pretty surprised if one month of chats + the OpenClaw docs were anywhere near 50M, let alone 200M tokens.

So I wouldn’t interpret the long runtime as “you must be chewing through hundreds of millions of tokens”.

It’s much more likely a mix of:

  • batch queueing
  • rate limits
  • chunking overhead
  • retries / polling
  • just the general slowness of remote embedding backfills

So:

  • slow does not automatically mean huge token count
  • your corpus can be pretty normal-sized and still take a while if the provider is throttling
#

That said, repeated reindexes over time are what would start to make me care about the 50M / 200M numbers.

faint prairie
#

Okay because it's late and I may need to let it go overnight haha. We're at 120k tokens right now

amber anvilBOT
#

Yeah, that sounds totally reasonable.

And 120k tokens is tiny relative to 50M / 200M, so from a usage-cost perspective that’s not scary at all.

If it’s late, I’d honestly just let it run overnight and check in the morning with:

openclaw memory status --deep

and maybe a quick:

openclaw memory search "some topic you know is in your chats"

If it finishes cleanly, you’re probably in good shape. If it dies overnight, the new logs should be much more informative.
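
For a sense of scale on that 120k number, here’s a quick back-of-the-envelope check in Python. The 50M and 200M budgets are just the figures quoted in this thread, and share_of_budget is a made-up helper name, not anything from OpenClaw or Voyage.

```python
def share_of_budget(tokens_used, budget_tokens):
    """Percentage of a token budget that has been consumed."""
    return 100.0 * tokens_used / budget_tokens


# 120k tokens indexed so far, compared against the two budgets from this thread.
for budget in (50_000_000, 200_000_000):
    print(f"{budget:,} token budget: {share_of_budget(120_000, budget):.2f}% used")
```

That works out to roughly a quarter of a percent of the smaller figure, so one backfill barely registers. Repeated full reindexes are what would add up.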

faint prairie
#

uh oh i just saw the same batch id go from "validating" to "in_progress"

amber anvilBOT
#

Yeah, that’s definitely weirder than the clean happy path.

I wouldn’t call it instantly fatal, but the same batch id going:

validating → in_progress

suggests one of:

  • Voyage’s status reporting is a bit flaky / non-monotonic
  • the job got re-queued or re-opened on their side
  • the status API is lagging / inconsistent

So my read is:

  • not ideal
  • probably provider-side weirdness
  • if it still eventually finishes, I wouldn’t panic
  • if it keeps bouncing like that and then times out, that’s much stronger evidence the Voyage batch path is flaky here

If it’s late, I’d probably let it keep going a bit, but if the same id keeps oscillating repeatedly, I’d save that batch id because that’s great bug-report evidence.
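
If you’d rather not eyeball the status lines, here’s a minimal Python sketch for spotting that kind of backwards move. It assumes the happy-path order from earlier in this thread (in_progress, validating, finalizing, done). The function name and the (batch_id, status) input shape are hypothetical, not an OpenClaw or Voyage API, so you’d have to feed it statuses you copied from the logs yourself.

```python
# Hypothetical helper: flag batch ids whose status moves backwards.
# The ordering is assumed from the happy path described in this thread.
STATUS_ORDER = {"in_progress": 0, "validating": 1, "finalizing": 2, "done": 3}


def find_regressions(events):
    """Return (batch_id, old_status, new_status) for every backwards move.

    `events` is an iterable of (batch_id, status) pairs in the order observed.
    """
    last_seen = {}
    regressions = []
    for batch_id, status in events:
        previous = last_seen.get(batch_id)
        if previous is not None and STATUS_ORDER[status] < STATUS_ORDER[previous]:
            regressions.append((batch_id, previous, status))
        last_seen[batch_id] = status
    return regressions


observed = [
    ("batch-A", "in_progress"),
    ("batch-A", "validating"),
    ("batch-A", "in_progress"),  # same id moving backwards: worth saving
    ("batch-B", "in_progress"),  # a new id starting over is normal
]
print(find_regressions(observed))
```

A new batch id starting at in_progress never shows up in the result, which matches the "new id is normal, same id is weird" rule of thumb.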

faint prairie
#

welp problem solved i accidentally CTRL-C'd to copy forgetting i was in linux terminal so it ended the process

amber anvilBOT
#

Hah, yep, that would do it 😄

That actually explains the weirdness a lot better than “Voyage invented a new batch state machine.”

I’d just rerun one of these:

openclaw memory status --deep --index

or, if you want the cleanest possible rebuild after the interruption:

openclaw memory index --force

Given the interrupted run, I’d personally lean toward --force if you’re okay letting it churn again.

#

And yeah, Linux terminal copy muscle memory gets everybody eventually. If this thread is basically solved now, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it.

faint prairie
#

i think i'll just switch to ollama so i don't have to deal with voyage potentially charging me in the future. what's the best config look like for that

#

i recall it was timing out after 120s last time

amber anvilBOT
#

Yeah, honestly, that’s the lower-drama move.

Best simple config for local Ollama memory search

.env

OLLAMA_API_KEY=ollama-local

openclaw.json

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama",
        fallback: "none",
        model: "nomic-embed-text",
        sync: {
          onSearch: false, // optional, but I recommend it
        },
      },
    },
  },
}

That’s the config I’d start with if:

  • Ollama is running locally
  • it’s on the default URL (http://127.0.0.1:11434)
  • you want Ollama only for memory embeddings
#

Only add this if Ollama is on a custom host/port

{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
      },
    },
  },
}

Important: use the native Ollama URL, not /v1.

After changing provider

Since this is a provider switch, you’ll want a fresh reindex:

#
openclaw gateway restart
openclaw memory index --force

My recommendation

I’d keep:

  • your current sources
  • your current extraPaths

and just swap the provider over to Ollama first.

If you want, I can give you one ready-to-paste full config block for:

  • sessions + docs
  • Ollama only
  • safe defaults for overnight indexing
#

Yep, good thing to keep in mind.

One subtle gotcha: switching to Ollama does not automatically mean a bigger timeout.

In the current memory code, the longer “local” timeout is for the special memorySearch.provider = "local" path, not for ollama. So with Ollama, you’re still generally relying on the normal remote-style memory timeouts.

So the bet with Ollama is not “higher timeout”, it’s:

  • no Voyage rate limits
  • no provider queueing
  • usually much faster actual response time

So if your Ollama instance is reasonably fast, it should still be a lot better. But if your local machine is really slow, yes, it could still hit timeouts.

That said, I’d still expect Ollama to be much less annoying than Voyage for this use case.

If you want the cautious path, switch to Ollama and do a small smoke test first:

#
openclaw infer embedding create \
  --model ollama/nomic-embed-text \
  --text "memory search smoke test" \
  --json

Then run the full reindex.

faint prairie
#

any memorysearch.remote settings i can use?

amber anvilBOT
#

Yep, there are a few.

memorySearch.remote settings

memorySearch: {
  remote: {
    baseUrl: "...",
    apiKey: "...",
    headers: { ... },

    batch: {
      enabled: true,
      wait: true,
      concurrency: 1,
      pollIntervalMs: 2000,
      timeoutMinutes: 10,
    },
  },
}

What they do

  • remote.baseUrl
    override the embedding endpoint base URL
    useful for Ollama on a custom host/port or custom OpenAI-compatible endpoints

  • remote.apiKey
    override the key just for memory search

  • remote.headers
    add extra HTTP headers

  • remote.batch.enabled
    enable the provider’s async batch API

  • remote.batch.wait
    wait for batch completion vs fire-and-poll behavior

  • remote.batch.concurrency
    how many batch jobs run in parallel

  • remote.batch.pollIntervalMs
    how often to poll batch status

  • remote.batch.timeoutMinutes
    how long to wait for the async batch job before timing out

#

Important caveat

For Ollama, the practically useful ones are mostly:

  • baseUrl
  • apiKey
  • headers

The batch settings are mainly relevant for providers like Voyage, Gemini, and OpenAI that expose async batch embedding support.

Also, memorySearch.remote.* does not give you a knob for the generic non-batch query timeout.

Example for Ollama

{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama",
        fallback: "none",
        model: "nomic-embed-text",
        remote: {
          baseUrl: "http://127.0.0.1:11434",
        },
      },
    },
  },
}

If you want, I can give you the exact memorySearch block I’d use for local Ollama only.

faint prairie
#

so no batch settings for nomic?

amber anvilBOT
#

Right, basically no.

For Ollama + nomic-embed-text, the memorySearch.remote.batch.* knobs are not the useful path in current OpenClaw.

Those async batch settings are for providers that expose a dedicated batch embedding flow, like Voyage, Gemini, and OpenAI.

For Ollama/nomic, the useful knobs are basically:

  • memorySearch.provider
  • memorySearch.model
  • memorySearch.remote.baseUrl
  • memorySearch.remote.apiKey
  • memorySearch.remote.headers

So if you switch to Ollama, I’d stop thinking about batch tuning and just focus on:

  • local Ollama running cleanly
  • correct baseUrl
  • whether it’s fast enough on your machine

If you want, I can give you the cleanest Ollama + nomic-embed-text config block to paste in.

faint prairie
#

how do i fix the 120s timeout? it's going to take a lot longer than that

amber anvilBOT
#

Short answer: for Ollama/nomic, I don’t think you can fix that with config today.

There are 2 different cases:

1. If you’re using a provider with async batch support

Like Voyage / OpenAI / Gemini, then yes, you can raise it with:

memorySearch: {
  remote: {
    batch: {
      enabled: true,
      timeoutMinutes: 10,
    },
  },
}
#

2. If you’re using Ollama + nomic-embed-text

That 120s limit is on the normal non-batch embedding path, and I don’t see a memorySearch.remote.* knob for changing that.

So for Ollama, your options are basically:

  • reduce the indexing scope
  • use a faster machine / faster Ollama setup
  • use the special memorySearch.provider = "local" path instead of ollama if you want the longer built-in local timeout behavior
  • or file a GitHub issue asking for configurable non-batch memory embedding timeouts: https://github.com/openclaw/openclaw/issues

So the honest answer is:

Voyage batch timeout is configurable. Ollama/nomic timeout is not, at least not from normal memorySearch config.

If you want, I can help you decide whether ollama or local is the better fit for your setup.

faint prairie
#

Can local point at ollama? Or is it something else entirely. What's the max timeout on local? It may still not be high enough

amber anvilBOT
#

local is something else entirely, it does not point at Ollama.

local vs ollama

  • ollama = talk to a running Ollama server over HTTP
  • local = OpenClaw’s built-in local embedding path, using a GGUF model via node-llama-cpp

So local is more like:

  • download / point at a GGUF embedding model
  • OpenClaw runs it directly

It is not:

  • baseUrl: http://localhost:11434
  • or “Ollama but local”

Current built-in local timeouts

From the current memory code, local gets longer built-in limits:

  • query: about 5 minutes
  • batch: about 10 minutes

That’s better than the normal remote-style 60s / 120s path.

Important caveat

I still don’t see a normal config knob to push those even higher.

So if 10 minutes for a batch still won’t cut it, then honestly the bigger issue may be:

  • corpus too large for the chosen local setup
  • model/machine too slow
  • or memory scope too broad

Practical takeaway

  • if you want Ollama, use provider: "ollama"
  • if you want the longer built-in timeout, use provider: "local"
faint prairie
#

This is certainly a pickle. Memory search is a core feature for agents to use, but I can't seem to get it to be available for truly free. It'd be one thing if i could use the same indexes for different providers but you can't. Getting a better machine is not an option and shrinking the memory scope defeats the purpose

amber anvilBOT
#

Yeah, I think you’re hitting a real constraint, not missing some secret config.

Today, I don’t think there’s a magic combo that gives you all 4 at once:

  • semantic memory search
  • large scope
  • truly free
  • no decent local hardware

That’s the frustrating tradeoff.

So the realistic options are basically:

  1. Remote embeddings
    good semantic recall, but potential cost / rate limits

  2. Local embeddings (Ollama or local GGUF)
    free-ish, but performance/hardware constrained

  3. FTS-only memory
    truly free, same broad scope, but keyword/lexical instead of semantic/vector

Honestly, if “free + big scope” matters more than perfect semantic recall, the interesting compromise might be FTS-only. It’s weaker, but it avoids both the provider-cost problem and the hardware problem.

So my blunt take is:

there is no perfect free semantic-memory path here right now, but there may be a usable free FTS-only path.

If you want, I can help you test whether FTS-only memory search is good enough for your setup. That might actually be the least painful answer.

faint prairie
#

Help me get local setup step by step tomorrow, please, (and uninstall ollama lol)
Failing that help me use fts

amber anvilBOT
#

Yep, absolutely, I can help you do that tomorrow.

And honestly, I wouldn’t uninstall Ollama yet. I’d just stop using it for memory search for now. It’s easier to compare paths if it’s still there, and uninstalling adds more chaos than value tonight.

Tomorrow’s plan

Plan A, get memorySearch.provider = "local" working

We’ll do it step by step:

  1. confirm the install/runtime state
  2. check whether the local embedding runtime is actually available
  3. switch memory search to local
  4. run a clean reindex
  5. test recall

Plan B, if local is still too painful

We pivot to FTS-only and make sure keyword memory search is at least usable.

#

When you’re back tomorrow

Send me these:

openclaw --version
openclaw status
openclaw memory status --deep

And tell me:

  • are you running OpenClaw via npm or pnpm
  • what OS you’re on
  • whether you want to keep the docs path indexed

Then I’ll walk you through the local setup first, and if it’s still cursed, we’ll switch to FTS.

#

Get some sleep 😄 this is definitely tomorrow-problem territory.

faint prairie
#

okay let's set up local memory search. 2026.4.15, debian 13, yes keep the docs path, npm

amber anvilBOT
#

Yep. Let’s do the smallest safe local setup first and keep your docs path.

Docs I’m basing this on:
https://docs.openclaw.ai/reference/memory-config
https://docs.openclaw.ai/concepts/memory-search
https://docs.openclaw.ai/start/faq

Step 1, switch memory search to local

openclaw config set agents.defaults.memorySearch.provider local
openclaw config set agents.defaults.memorySearch.fallback none

You do not need to set local.modelPath yet.
If unset, OpenClaw should use the default local embedding model:

  • embeddinggemma-300m-qat-Q8_0.gguf
  • about 0.6 GB
  • auto-downloaded on first use

Step 2, restart the gateway

openclaw gateway restart

Step 3, check whether local embeddings are actually ready

openclaw memory status --deep

What I want to see is basically:

  • provider = local
  • no node-llama-cpp missing error
  • no “local embeddings unavailable” error

Step 4, if that looks good, start a clean reindex

openclaw memory index --force

Since you’re changing providers, a fresh rebuild is the right move.

Step 5, test it

After indexing, try a couple searches:

openclaw memory search "gateway token field in control ui"
openclaw memory search "voyage timeout 120s"
#

Important note

The first local run may spend a while downloading/preparing the GGUF model, so don’t panic if it doesn’t instantly jump into indexing.

If Step 3 fails

Paste the output of:

openclaw memory status --deep

That will tell us whether the problem is:

  • missing node-llama-cpp
  • model download/setup
  • or something else runtime-related

Start with Steps 1 to 3 and paste the status output before you kick off a huge reindex. That’ll save you pain.

faint prairie
#

does it auto download on gateway restart or on index or on memory status?

amber anvilBOT
#

Not on gateway restart.

It should happen on the first actual embedding call, so in practice usually one of these:

  • openclaw memory status --deep
  • openclaw memory index --force
  • a memory search that needs embeddings

So the safest expectation is:

  • restart: no download yet
  • plain openclaw memory status: probably no download
  • deep status / index / real embed probe: yes, that’s when it should kick off

That’s why I wanted you to do openclaw memory status --deep before a huge reindex, it’s a nice smaller “does local actually wake up?” test.

faint prairie
#

so what's the full config section for memorySearch look like, including any ideal remote.batch or other settings?

amber anvilBOT
#

For local, I’d keep it pretty clean. The big thing is: don’t bother adding remote.batch here, it won’t help the local provider.

Recommended memorySearch block for your case

{
  agents: {
    defaults: {
      memorySearch: {
        enabled: true,

        // Force fully local embeddings
        provider: "local",
        fallback: "none",

        // Keep transcript recall
        experimental: {
          sessionMemory: true,
        },
        sources: ["memory", "sessions"],

        // Keep your docs path
        extraPaths: [
          "/home/YOUR_USER/openclaw-ref/docs", // or whatever docs path you're already using
        ],

        // Easiest starting point: use the default auto-downloaded GGUF
        local: {
          // modelPath: "hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf",
          // modelCacheDir: "/home/YOUR_USER/.cache/node-llama-cpp",
        },

        // Helps avoid re-embedding unchanged content later
        cache: {
          enabled: true,
          maxEntries: 50000,
        },

        // Good default ranking setup
        query: {
          hybrid: {
            enabled: true,
            vectorWeight: 0.7,
            textWeight: 0.3,
            mmr: {
              enabled: true,
              lambda: 0.7,
            },
            temporalDecay: {
              enabled: true,
              halfLifeDays: 30,
            },
          },
        },

        // Avoid surprise “search triggered a giant sync” latency
        sync: {
          onSearch: false,
          sessions: {
            deltaBytes: 100000,
            deltaMessages: 50,
          },
        },
      },
    },
  },
}

Notes

  • remote.batch: skip it for local
  • fallback: "none": avoids quietly jumping to a paid provider
  • cache.enabled: true: worth it
  • onSearch: false: I recommend this, especially while you’re stabilizing things
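
If it helps to picture what the hybrid weights and half-life in that block mean, here’s a rough Python sketch. It is an assumption about how weights like these commonly combine (a weighted blend, then an exponential age decay), not OpenClaw’s actual ranking code, and it leaves MMR out entirely. hybrid_score is a hypothetical name.

```python
def hybrid_score(vector_score, text_score, age_days,
                 vector_weight=0.7, text_weight=0.3, half_life_days=30):
    """Blend semantic and keyword scores, then decay the result by age.

    Assumed formula for illustration only:
      blended = vector_weight * vector_score + text_weight * text_score
      decay   = 0.5 ** (age_days / half_life_days)
    """
    blended = vector_weight * vector_score + text_weight * text_score
    decay = 0.5 ** (age_days / half_life_days)
    return blended * decay


# Same match quality, different ages: a 30-day-old hit scores half of a fresh one.
for age in (0, 30, 60):
    print(age, round(hybrid_score(0.8, 0.4, age), 4))
```

Under that assumption, a higher vectorWeight favors semantic matches, and a shorter halfLifeDays pushes older chats down the list faster.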

After saving config

#
openclaw gateway restart
openclaw memory status --deep

If that looks healthy, then:

openclaw memory index --force

If you ever go back to a remote provider

Then the only batch block I’d use is:

remote: {
  batch: {
    enabled: true,
    wait: true,
    concurrency: 1,
    pollIntervalMs: 2000,
    timeoutMinutes: 30,
  },
}

But again, that is for Voyage/OpenAI/Gemini, not local.

If you want, paste your current memorySearch block and I’ll rewrite it into the exact final version for your setup.

faint prairie
#

memory status --deep showed failure, local embeddings unavailable, optional dependency node-llama-cpp missing or failed to install

amber anvilBOT
#

Okay, that’s actually useful. Your config is no longer the problem. The issue is just that the local runtime dependency never installed cleanly.

Since you’re on npm, I’d do this next:

1. Check Node version

node -v

For local embeddings, I’d really prefer:

  • Node 24 for installs/updates
  • or Node 22.14+

2. Check whether npm is skipping install scripts

#
npm config get ignore-scripts

If that says true, that’s likely the culprit.

3. Reinstall OpenClaw globally

npm install -g openclaw@latest

4. Restart gateway

openclaw gateway restart

5. Re-test local embeddings

#
openclaw memory status --deep

What I want you to paste back

Please paste:

  • node -v
  • npm config get ignore-scripts
  • and the result of the fresh openclaw memory status --deep

If reinstall still fails, paste the npm install -g openclaw@latest error output too.

That’ll tell us whether this is:

  • wrong Node version
  • npm scripts being skipped
  • or a native build failure during install

At least we’ve got the problem narrowed down now 🙂

faint prairie
#

i'm on node 24 and ignore-scripts is false

#

will the reinstall hurt my current install at all?

amber anvilBOT
#

Good, that rules out the two easy culprits.

So the next most likely issue is simply: your current global OpenClaw install does not have node-llama-cpp present.

Next step

Reinstall OpenClaw globally:

npm install -g openclaw@latest

Then:

openclaw gateway restart
openclaw memory status --deep
#

If it still says node-llama-cpp missing

Run this too:

npm ls -g openclaw node-llama-cpp --depth=0

and paste the output.

That’ll tell us whether:

  • node-llama-cpp was actually installed
  • or OpenClaw is there but the optional dep still isn’t landing

So yep, at this point I’d do the reinstall first.

#

It shouldn’t hurt your actual data/config, no.

A normal global reinstall should not wipe:

  • your openclaw.json
  • your workspace
  • your session/memory data in ~/.openclaw

What it does affect is basically:

  • the global CLI package files
  • bundled/optional dependencies like node-llama-cpp

So the normal risk is more like:

  • brief CLI / gateway downtime
  • install fails and you need to rerun it
  • version/package contents change

Not usually:

  • “all my OpenClaw state is gone”

If you want the belt-and-suspenders version, back up your config first, but I would not expect reinstalling openclaw@latest to nuke your current setup.

wooden heathBOT
faint prairie
#

where should it install global npm packages by default

amber anvilBOT
#

It depends on your npm global prefix.

On Linux, the common defaults are:

  • system Node → usually
    /usr/local/lib/node_modules
    with binaries in
    /usr/local/bin

  • nvm-managed Node → usually something like
    ~/.nvm/versions/node/v24.x.x/lib/node_modules
    with binaries in
    ~/.nvm/versions/node/v24.x.x/bin

Best way to check your actual machine:

#
npm prefix -g
npm root -g
which openclaw

Those will tell you:

  • the global prefix
  • the global package dir
  • which openclaw binary you’re actually using

If you paste those, I can sanity-check whether your install path looks right.

faint prairie
#

okay so i did everything you said and it's the same error. cannot find package node-llama-cpp imported from (my global npm folder)/openclaw/dist/engine-embeddings-bkxxxxx.js (file does exist)

amber anvilBOT
#

Ah, I found the real reason, and I need to correct my earlier advice.

node-llama-cpp is currently an optional peer dependency, not something openclaw necessarily installs for you automatically. So reinstalling OpenClaw alone can leave you in exactly the same broken state.

So the fix is to install it explicitly into the same global npm prefix:

npm install -g node-llama-cpp@3.18.1

Then:

openclaw gateway restart
openclaw memory status --deep
#

Why this happens

OpenClaw is trying to import:

  • node-llama-cpp

from inside its global install, but that package is not actually present in the global module path.

If you want to verify after install

npm ls -g openclaw node-llama-cpp --depth=0
npm root -g

You should see both:

  • openclaw
  • node-llama-cpp

in the same global tree.

So yeah, the missing piece is not “reinstall OpenClaw again”, it’s:

install node-llama-cpp explicitly

That should get us a lot closer.
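
If you’d like a scriptable version of that check, here’s a small Python sketch. It only relies on npm root -g (a real npm command) and on the assumption that each global package shows up as a directory under that root. The function names are made up for illustration.

```python
import os
import subprocess


def missing_global_packages(global_root, names=("openclaw", "node-llama-cpp")):
    """Return the package names that have no directory under the npm global root."""
    return [name for name in names
            if not os.path.isdir(os.path.join(global_root, name))]


def npm_global_root():
    """Ask npm where global packages live (same as running `npm root -g`)."""
    result = subprocess.run(
        ["npm", "root", "-g"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling missing_global_packages(npm_global_root()) should come back as an empty list once both packages sit in the same global tree. If node-llama-cpp is still listed, the explicit install did not land where openclaw is looking.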

faint prairie
#

does it have to be that version or can it be latest

amber anvilBOT
#

I’d use the exact version first.

OpenClaw currently declares:

  • node-llama-cpp: 3.18.1

as the peer dependency, so the safest move is:

npm install -g node-llama-cpp@3.18.1

Could latest work? Maybe. But if latest has API/packaging changes, you’ll just introduce another variable.

So my recommendation is:

  • for debugging / first success → use 3.18.1
  • once it works, you can experiment later if you really want

I wouldn’t use latest for the first fix.

faint prairie
#

my npm installs complain about an unsupported npmrc setting, "onlyBuiltDependencies[]=openclaw" is that important?

amber anvilBOT
#

Nope, not important for npm.

onlyBuiltDependencies[]=openclaw is basically a pnpm-specific setting. npm doesn’t support it, so it just complains and ignores it.

So for your case:

  • annoying warning: yes
  • cause of the node-llama-cpp missing error: no

The thing that matters is still whether node-llama-cpp actually ends up installed in the same global npm tree as openclaw.

So I’d treat that npmrc warning as noise, not the root problem.

faint prairie
#

okay memory --deep is downloading stuff now

amber anvilBOT
#

Nice, that’s actually the first really good sign 🎉

That means:

  • node-llama-cpp is being found now
  • the local provider is finally waking up
  • it’s likely pulling the default GGUF model / runtime bits

So at this point I’d just let it finish and not interrupt it.

Best case, after the download completes, openclaw memory status --deep comes back healthy.
Then the next step is:

openclaw memory index --force
#

If memory status --deep still errors after the download, paste the new error, but this is definitely progress.

faint prairie
#

oh it did say additional memory path missing. it looks like it's prepending ~/.openclaw/workspace to the string in that setting automatically, so i'd just need "docs" instead of ~/.openclaw/workspace/docs

amber anvilBOT
#

Yep, that makes sense, and good catch.

In config, extraPaths supports:

  • absolute paths
  • workspace-relative paths

But ~ is not shell-expanded there.

So:

  • docs → resolved as ~/.openclaw/workspace/docs if that’s your workspace
  • ~/.openclaw/workspace/docs → gets treated like a relative string and ends up wrong
  • /home/youruser/.openclaw/workspace/docs → true absolute path, also fine

So yes, if the docs are inside your workspace, just using:

#
extraPaths: ["docs"]

is the clean fix.

And since changing extraPaths changes indexing scope, expect a full reindex after that.
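
Here’s a tiny Python sketch of that resolution rule, in case it’s easier to see in code. It just mimics the behavior described above (absolute paths pass through, anything else gets joined onto the workspace, and ~ is not expanded). It is not OpenClaw’s actual implementation, and resolve_extra_path plus the youruser path are placeholders.

```python
import posixpath


def resolve_extra_path(workspace_dir, entry):
    """Mimic the described rule: absolute entries pass through unchanged,
    everything else is joined onto the workspace. "~" is NOT expanded."""
    if posixpath.isabs(entry):
        return posixpath.normpath(entry)
    return posixpath.normpath(posixpath.join(workspace_dir, entry))


workspace = "/home/youruser/.openclaw/workspace"  # placeholder workspace path
entries = [
    "docs",
    "~/.openclaw/workspace/docs",
    "/home/youruser/.openclaw/workspace/docs",
]
for entry in entries:
    print(entry, "->", resolve_extra_path(workspace, entry))
```

The middle entry ends up as a path with a literal ~ directory inside the workspace, which is exactly the kind of "additional memory path missing" result you saw.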

faint prairie
#

anything you want to see from memory status --deep before i kick off index --force?

amber anvilBOT
#

Yep, just a few things.

Before index --force, I’d want memory status --deep to show no obvious red flags around:

  • provider = local
  • embeddings probe = ok / ready
  • vector search available
  • no node-llama-cpp error
  • no “additional memory path missing” warning
  • your docs path resolving the way you expect

If those all look clean, I’d go ahead with:

openclaw memory index --force
#

If you want, paste the memory status --deep output and I’ll do a quick sanity check before you kick it off.

faint prairie
#

embeddings probe did mention that the prebuilt binary for platform linux x64 with vulkan support is not compatible with the current system, falling back to using no GPU.
but i think that's just info. everything else looks clean

amber anvilBOT
#

Yep, that sounds like informational fallback, not a blocker.

It basically means:

  • the Vulkan-flavored prebuilt didn’t fit your system
  • node-llama-cpp fell back to CPU
  • local embeddings should still work, just slower

So if everything else looks clean, I’d go ahead with:

openclaw memory index --force

That warning only becomes important if indexing is painfully slow, but it shouldn’t stop the run.

faint prairie
#

this is a 14-year old thinkpad with 8gb of ram, i think you'd be surprised how slow it might be

amber anvilBOT
#

Hah, okay, fair, I retract my optimism 😄

On a 14-year-old ThinkPad with 8 GB RAM, CPU-only local embeddings could absolutely be glacial.

So I’d frame this run as a viability test, not “this will definitely be pleasant”.

My honest expectation:

  • it might work
  • it may take quite a while
  • and if the machine starts suffering, FTS-only may still end up being the saner long-term answer

I’d still let index --force try, but with very modest expectations. If it starts looking cursed, we can pivot fast instead of pretending this laptop is a datacenter.

faint prairie
#

the jet engine has spun up

amber anvilBOT
#

Yep, that’s the authentic local-AI startup sound 😄

At this point I’d say:

  • let it cook
  • avoid doing other heavy stuff on the machine
  • if it actually finishes, that ThinkPad earns a medal

If it stalls, OOMs, or takes forever, we’ll call it and pivot to FTS-only without shame.

faint prairie
#

my only concern is some baked in timeout. eta is only 25 minutes but didn't you mention 5 or 10 minute timeout the other day?

amber anvilBOT
#

Yep, but the important distinction is:

  • 25 minutes total index time can be totally fine
  • the 5 / 10 minute timeout is for an individual embedding operation / batch, not the entire end-to-end index run

So if indexing is marching along chunk by chunk, a 25-minute overall ETA is not automatically a problem.

The timeout only bites if something like:

  • one query embed takes over ~5 minutes
  • or one batch embed step takes over ~10 minutes

So I’d watch for:

  • steady progress = probably okay
  • one step hanging forever = that’s the danger zone
#

In other words, long overall runtime is okay, stuck single operation is not.
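
To make the per-step vs whole-run distinction concrete, here’s a minimal Python sketch. The 10-minute limit is the rough number from this thread rather than a value read out of OpenClaw, and the step durations are invented for illustration.

```python
# Assumed per-step limit, taken from the rough numbers in this thread.
BATCH_TIMEOUT_S = 10 * 60


def run_is_safe(batch_step_seconds, limit_s=BATCH_TIMEOUT_S):
    """True if no single batch step exceeds the per-step limit.

    Total runtime is deliberately ignored: only individual steps can time out.
    """
    return all(step <= limit_s for step in batch_step_seconds)


# 50 steps of 30 s each: 25 minutes overall, but every step is far below the limit.
steady = [30] * 50
# A much shorter run overall, but one step hangs for 11 minutes.
stuck = [30, 30, 11 * 60]

print(sum(steady) / 60, "min total, safe:", run_is_safe(steady))
print(sum(stuck) / 60, "min total, safe:", run_is_safe(stuck))
```

So a long ETA on its own says nothing about timeouts. What matters is whether any one step sits still for longer than the limit.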

faint prairie
#

i mean ollama was like 300 minutes so this is way better, although it IS going up a bit now

amber anvilBOT
#

Oh, then yeah, this is much more promising.

If local is showing tens of minutes instead of 300 minutes, that’s a huge improvement.

And the ETA creeping upward a bit is pretty normal, especially when:

  • the early sample was too optimistic
  • chunk sizes vary
  • the machine starts heat/throttle juggling on CPU

So I’d read this as:

  • not fast
  • but plausibly usable
  • and definitely better than the Ollama path on this machine

Honestly, I’m encouraged by that.

faint prairie
#

i see it says indexing memory files 7/52, but memory status showed "indexed 30/211 files". so 52 vs 211? is that actual memory files vs session files? does that mean node-llama won't be doing session files?

amber anvilBOT
#

Yep, that’s almost certainly what’s happening.

  • 52 = the current memory-file phase
  • 211 = the broader total from memory status, likely memory + sessions combined

So no, that does not mean node-llama-cpp won’t handle session files.

More likely the flow is:

  1. Indexing memory files (7/52)
  2. then later Indexing session files
  3. and memory status is showing the bigger combined picture

So the key thing to watch is whether it later switches to a label like:

  • Indexing session files…
#

If it does, that confirms sessions are part of the run too.

If you have sources: ["memory", "sessions"], local embeddings should apply to both.