#memory_search provider and fallback(s)
Version 2026.4.15
First, what does openclaw --version show?
On current builds, no, memory search only supports a single memorySearch.fallback string, not multiple ordered fallbacks.
Also, provider: "auto" uses a fixed internal order, not a configurable one:
local → github-copilot → openai → gemini → voyage → mistral → bedrock
(ollama is supported, but not auto-detected.)
So if you do not want Gemini or any other paid key to get picked up, the safest fix is to stop using auto and pin the provider explicitly, for example:
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "local", // or "ollama", "openai", etc.
        fallback: "none",
      },
    },
  },
}
If you want exactly one fallback, that works too, for example provider: "local" plus fallback: "openai", but there is no multi-fallback chain.
Why Gemini is showing up: memory search reuses provider auth resolution, so if GEMINI_API_KEY or models.providers.google.apiKey exists, Gemini counts as available to auto-selection.
Docs: https://docs.openclaw.ai/reference/memory-config
If you paste your current memorySearch block plus openclaw --version, I can suggest the exact safest config.
Nice, thanks, 2026.4.15 matches what I checked, so the same answer applies on your version.
There is no multi-fallback list for memory search, and the auto order is fixed. If you want to prevent Gemini from being used, set an explicit provider and disable fallback, for example:
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama", // or "local", "openai", etc.
        fallback: "none",
      },
    },
  },
}
If you want one fallback only, you can do:
memorySearch: {
  provider: "local",
  fallback: "openai",
}
But there is no ["local", "openai", "voyage"] style ordered chain.
Also, auto will treat a configured Gemini key as available, so if models.providers.google.apiKey or GEMINI_API_KEY is present, it can pick Gemini.
Useful reference: https://docs.openclaw.ai/reference/memory-config
If you want, paste your current memorySearch block and I'll rewrite it into the exact safe config for your setup.
which of the non-local providers have free tiers, and which ones do people typically use?
Quick snapshot from the current docs I checked, this stuff changes fast:
- Gemini: yes, Google still shows a free tier for embeddings (gemini-embedding-001), plus paid tiers.
  https://ai.google.dev/gemini-api/docs/pricing
- Voyage: yes, and honestly this looks like the nicest non-local free tier right now. Their pricing page says the first 200M tokens are free on the main voyage-4* models, and 50M on some others.
  https://docs.voyageai.com/docs/pricing
- Mistral: yes, they do have a free API tier, but their docs frame it as evaluation/prototyping.
  https://docs.mistral.ai/deployment/ai-studio/tier
- OpenAI: I would treat it as paid, not something to rely on for a stable free-tier memory setup.
- Bedrock: effectively paid/AWS, not a free-tier play.
- GitHub Copilot: Copilot Free exists, but I would not call it the simplest/most predictable choice for OpenClaw memory search unless you already use Copilot heavily.
For "what do people typically use?", my rough impression is:
- OpenAI, if they just want the boring reliable default
- Voyage, if they care specifically about embedding quality/cost
- Gemini, if they already have Google set up
- Bedrock, mostly AWS-heavy setups
- Mistral/Copilot, more niche for this use case
So if your goal is cheap non-local memory search, I'd personally look at Voyage first, or Gemini pinned explicitly if you're okay with Google. I would not leave it on auto.
Also, since you asked about what people actually use in practice, it's worth asking in #users-helping-users too, that'll get you real hands-on reports from other users.
If you want, I can give you 3 ready-to-paste configs:
- cheapest decent
- best free-ish option
- most reliable paid setup
I also noticed the default fallback chain doesn't work when the failure reason is rate limit. i.e. my gemini has no funds on it
Yep, that lines up with how it's implemented right now.
provider: "auto" is not a general failover chain. It's more like provider discovery/selection. For Gemini specifically, auto-selection only keeps going on setup-style errors like missing auth, not on quota/rate-limit style failures.
So in practice:
- missing key / unavailable setup → auto may continue to the next provider
- 429 / quota exhausted / no funds → it usually does not continue down the auto order
That's why "Gemini has no funds" doesn't behave like "Gemini is unavailable".
Also, the explicit memorySearch.fallback is only one fallback, not a multi-hop chain.
So the safe advice today is:
- Don't rely on auto for quota/budget failover
- Pin the primary provider explicitly
- Optionally set one explicit fallback
Example:
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "voyage",
        fallback: "openai", // or "none"
      },
    },
  },
}
If you want OpenClaw to treat quota/rate-limit errors as "try the next provider", that sounds like a legit feature request / bug report for GitHub: https://github.com/openclaw/openclaw/issues
If you want, I can suggest a couple of concrete provider pairings depending on whether you want cheapest, best quality, or most predictable behavior.
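To make the difference between "setup-style" and "quota-style" failures concrete, here is a minimal Python sketch of the selection behavior described above. It is a toy model, not OpenClaw's code: SetupError, QuotaError, auto_select, and example_probe are hypothetical names, and only the provider order comes from this thread.

```python
# Hypothetical model of provider auto-selection; not OpenClaw source code.
# The order is the fixed auto order quoted earlier in this thread.
AUTO_ORDER = ["local", "github-copilot", "openai", "gemini", "voyage", "mistral", "bedrock"]


class SetupError(Exception):
    """Assumed error class: provider is not configured (e.g. missing key)."""


class QuotaError(Exception):
    """Assumed error class: provider is configured but returns 429 / no funds."""


def auto_select(probe):
    """Return the first provider in AUTO_ORDER whose probe succeeds.

    Setup-style failures skip to the next provider. Any other failure,
    including QuotaError, propagates and ends the selection.
    """
    for name in AUTO_ORDER:
        try:
            probe(name)
            return name
        except SetupError:
            continue
    return None


def example_probe(name):
    # Assumed environment: only a Gemini key exists, and it has no funds.
    if name == "gemini":
        raise QuotaError("429: quota exhausted")
    raise SetupError(f"{name}: no credentials")


try:
    auto_select(example_probe)
except QuotaError as err:
    print(f"selection stopped at gemini: {err}")
```

In this model, voyage is never tried even though it comes after gemini in the order, which is the behavior reported in the thread.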
Most straight forward way to install ollama as the fallback?
Easiest path is local-only Ollama as the explicit fallback, not auto.
I'd do this:
- Install Ollama
  https://ollama.com/download
- Make sure it's running
  Usually the app does this, otherwise: ollama serve
- Set the placeholder key OpenClaw expects for local Ollama
  export OLLAMA_API_KEY="ollama-local"
- Set memory search to use one real primary + Ollama fallback
  {
    agents: {
      defaults: {
        memorySearch: {
          provider: "voyage", // or openai/gemini/mistral
          fallback: "ollama",
        },
      },
    },
  }
That's enough for the common localhost case. OpenClaw's Ollama memory provider uses nomic-embed-text by default, and the docs say it will auto-pull the embedding model if it isn't present yet.
Only if Ollama is on another host
Add an explicit provider entry, using the native Ollama URL, not /v1:
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://127.0.0.1:11434",
      },
    },
  },
}
Important: use http://host:11434, no /v1.
Verify
openclaw memory status --deep
Since you're trying to survive quota/rate-limit issues, I'd definitely use explicit provider + fallback: "ollama" here, not auto.
If you want, I can give you a ready-to-paste config for Voyage → Ollama or OpenAI → Ollama.
what happens if my main provider and fallback both fail
Then it stops there, there isn't a third hop.
For an explicit setup like:
memorySearch: {
  provider: "voyage",
  fallback: "ollama",
}
OpenClaw will:
- try the main provider
- try the single fallback
- if the fallback also fails, throw a combined error like:
<primary failure>
Fallback to ollama failed: <fallback failure>
So, practically:
- no extra fallback chain
- the memory/search operation for that attempt fails
- you get both failure reasons in the error
That behavior is coming from the memory embedding fallback code path directly.
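The primary → single fallback → combined error flow above can be sketched in a few lines of Python. This is only an illustration under assumptions: embed_with_fallback and the two failing provider functions are hypothetical names, and only the shape of the combined error message comes from this thread.

```python
def embed_with_fallback(text, primary, fallback=None, fallback_name="none"):
    """Try the primary embedder, then at most one fallback (no third hop)."""
    try:
        return primary(text)
    except Exception as primary_err:
        if fallback is None:
            # fallback: "none" -> the primary failure is the final error
            raise
        try:
            return fallback(text)
        except Exception as fallback_err:
            # Both failed: report both reasons in one combined error.
            raise RuntimeError(
                f"{primary_err}\nFallback to {fallback_name} failed: {fallback_err}"
            ) from fallback_err


def failing_voyage(text):
    # Hypothetical primary that is rate limited
    raise RuntimeError("voyage: rate limited")


def failing_ollama(text):
    # Hypothetical fallback that is not running
    raise RuntimeError("ollama: connection refused")


try:
    embed_with_fallback("hello", failing_voyage, failing_ollama, "ollama")
except RuntimeError as err:
    print(err)
```

The point of the sketch is that the chain has exactly two links, so a caller that needs more resilience has to handle the combined error itself.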
If this helped, please right-click the solution message → Apps → Mark Solution so Answer Overflow can index it for future users.
For the semantic/vector part, yes, basically.
If your configured primary and single fallback both fail, OpenClaw does not have another provider to continue with, so the embedding-based memory path is effectively down until one of them works again.
Small nuance, there is an FTS-only degraded mode in the memory code, so plain keyword/text lookup can still exist in some cases, but that is not the same thing as working semantic memory search.
So the practical answer is:
- semantic memory search: no, not usable
- possible degraded keyword-only behavior: maybe, depending on state/indexing
- automatic third fallback: no
If you want robust behavior today, I'd treat it as:
"one primary, one fallback, after that memory search is degraded/unavailable."
Glad we caught that edge case.
Yep, for Voyage memory embeddings the env var is VOYAGE_API_KEY.
The config-file equivalent is:
models: {
  providers: {
    voyage: {
      apiKey: "VOYAGE_API_KEY",
    },
  },
}
And if you want Voyage as the memory provider:
agents: {
  defaults: {
    memorySearch: {
      provider: "voyage",
    },
  },
}
do i need a placeholder for ollama key even though there isn't one?
Not a real key, no, local Ollama itself doesn't need one.
But for OpenClaw, I'd still set the placeholder:
export OLLAMA_API_KEY="ollama-local"
That's the recommended low-friction setup in the Ollama docs for local-only / host-backed use. It helps OpenClaw's provider setup/discovery behave predictably, even though Ollama isn't actually checking the token.
So the short version is:
- real secret required? no
- placeholder recommended for OpenClaw? yes, ollama-local
If you fully configure Ollama explicitly, you might get away without it, but I'd still use the placeholder to avoid weird edge cases.
once i have those in my .env i don't need any other openclaw.json changes other than memorysearch.provider and .fallback?
Yep, assuming:
- VOYAGE_API_KEY is in your env
- OLLAMA_API_KEY=ollama-local is in your env
- Ollama is local on the default URL (http://127.0.0.1:11434)
then usually the only config you need is under agents.defaults.memorySearch, for example:
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "voyage",
        fallback: "ollama",
      },
    },
  },
}
You only need extra config if one of these is true:
- Ollama is on another host/port
  then set models.providers.ollama.baseUrl
- you want a non-default Ollama embedding model
  then set memorySearch.model
- you want custom base URLs / headers / explicit keys
  then use the provider config or memorySearch.remote.*
After changing .env, I'd restart the gateway and verify with:
openclaw memory status --deep
can the agent test memory_search with a specific provider so we can make sure ollama is working as expected?
Yes, but there are really 2 different tests:
1. Raw Ollama embeddings smoke test
This checks that Ollama can actually create embeddings:
openclaw infer embedding create \
--model ollama/nomic-embed-text \
--text "memory search smoke test" \
--json
If that works, Ollama embeddings themselves are fine.
2. Real memorySearch test with Ollama
I don't see a one-off memory status --provider ollama override, so the clean way is to temporarily pin memorySearch to Ollama:
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "ollama",
        fallback: "none",
      },
    },
  },
}
Then restart the gateway and run:
openclaw memory status --deep
That should show Ollama as the requested/active provider, and whether embeddings are ready.
So my recommendation is:
- use infer embedding create for a quick provider smoke test
- use temporary memorySearch.provider = "ollama" + openclaw memory status --deep for the real end-to-end memory test
If you want, I can give you a tiny "test Ollama, then switch back to Voyage → Ollama fallback" workflow.
running into issues with voyage, what are some good test commands i can run
Yep, I'd test Voyage in this order:
1. Basic health
openclaw status
openclaw models status
2. Check embedding providers directly
openclaw infer embedding providers --json
3. Raw Voyage embedding smoke test
openclaw infer embedding create \
--provider voyage \
--text "memory search smoke test" \
--json
That's the cleanest "can Voyage make embeddings at all?" test.
4. Memory-search specific readiness
If you want to test memory search itself, temporarily isolate Voyage so it can't silently fall through to Ollama:
memorySearch: {
  provider: "voyage",
  fallback: "none",
}
Then run:
openclaw memory status --deep
5. If it still fails, tail logs
openclaw logs --follow
My recommendation
If you're debugging Voyage specifically, do both:
- infer embedding create --provider voyage ...
- memory status --deep
That tells you whether the problem is:
- Voyage auth/provider setup
- or the memorySearch wiring on top of it
If you paste the output from these 3, I can probably pinpoint it fast:
openclaw models status
openclaw infer embedding create --provider voyage --text "memory search smoke test" --json
openclaw memory status --deep
inferring voyage gave me a response in about 30 seconds and completed in about 2 minutes, with an "ok: true"
but my claw was saying that memory_search was timing out
Yeah, that actually makes sense.
openclaw infer embedding create --provider voyage ... only proves Voyage can return an embedding eventually.
memory_search is tougher because it may also need to:
- sync/reindex files first
- chunk content into many embedding calls
- do query embeddings under a stricter timeout
In the current memory code, the built-in remote timeouts are roughly:
- query embedding: 60s
- batch embedding: 120s
So if Voyage is taking ~30s to first response and ~2 minutes end-to-end, it's very plausible that the live memory path hits those timeouts.
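As a quick sanity check on those numbers, here is a tiny Python sketch comparing the latencies reported in this thread with the approximate built-in timeouts quoted above. The constant and function names are made up for illustration, and the timeouts are the rough values from this message, not verified constants.

```python
# Approximate built-in remote timeouts as quoted in this thread (assumed values).
QUERY_TIMEOUT_S = 60
BATCH_TIMEOUT_S = 120


def headroom_seconds(observed_s, timeout_s):
    """Seconds left before the timeout; zero or negative means no slack."""
    return timeout_s - observed_s


first_response_s = 30      # "a response in about 30 seconds"
observed_total_s = 2 * 60  # "completed in about 2 minutes"

print(headroom_seconds(first_response_s, QUERY_TIMEOUT_S))  # slack at first response
print(headroom_seconds(observed_total_s, QUERY_TIMEOUT_S))  # slack for a full query embedding
print(headroom_seconds(observed_total_s, BATCH_TIMEOUT_S))  # slack for a batch
```

A call that needs the full two minutes has no slack against either limit, which matches the "timed out" symptom even though the one-off infer test eventually returned ok.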
Good next tests
1. See whether it is query timeout or indexing timeout
openclaw memory status --deep --index --verbose
2. Force indexing outside the live chat path
openclaw memory index --force
Then try memory search again after the index is clean.
3. If indexing is the slow part, enable Voyage batch mode
Voyage supports memory batch embedding:
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "voyage",
        remote: {
          batch: {
            enabled: true,
            timeoutMinutes: 10,
          },
        },
      },
    },
  },
}
4. If search is getting stuck trying to sync on every search
Temporarily disable that:
memorySearch: {
  sync: {
    onSearch: false,
  },
}
My guess
The likely issue is memory indexing/sync latency, not "Voyage is broken".
If you paste the output of:
openclaw memory status --deep --index --verbose
I can help tell whether it's:
- query timeout
- batch timeout
- sync/reindex doing too much work on search
- shows several "embeddings: batch start" and several "embeddings rate limited; retrying in x", and finalized with "memory index failed: fetch failed | other side closed"
Yep, that's a really useful signal.
What it says to me is:
- Voyage auth is probably fine
- the failure is happening during bulk indexing
- you're hitting rate limits, then the upstream connection is getting dropped (fetch failed | other side closed)
So this looks much more like throughput/backfill pressure than "Voyage is broken".
What I'd try next
1. Stop doing sync work during live searches
memorySearch: {
  sync: {
    onSearch: false,
  },
}
2. Turn on Voyage batch mode, and make it gentler
memorySearch: {
  provider: "voyage",
  fallback: "ollama",
  remote: {
    batch: {
      enabled: true,
      concurrency: 1,
      timeoutMinutes: 10,
    },
  },
  sync: {
    onSearch: false,
  },
}
Why:
- enabled: true uses the batch path Voyage supports
- concurrency: 1 reduces burst/rate-limit pressure
- timeoutMinutes: 10 gives the batch job room to finish
3. Rebuild outside chat
openclaw gateway restart
openclaw memory index --force --verbose
openclaw memory status --deep
My read
Your issue is probably:
Voyage works for one-off embeddings, but memory indexing is generating enough requests that Voyage rate-limits it, then the connection gets closed.
If that config still ends with other side closed, I'd seriously consider either:
- using Ollama as the actual memory provider
- or filing a GitHub issue, because at that point it starts looking like a provider/runtime robustness bug rather than just config
If you want, I can turn that into one clean ready-to-paste memorySearch block for Voyage primary + Ollama fallback + safer indexing.
index --force is at 1%... eta is like a week hahahaha
oh wait no 300 minutes not hours, silly me
Hah, yeah, that's the moment where I'd stop and not try to brute-force it
A week ETA usually means Voyage is the wrong provider for your current memory corpus / rate limits, at least for the initial backfill.
I'd recommend one of these:
1. Best practical option: use Ollama for memory embeddings
If you're okay with local embeddings, this is probably the sane path.
memorySearch: {
  provider: "ollama",
  fallback: "none",
  sync: {
    onSearch: false,
  },
}
2. Keep Voyage, but shrink the scope a lot
Only index the minimum:
memorySearch: {
  provider: "voyage",
  fallback: "ollama",
  sources: ["memory"], // no sessions
  sync: {
    onSearch: false,
  },
  // keep extraPaths very small or remove them for now
}
3. Don't backfill everything
If you've got a huge workspace / lots of session history, a remote embedding provider with rate limits is just going to be miserable here.
Important caveat
If you build the index with Ollama and later switch to Voyage, OpenClaw will need to reindex again anyway, because provider/model changes change the embedding space.
So I'd pick the provider you actually want to keep.
My honest recommendation: for memory search, Ollama is probably the better fit unless you have a pretty small corpus and decent Voyage limits.
If you want, I can help you trim the memory scope so it's usable instead of "see you next week"
Hah, okay, 300 minutes is way less cursed
Still pretty slow, but that's at least "overnight-ish" instead of "next geological era".
A couple of useful takeaways:
- first full index is the worst case
- once it's built, incremental updates should be much smaller
- if you're still seeing rate limits / other side closed, I'd still be a bit wary of Voyage for large reindexes
So my updated read is:
- if this is a one-time initial backfill and you don't mind waiting, it might be acceptable
- if it keeps failing partway through, I'd still lean toward Ollama or a much smaller indexed scope
If you let it finish, I'd test again after that before changing too much. That'll tell you whether the real problem was just the initial bulk index.
it "timed out after 120s" but it also still got rate limited and switched to ollama half way through anyways
ohhh wait i'm reading it more, batch failed so it tried to not batch
Ah, that's actually very telling, and honestly kind of useful.
What happened is basically:
- Voyage batch hit the built-in 120s timeout
- it was also getting rate limited
- OpenClaw treated that as an embedding/batch failure
- then it activated your explicit fallback and switched to Ollama
So yes, that lines up with the current fallback behavior.
The good news is, it shouldn't leave you with a mixed embedding space. After fallback activation, memory does a forced safe reindex with the fallback provider.
Practical takeaway
If Voyage is consistently doing:
- rate limit
- 120s timeout
- then fallback to Ollama anyway
...then I probably wouldn't keep Voyage as the primary for memory search, because you're basically paying a 120-second failure tax before ending up on the provider that works.
At that point I'd either:
Option A, simplest
Just use Ollama directly:
memorySearch: {
  provider: "ollama",
  fallback: "none",
}
Option B, if you still want to keep testing Voyage
Try the explicit batch config first:
memorySearch: {
  provider: "voyage",
  fallback: "ollama",
  remote: {
    batch: {
      enabled: true,
      concurrency: 1,
      timeoutMinutes: 10,
    },
  },
}
But if it still ends up on Ollama, I'd just make Ollama primary and call it a day.
Yep, good catch, you're reading that right.
There are really 2 different fallback layers happening:
- batch embeddings fail
  → OpenClaw falls back to non-batch embeddings for the same provider
- if embeddings still keep failing after that
  → then it can fall back to your configured provider fallback (like Ollama)
So the flow is more like:
Voyage batch → Voyage non-batch → Ollama (if needed)
Also, batch gets effectively "demoted" after repeated failures, it isn't just blindly retried forever.
That means the real question is now:
- does Voyage non-batch work well enough for your corpus?
- or does it still rate limit / timeout badly enough that you end up on Ollama anyway?
If Voyage non-batch is stable, this may be less bad than it first looked. If it still collapses into Ollama regularly, I'd still just make Ollama primary for memory search.
Nice catch, that changes the interpretation a lot.
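For anyone following along, the two fallback layers can be pictured with a short Python sketch. It only models the order of attempts described above. embed_corpus and the three embedder callables are hypothetical, and the real code also involves retries and batch "demotion" that this sketch leaves out.

```python
def embed_corpus(chunks, primary_batch, primary_single, fallback_single):
    """Return (path_used, vectors) following the two fallback layers.

    Layer 1: batch -> non-batch on the same provider.
    Layer 2: same provider -> configured fallback provider.
    """
    try:
        return "primary batch", primary_batch(chunks)
    except Exception:
        pass  # layer 1: drop to per-request embeddings on the same provider
    try:
        return "primary non-batch", [primary_single(c) for c in chunks]
    except Exception:
        pass  # layer 2: switch provider
    # Everything is embedded again with the fallback, so vectors are not mixed.
    return "fallback provider", [fallback_single(c) for c in chunks]


def broken(_):
    # Hypothetical embedder that always fails (rate limit, timeout, ...)
    raise RuntimeError("rate limited")


path, vectors = embed_corpus(["a", "b"], broken, broken, lambda c: [float(len(c))])
print(path, vectors)
```

Embedding the whole list again with the fallback mirrors the earlier note that fallback activation forces a reindex rather than leaving a mixed embedding space.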
so from top to bottom, it tries to batch, fails, tries non-batch, rate limits, tries ollama, times out at 120s (and would take hours regardless).
can we figure out why voyage batch is failing and go from there?
Yeah, and I think there's a naming trap here.
The log line you mentioned earlier, memory embeddings: batch start, is probably not Voyage's dedicated Batch API yet. That is the normal "send multiple texts in one embeddings request" path.
The actual Voyage Batch API path should log things more like:
- memory embeddings: voyage batch submit
- memory embeddings: voyage batch created
- batch status / waiting messages with a batch id
So first I'd verify which batch path you're actually on.
Clean way to test real Voyage Batch API
Temporarily isolate it:
memorySearch: {
  provider: "voyage",
  fallback: "none",
  sync: {
    onSearch: false,
  },
  remote: {
    batch: {
      enabled: true,
      concurrency: 1,
      wait: true,
      pollIntervalMs: 2000,
      timeoutMinutes: 10,
    },
  },
}
Then restart and run:
openclaw memory status --deep --index --verbose
openclaw logs --follow
What to look for
If you see:
- voyage batch created
- a batch id
- status polling
Then you're on the real Voyage Batch API path.
If you only see:
- memory embeddings: batch start
- rate limited
- timed out after 120s
Then you're still on the regular embedBatch path, not the dedicated Voyage Batch API.
Why this matters
Those are different failure modes:
- regular embedBatch path failing means the normal /embeddings route is too slow / rate-limited
- Voyage Batch API failing would point to file upload, batch creation, status polling, or output retrieval
So yes, we can absolutely figure it out, but step 1 is making sure we're testing the actual Voyage Batch API and not just the generic batched embeddings path.
If you want, once you run that, paste the first few voyage batch ... lines (or confirm they never appear), and I can tell you exactly which path is breaking.
ah... gotta add a payment method to the account to unlock the true "free tier"
recommended usage limit so i don't get charged and maintain the free tier?
Yep, that's the classic "free tier" gotcha
That would explain a lot of the weirdness, especially if Voyage is keeping you on very constrained limits until billing is set up.
So at this point the choice is basically:
- don't add a payment method → use Ollama for memory embeddings
- add a payment method → retest Voyage with real limits, ideally with the actual batch API enabled
Honestly, if your goal is just "make memory search work reliably," Ollama is still the lower-drama option.
I don't think there's really a "set this limit and keep the free tier" knob.
From Voyage's docs, the model is more like:
- add a payment method to unlock the higher tier / normal limits
- your free tokens still apply first
- after that, usage is billed
So the safest answer is:
- if you want guaranteed $0 spend, stay under the free token allotment
- I do not currently see a documented hard spend cap / billing limit in their docs
- there's even a docs forum request asking for that feature, which is... not very comforting
For the models we were talking about, their pricing docs currently say:
- 200M free tokens on the main voyage-4* family
- 50M free tokens on some other models
So if you want the conservative answer: do not plan to cross the free-token threshold at all.
If you want the practical answer: I'd personally either:
- use Voyage only for small tests and keep a close eye on usage, or
- use Ollama if the goal is "absolutely no surprise bill"
okay it's working now, i'm watching it batch, and carefully watching my free tier usage so i don't get lambasted
and after the index is batched it should be light usage from there?
is seeing voyage batch batch-xxxxx in_progress waiting 2000ms over and over correct?
Yep, that part is normal.
That log line is just the poll loop for Voyage's Batch API:
- batch created
- status = in_progress
- wait 2000ms
- check again
- repeat until it becomes completed (or hits the timeout)
So:
- seeing it a few/many times: normal
- seeing it forever until timeout: not normal, that means the batch is just taking too long / stuck / queued on Voyage's side
The key question is whether it eventually changes from in_progress to completed.
If it never does and then dies on timeout, I'd increase timeoutMinutes and keep concurrency: 1, or just accept that Voyage batch is too slow for this corpus.
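The poll loop above is simple enough to sketch in Python. This is a generic polling pattern under assumptions, not OpenClaw's or Voyage's implementation: wait_for_batch and get_status are hypothetical names, the defaults mirror the pollIntervalMs and timeoutMinutes values from the config in this thread, and only "completed" is treated as terminal because no failure statuses were mentioned here.

```python
import time


def wait_for_batch(get_status, poll_interval_ms=2000, timeout_minutes=10,
                   sleep=time.sleep, now=time.monotonic):
    """Poll get_status() until it returns "completed" or the timeout passes."""
    deadline = now() + timeout_minutes * 60
    while True:
        status = get_status()
        if status == "completed":
            return status
        if now() >= deadline:
            raise TimeoutError(f"batch still {status!r} after {timeout_minutes} minutes")
        # Any non-terminal status (in_progress, validating, finalizing): wait, re-check.
        sleep(poll_interval_ms / 1000)


# Simulated status sequence; sleep is stubbed out so the demo returns immediately.
statuses = iter(["in_progress", "in_progress", "validating", "finalizing", "completed"])
print(wait_for_batch(lambda: next(statuses), sleep=lambda seconds: None))
```

Seeing the same "in_progress ... waiting 2000ms" line repeated is just this loop going around. The loop only becomes a problem when the deadline is reached first.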
i see "finalizing" and "validating" now
Nice, that's actually a good sign.
validating and finalizing mean the batch moved past the plain in_progress stage and Voyage is wrapping it up.
So the rough progression is:
- in_progress
- validating
- finalizing
- completed
If it reaches completed, then the Batch API path itself is working, and the issue shifts from "batch is broken" to "is it fast enough / practical enough for your index size".
That's progress
it only needs to do this once though, right? unless at some point i need to re-index
Yeah, mostly once for the big initial backfill.
After that, normal file/session changes should usually be incremental, not "do the whole universe again".
You'll typically need a full reindex again if you change things like:
- provider (voyage → ollama, etc.)
- model
- outputDimensionality
- indexed scope (sources, extraPaths, multimodal settings)
- chunking settings
- some index/storage settings like the FTS tokenizer
- or if you manually run openclaw memory index --force
So the short version is:
- initial setup: big expensive one-time job
- normal use: incremental updates
- config/provider/model changes: full reindex again
So if this Voyage batch finishes successfully, you should not have to sit through that exact full backfill again unless you change something important.
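One way to picture "which changes force a full reindex" is a fingerprint over the settings that affect the embedding space or the indexed scope. The Python sketch below is only an illustration of that idea, not how OpenClaw actually tracks it. The key list is taken from the bullets above, except "chunking", which is a made-up key name standing in for "chunking settings".

```python
import hashlib
import json

# Settings assumed to affect the index. "chunking" is a hypothetical key name.
REINDEX_KEYS = ("provider", "model", "outputDimensionality", "sources", "extraPaths", "chunking")


def index_fingerprint(memory_search_cfg):
    """Hash only the settings that would invalidate existing embeddings."""
    relevant = {key: memory_search_cfg.get(key) for key in REINDEX_KEYS}
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_full_reindex(old_cfg, new_cfg):
    return index_fingerprint(old_cfg) != index_fingerprint(new_cfg)


before = {"provider": "voyage", "sources": ["memory", "sessions"]}
after = {"provider": "ollama", "sources": ["memory", "sessions"]}
print(needs_full_reindex(before, after))   # provider changed
print(needs_full_reindex(before, before))  # nothing changed
```

Keys outside REINDEX_KEYS are ignored by this sketch, so whether any other setting triggers a reindex in practice is not something it can answer.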
and if it is less than 200 million tokens (unlikely) for the re-index then it should be good for usage, unless they change their free policy
Yep, in principle that's the idea.
If you're on one of the Voyage models with the 200M free-token allotment, and your total embedding usage stays under that, then based on the current pricing docs you should be okay.
Main caveats:
- the free allotment is model-specific
- it's total usage, not just this one reindex
- ongoing updates / future reindexes also count
- and yeah, Voyage could always change the policy later
So I'd phrase it as:
"If my memory indexing + ongoing embedding usage stays below the current free-token allotment for this model, I should avoid charges."
That's the cautious version
ohhh right because the 200 million doesn't reset per month or year, it's forever
Yep, that's how I read their wording too.
"The first 200 million tokens are free for every account" reads like a one-time cumulative allotment, not a monthly/yearly reset.
So unless their docs say otherwise somewhere else, I would assume:
- 200M total free once
- after that, paid
I definitely would not plan around it resetting.
batching is back to "in_progress" again, is that normal?
If it's a new batch id, yep, that's normal.
OpenClaw can split the work into multiple Voyage batch groups, so you might see:
- batch A → in_progress → validating → finalizing → done
- then batch B → back to in_progress
That part is fine.
If it's the same exact batch id going from finalizing back to in_progress, that would be weirder, and I'd want to look closer at the logs.
So the key question is: same batch id, or a new one?
new one i think
Then yeah, that's normal
That just means the index job got split into another Voyage batch group, so you're watching the next chunk of work start.
So in that case:
- old batch finished / moved on
- new batch started
- status goes back to in_progress
Annoying to watch, but not a bad sign by itself
sooo how long should i expect this to go for about a month of chatting and debugging openclaw stuff?
For just about a month of chat/debugging, my gut check is:
- tens of minutes to a few hours is believable
- days would feel wrong
- half a day+ usually means you're indexing more than just chats, or Voyage limits are the bottleneck
What really changes it:
- whether you're indexing only session transcripts or also files / extraPaths
- how large the chats are
- how much code/log paste is in them
- Voyage's current queue / rate limits
So if this is truly just "a month of OpenClaw support/debug chats", I'd expect hours, not days.
The good news is: once this initial backfill is done, future updates should be way smaller.
my only extra path is openclaw docs
Ah, okay, that explains it a lot more.
The OpenClaw docs tree is not tiny, so:
- one month of chats/debug logs
- plus the docs directory
- chunked into lots of embedding inputs
...can absolutely turn the first backfill into an hours-scale job on Voyage.
So that sounds annoying but plausible, not obviously broken.
If your main goal is conversation recall, I'd honestly consider:
- first indexing just sessions / memory
- then adding the docs path later if you still want it
Because the docs are probably a big chunk of what you're paying for in time here.
So yeah, with extraPaths = [openclaw docs], I'm much less surprised by the runtime.
but surely it's not 50 million or 200 million tokens
Yeah, I'd be pretty surprised if one month of chats + the OpenClaw docs were anywhere near 50M, let alone 200M tokens.
So I wouldn't interpret the long runtime as "you must be chewing through hundreds of millions of tokens".
It's much more likely a mix of:
- batch queueing
- rate limits
- chunking overhead
- retries / polling
- just the general slowness of remote embedding backfills
So:
- slow does not automatically mean huge token count
- your corpus can be pretty normal-sized and still take a while if the provider is throttling
That said, repeated reindexes over time are what would start to make me care about the 50M / 200M numbers.
Okay because it's late and I may need to let it go overnight haha. We're at 120k tokens right now
Yeah, that sounds totally reasonable.
And 120k tokens is tiny relative to 50M / 200M, so from a usage-cost perspective that's not scary at all.
If it's late, I'd honestly just let it run overnight and check in the morning with:
openclaw memory status --deep
and maybe a quick:
openclaw memory search "some topic you know is in your chats"
If it finishes cleanly, you're probably in good shape. If it dies overnight, the new logs should be much more informative.
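To put the 120k figure in perspective, here is the arithmetic as a small Python sketch. The 200M and 50M allotments are the numbers quoted from Voyage's pricing docs earlier in this thread and may change. The function name is made up, and the sketch assumes the allotment is cumulative rather than resetting.

```python
def free_tier_usage(used_tokens, allotment_tokens):
    """Return (tokens_remaining, share_used) for a one-time free allotment."""
    remaining = max(allotment_tokens - used_tokens, 0)
    share_used = used_tokens / allotment_tokens
    return remaining, share_used


used = 120_000  # usage reported so far in this thread
for allotment in (200_000_000, 50_000_000):
    remaining, share = free_tier_usage(used, allotment)
    print(f"{allotment:,} allotment: {share:.2%} used, {remaining:,} tokens left")
```

Even against the smaller allotment this is well under one percent, so the thing to watch is repeated full reindexes over time rather than this single run.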
uh oh i just saw the same batch id go from "validating" to "in_progress"
Yeah, that's definitely weirder than the clean happy path.
I wouldn't call it instantly fatal, but a same batch id going:
validating → in_progress
suggests one of:
- Voyage's status reporting is a bit flaky / non-monotonic
- the job got re-queued or re-opened on their side
- the status API is lagging / inconsistent
So my read is:
- not ideal
- probably provider-side weirdness
- if it still eventually finishes, I wouldn't panic
- if it keeps bouncing like that and then times out, that's much stronger evidence the Voyage batch path is flaky here
If it's late, I'd probably let it keep going a bit, but if the same id keeps oscillating repeatedly, I'd save that batch id because that's great bug-report evidence.
welp problem solved i accidentally CTRL-C'd to copy forgetting i was in linux terminal so it ended the process
Hah, yep, that would do it
That actually explains the weirdness a lot better than "Voyage invented a new batch state machine."
I'd just rerun one of these:
openclaw memory status --deep --index
or, if you want the cleanest possible rebuild after the interruption:
openclaw memory index --force
Given the interrupted run, I'd personally lean --force if you're okay letting it churn again.
And yeah, Linux terminal copy muscle memory gets everybody eventually.
i think i'll just switch to ollama so i don't have to deal with voyage potentially charging me in the future. what's the best config look like for that
i recall it was timing out after 120s last time
Yeah, honestly, that's the lower-drama move.
Best simple config for local Ollama memory search
.env
OLLAMA_API_KEY=ollama-local
openclaw.json
{
agents: {
defaults: {
memorySearch: {
provider: "ollama",
fallback: "none",
model: "nomic-embed-text",
sync: {
onSearch: false, // optional, but I recommend it
},
},
},
},
}
That's the config I'd start with if:
- Ollama is running locally
- it's on the default URL (http://127.0.0.1:11434)
- you want Ollama only for memory embeddings
Only add this if Ollama is on a custom host/port
{
models: {
providers: {
ollama: {
baseUrl: "http://127.0.0.1:11434",
},
},
},
}
Important: use the native Ollama URL, not /v1.
After changing provider
Since this is a provider switch, you'll want a fresh reindex:
openclaw gateway restart
openclaw memory index --force
My recommendation
I'd keep:
- your current sources
- your current extraPaths
and just swap the provider over to Ollama first.
If you want, I can give you one ready-to-paste full config block for:
- sessions + docs
- Ollama only
- safe defaults for overnight indexing
Yep, good thing to keep in mind.
One subtle gotcha: switching to Ollama does not automatically mean a bigger timeout.
In the current memory code, the longer "local" timeout is for the special memorySearch.provider = "local" path, not for ollama. So with Ollama, you're still generally relying on the normal remote-style memory timeouts.
So the bet with Ollama is not "higher timeout", it's:
- no Voyage rate limits
- no provider queueing
- usually much faster actual response time
So if your Ollama instance is reasonably fast, it should still be a lot better. But if your local machine is really slow, yes, it could still hit timeouts.
That said, I'd still expect Ollama to be much less annoying than Voyage for this use case.
If you want the cautious path, switch to Ollama and do a small smoke test first:
openclaw infer embedding create \
--model ollama/nomic-embed-text \
--text "memory search smoke test" \
--json
Then run the full reindex.
any memorysearch.remote settings i can use?
Yep, there are a few.
memorySearch.remote settings
memorySearch: {
remote: {
baseUrl: "...",
apiKey: "...",
headers: { ... },
batch: {
enabled: true,
wait: true,
concurrency: 1,
pollIntervalMs: 2000,
timeoutMinutes: 10,
},
},
}
What they do
- remote.baseUrl: override the embedding endpoint base URL; useful for Ollama on a custom host/port or custom OpenAI-compatible endpoints
- remote.apiKey: override the key just for memory search
- remote.headers: add extra HTTP headers
- remote.batch.enabled: enable the provider's async batch API
- remote.batch.wait: wait for batch completion vs fire-and-poll behavior
- remote.batch.concurrency: how many batch jobs run in parallel
- remote.batch.pollIntervalMs: how often to poll batch status
- remote.batch.timeoutMinutes: how long to wait for the async batch job before timing out
Important caveat
For Ollama, the practically useful ones are mostly:
- baseUrl
- apiKey
- headers
The batch settings are mainly relevant for providers like Voyage, Gemini, and OpenAI that expose async batch embedding support.
Also, memorySearch.remote.* does not give you a knob for the generic non-batch query timeout.
Example for Ollama
{
agents: {
defaults: {
memorySearch: {
provider: "ollama",
fallback: "none",
model: "nomic-embed-text",
remote: {
baseUrl: "http://127.0.0.1:11434",
},
},
},
},
}
If you want, I can give you the exact memorySearch block I'd use for local Ollama only.
so no batch settings for nomic?
Right, basically no.
For Ollama + nomic-embed-text, the memorySearch.remote.batch.* knobs are not the useful path in current OpenClaw.
Those async batch settings are for providers that expose a dedicated batch embedding flow, like Voyage, Gemini, and OpenAI.
For Ollama/nomic, the useful knobs are basically:
- memorySearch.provider
- memorySearch.model
- memorySearch.remote.baseUrl
- memorySearch.remote.apiKey
- memorySearch.remote.headers
So if you switch to Ollama, I'd stop thinking about batch tuning and just focus on:
- local Ollama running cleanly
- correct baseUrl
- whether it's fast enough on your machine
If you want, I can give you the cleanest Ollama + nomic-embed-text config block to paste in.
how do i fix the 120s timeout? it's going to take a lot longer than that
Short answer: for Ollama/nomic, I don't think you can fix that with config today.
There are 2 different cases:
1. If you're using a provider with async batch support
Like Voyage / OpenAI / Gemini, then yes, you can raise it with:
memorySearch: {
remote: {
batch: {
enabled: true,
timeoutMinutes: 10,
},
},
}
2. If you're using Ollama + nomic-embed-text
That 120s limit is on the normal non-batch embedding path, and I don't see a memorySearch.remote.* knob for changing that.
So for Ollama, your options are basically:
- reduce the indexing scope
- use a faster machine / faster Ollama setup
- use the special memorySearch.provider = "local" path instead of ollama if you want the longer built-in local timeout behavior
- or file a GitHub issue asking for configurable non-batch memory embedding timeouts: https://github.com/openclaw/openclaw/issues
So the honest answer is:
Voyage batch timeout is configurable. Ollama/nomic timeout is not, at least not from normal memorySearch config.
If you want, I can help you decide whether ollama or local is the better fit for your setup.
Can local point at ollama? Or is it something else entirely. What's the max timeout on local? It may still not be high enough
local is something else entirely, it does not point at Ollama.
local vs ollama
- ollama = talk to a running Ollama server over HTTP
- local = OpenClaw's built-in local embedding path, using a GGUF model via node-llama-cpp
So local is more like:
- download / point at a GGUF embedding model
- OpenClaw runs it directly
It is not:
- baseUrl: http://localhost:11434
- or "Ollama but local"
Current built-in local timeouts
From the current memory code, local gets longer built-in limits:
- query: about 5 minutes
- batch: about 10 minutes
That's better than the normal remote-style 60s / 120s path.
Important caveat
I still don't see a normal config knob to push those even higher.
So if 10 minutes for a batch still won't cut it, then honestly the bigger issue may be:
- corpus too large for the chosen local setup
- model/machine too slow
- or memory scope too broad
Practical takeaway
- if you want Ollama, use provider: "ollama"
- if you want the longer built-in timeout, use provider: "local"
- but local means a GGUF model, not Ollama
Docs: https://docs.openclaw.ai/reference/memory-config
If you want, I can show you the cleanest memorySearch.provider = "local" config too.
This is certainly a pickle. Memory search is a core feature for agents to use, but I can't seem to get it to be available for truly free. It'd be one thing if i could use the same indexes for different providers but you can't. Getting a better machine is not an option and shrinking the memory scope defeats the purpose
Yeah, I think you're hitting a real constraint, not missing some secret config.
Today, I don't think there's a magic combo that gives you all 4 at once:
- semantic memory search
- large scope
- truly free
- no decent local hardware
That's the frustrating tradeoff.
So the realistic options are basically:
- Remote embeddings: good semantic recall, but potential cost / rate limits
- Local (Ollama / local GGUF): free-ish, but performance/hardware constrained
- FTS-only memory: truly free, same broad scope, but keyword/lexical instead of semantic/vector
Honestly, if "free + big scope" matters more than perfect semantic recall, the interesting compromise might be FTS-only. It's weaker, but it avoids both the provider-cost problem and the hardware problem.
So my blunt take is:
there is no perfect free semantic-memory path here right now, but there may be a usable free FTS-only path.
If you want, I can help you test whether FTS-only memory search is good enough for your setup. That might actually be the least painful answer.
Help me get local setup step by step tomorrow, please, (and uninstall ollama lol)
Failing that help me use fts
Yep, absolutely, I can help you do that tomorrow.
And honestly, I'd not uninstall Ollama yet. I'd just stop using it for memory search for now. It's easier to compare paths if it's still there, and uninstalling adds more chaos than value tonight.
Tomorrow's plan
Plan A, get memorySearch.provider = "local" working
We'll do it step by step:
- confirm the install/runtime state
- check whether the local embedding runtime is actually available
- switch memory search to local
- run a clean reindex
- test recall
Plan B, if local is still too painful
We pivot to FTS-only and make sure keyword memory search is at least usable.
When you're back tomorrow
Send me these:
openclaw --version
openclaw status
openclaw memory status --deep
And tell me:
- are you running OpenClaw via npm or pnpm
- what OS you're on
- whether you want to keep the docs path indexed
Then I'll walk you through the local setup first, and if it's still cursed, we'll switch to FTS.
Get some sleep, this is definitely tomorrow-problem territory.
okay let's set up local memory search. 2026.4.15, debian 13, yes keep the docs path, npm
Yep. Let's do the smallest safe local setup first and keep your docs path.
Docs I'm basing this on:
https://docs.openclaw.ai/reference/memory-config
https://docs.openclaw.ai/concepts/memory-search
https://docs.openclaw.ai/start/faq
Step 1, switch memory search to local
openclaw config set agents.defaults.memorySearch.provider local
openclaw config set agents.defaults.memorySearch.fallback none
You do not need to set local.modelPath yet.
If unset, OpenClaw should use the default local embedding model:
embeddinggemma-300m-qat-Q8_0.gguf
- about 0.6 GB
- auto-downloaded on first use
Step 2, restart the gateway
openclaw gateway restart
Step 3, check whether local embeddings are actually ready
openclaw memory status --deep
What I want to see is basically:
- provider = local
- no node-llama-cpp missing error
- no "local embeddings unavailable" error
Step 4, if that looks good, start a clean reindex
openclaw memory index --force
Since you're changing providers, a fresh rebuild is the right move.
Step 5, test it
After indexing, try a couple searches:
openclaw memory search "gateway token field in control ui"
openclaw memory search "voyage timeout 120s"
Important note
The first local run may spend a while downloading/preparing the GGUF model, so don't panic if it doesn't instantly jump into indexing.
If Step 3 fails
Paste the output of:
openclaw memory status --deep
That will tell us whether the problem is:
- missing node-llama-cpp
- model download/setup
- or something else runtime-related
Start with Steps 1 to 3 and paste the status output before you kick off a huge reindex. That'll save you pain.
does it auto download on gateway restart or on index or on memory status?
Not on gateway restart.
It should happen on the first actual embedding call, so in practice usually one of these:
- openclaw memory status --deep
- openclaw memory index --force
- a memory search that needs embeddings
So the safest expectation is:
- restart: no download yet
- plain openclaw memory status: probably no download
- deep status / index / real embed probe: yes, that's when it should kick off
That's why I wanted you to do openclaw memory status --deep before a huge reindex, it's a nice smaller "does local actually wake up?" test.
so what's the full config section for memorySearch look like, including any ideal remote.batch or other settings?
For local, I'd keep it pretty clean. The big thing is: don't bother adding remote.batch here, it won't help the local provider.
Recommended memorySearch block for your case
{
agents: {
defaults: {
memorySearch: {
enabled: true,
// Force fully local embeddings
provider: "local",
fallback: "none",
// Keep transcript recall
experimental: {
sessionMemory: true,
},
sources: ["memory", "sessions"],
// Keep your docs path
extraPaths: [
"/home/YOUR_USER/openclaw-ref/docs", // or whatever docs path you're already using
],
// Easiest starting point: use the default auto-downloaded GGUF
local: {
// modelPath: "hf:ggml-org/embeddinggemma-300m-qat-q8_0-GGUF/embeddinggemma-300m-qat-Q8_0.gguf",
// modelCacheDir: "/home/YOUR_USER/.cache/node-llama-cpp",
},
// Helps avoid re-embedding unchanged content later
cache: {
enabled: true,
maxEntries: 50000,
},
// Good default ranking setup
query: {
hybrid: {
enabled: true,
vectorWeight: 0.7,
textWeight: 0.3,
mmr: {
enabled: true,
lambda: 0.7,
},
temporalDecay: {
enabled: true,
halfLifeDays: 30,
},
},
},
// Avoid surprise "search triggered a giant sync" latency
sync: {
onSearch: false,
sessions: {
deltaBytes: 100000,
deltaMessages: 50,
},
},
},
},
},
}
Notes
- remote.batch: skip it for local
- fallback: "none": avoids quietly jumping to a paid provider
- cache.enabled: true: worth it
- onSearch: false: I recommend this, especially while you're stabilizing things
After saving config
openclaw gateway restart
openclaw memory status --deep
If that looks healthy, then:
openclaw memory index --force
If you ever go back to a remote provider
Then the only batch block I'd use is:
remote: {
batch: {
enabled: true,
wait: true,
concurrency: 1,
pollIntervalMs: 2000,
timeoutMinutes: 30,
},
}
But again, that is for Voyage/OpenAI/Gemini, not local.
If you want, paste your current memorySearch block and I'll rewrite it into the exact final version for your setup.
memory status --deep showed failure, local embeddings unavailable, optional dependency node-llama-cpp missing or failed to install
Okay, that's actually useful. Your config is no longer the problem. The issue is just that the local runtime dependency never installed cleanly.
Since you're on npm, I'd do this next:
1. Check Node version
node -v
For local embeddings, I'd really prefer:
- Node 24 for installs/updates
- or Node 22.14+
2. Check whether npm is skipping install scripts
npm config get ignore-scripts
If that says true, that's likely the culprit.
3. Reinstall OpenClaw globally
npm install -g openclaw@latest
4. Restart gateway
openclaw gateway restart
5. Re-test local embeddings
openclaw memory status --deep
What I want you to paste back
Please paste:
- node -v
- npm config get ignore-scripts
- and the result of the fresh openclaw memory status --deep
If reinstall still fails, paste the npm install -g openclaw@latest error output too.
That'll tell us whether this is:
- wrong Node version
- npm scripts being skipped
- or a native build failure during install
At least we've got the problem narrowed down now.
i'm on node 24 and ignore-scripts is false
will the reinstall hurt my current install at all?
Good, that rules out the two easy culprits.
So the next most likely issue is simply: your current global OpenClaw install does not have node-llama-cpp present.
Next step
Reinstall OpenClaw globally:
npm install -g openclaw@latest
Then:
openclaw gateway restart
openclaw memory status --deep
If it still says node-llama-cpp missing
Run this too:
npm ls -g openclaw node-llama-cpp --depth=0
and paste the output.
That'll tell us whether:
- node-llama-cpp was actually installed
- or OpenClaw is there but the optional dep still isn't landing
So yep, at this point I'd do the reinstall first.
It shouldn't hurt your actual data/config, no.
A normal global reinstall should not wipe:
- your openclaw.json
- your workspace
- your session/memory data in ~/.openclaw
What it does affect is basically:
- the global CLI package files
- bundled/optional dependencies like node-llama-cpp
So the normal risk is more like:
- brief CLI / gateway downtime
- install fails and you need to rerun it
- version/package contents change
Not usually:
- "all my OpenClaw state is gone"
If you want the belt-and-suspenders version, back up your config first, but I would not expect reinstalling openclaw@latest to nuke your current setup.
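If you do want the belt-and-suspenders backup, here is a minimal sketch. The config location ~/.openclaw/openclaw.json is an assumption (this thread only names the file and the ~/.openclaw data directory), so adjust the path if yours lives elsewhere. backup_file is a hypothetical helper, not an OpenClaw command.

```shell
# backup_file PATH: copy PATH to PATH.bak.<timestamp>, preserving mode and mtime,
# and print the name of the copy.
backup_file() {
  src=$1
  [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
  dest="$src.bak.$(date +%Y%m%d-%H%M%S)"
  cp -p "$src" "$dest" && echo "$dest"
}

# Assumed config path; the backup only runs if the file actually exists.
cfg="$HOME/.openclaw/openclaw.json"
if [ -f "$cfg" ]; then backup_file "$cfg"; fi
```

The timestamp suffix means repeated backups within different seconds do not overwrite each other.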
where should it install global npm packages by default
It depends on your npm global prefix.
On Linux, the common defaults are:
- system Node → usually /usr/local/lib/node_modules, with binaries in /usr/local/bin
- nvm-managed Node → usually something like ~/.nvm/versions/node/v24.x.x/lib/node_modules, with binaries in ~/.nvm/versions/node/v24.x.x/bin
Best way to check your actual machine:
npm prefix -g
npm root -g
which openclaw
Those will tell you:
- the global prefix
- the global package dir
- which openclaw binary you're actually using
If you paste those, I can sanity-check whether your install path looks right.
okay so i did everything you said and it's the same error. cannot find package node-llama-cpp imported from (my global npm folder)/openclaw/dist/engine-embeddings-bkxxxxx.js (file does exist)
Ah, I found the real reason, and I need to correct my earlier advice.
node-llama-cpp is currently an optional peer dependency, not something openclaw necessarily installs for you automatically. So reinstalling OpenClaw alone can leave you in exactly the same broken state.
So the fix is to install it explicitly into the same global npm prefix:
npm install -g node-llama-cpp@3.18.1
Then:
openclaw gateway restart
openclaw memory status --deep
Why this happens
OpenClaw is trying to import:
node-llama-cpp
from inside its global install, but that package is not actually present in the global module path.
If you want to verify after install
npm ls -g openclaw node-llama-cpp --depth=0
npm root -g
You should see both:
- openclaw
- node-llama-cpp
in the same global tree.
So yeah, the missing piece is not "reinstall OpenClaw again", it's:
install node-llama-cpp explicitly
That should get us a lot closer.
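To double-check the "same global tree" part, here is a small sketch built on npm root -g, which prints the global node_modules directory. has_global_pkg is a hypothetical helper name; it only checks that a package directory with a package.json exists, not that the native build inside it is healthy.

```shell
# has_global_pkg ROOT NAME: succeed if ROOT/NAME looks like an installed package.
has_global_pkg() {
  [ -f "$1/$2/package.json" ]
}

# Only runs when npm is on PATH.
if command -v npm >/dev/null 2>&1; then
  root=$(npm root -g)
  for pkg in openclaw node-llama-cpp; do
    if has_global_pkg "$root" "$pkg"; then
      echo "found:   $root/$pkg"
    else
      echo "missing: $root/$pkg"
    fi
  done
fi
```

If both lines say "found" under the same root, the import error about node-llama-cpp should no longer come from a missing package.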
does it have to be that version or can it be latest
I'd use the exact version first.
OpenClaw currently declares:
node-llama-cpp: 3.18.1
as the peer dependency, so the safest move is:
npm install -g node-llama-cpp@3.18.1
Could latest work? Maybe. But if latest has API/packaging changes, you'll just introduce another variable.
So my recommendation is:
- for debugging / first success → use 3.18.1
- once it works, you can experiment later if you really want
I wouldn't use latest for the first fix.
my npm installs complain about an unsupported npmrc setting, "onlyBuiltDependencies[]=openclaw" is that important?
Nope, not important for npm.
onlyBuiltDependencies[]=openclaw is basically a pnpm-specific setting. npm doesn't support it, so it just complains and ignores it.
So for your case:
- annoying warning: yes
- cause of the node-llama-cpp missing error: no
The thing that matters is still whether node-llama-cpp actually ends up installed in the same global npm tree as openclaw.
So I'd treat that npmrc warning as noise, not the root problem.
okay memory --deep is downloading stuff now
Nice, that's actually the first really good sign.
That means:
- node-llama-cpp is being found now
- the local provider is finally waking up
- it's likely pulling the default GGUF model / runtime bits
So at this point I'd just let it finish and not interrupt it.
Best case, after the download completes, openclaw memory status --deep comes back healthy.
Then the next step is:
openclaw memory index --force
If memory status --deep still errors after the download, paste the new error, but this is definitely progress.
oh it did say additional memory path missing. it looks like it's prepending ~/.openclaw/workspace to the string in that setting automatically, so i'd just need "docs" instead of ~/.openclaw/workspace/docs
Yep, that makes sense, and good catch.
In config, extraPaths supports:
- absolute paths
- workspace-relative paths
But ~ is not shell-expanded there.
So:
- docs → resolved as ~/.openclaw/workspace/docs if that's your workspace
- ~/.openclaw/workspace/docs → gets treated like a relative string and ends up wrong
- /home/youruser/.openclaw/workspace/docs → true absolute path, also fine
So yes, if the docs are inside your workspace, just using:
extraPaths: ["docs"]
is the clean fix.
And since changing extraPaths changes indexing scope, expect a full reindex after that.
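The resolution rule described above can be sketched in a few lines of shell. This illustrates the behavior as described in this thread, not OpenClaw's actual code; resolve_extra_path is a hypothetical name and /home/youruser is a placeholder.

```shell
# resolve_extra_path WORKSPACE ENTRY
# Absolute entries are kept as-is; anything else (including a literal "~/...")
# is joined onto the workspace, because "~" is not expanded in config.
resolve_extra_path() {
  case "$2" in
    /*) printf '%s\n' "$2" ;;
    *)  printf '%s\n' "$1/$2" ;;
  esac
}

ws=/home/youruser/.openclaw/workspace
resolve_extra_path "$ws" docs
resolve_extra_path "$ws" '~/.openclaw/workspace/docs'
resolve_extra_path "$ws" /home/youruser/.openclaw/workspace/docs
```

The second call shows why the tilde form ends up wrong: the whole string gets appended to the workspace path instead of replacing it.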
anything you want to see from memory status --deep before i kick off index --force?
Yep, just a few things.
Before index --force, I'd want memory status --deep to show no obvious red flags around:
- provider = local
- embeddings probe = ok / ready
- vector search available
- no node-llama-cpp error
- no "additional memory path missing" warning
- your docs path resolving the way you expect
If those all look clean, I'd go ahead with:
openclaw memory index --force
If you want, paste the memory status --deep output and I'll do a quick sanity check before you kick it off.
embeddings probe did mention that the prebuilt binary for platform linux x64 with vulkan support is not compatible with the current system, falling back to using no GPU.
but i think that's just info. everything else looks clean
Yep, that sounds like informational fallback, not a blocker.
It basically means:
- the Vulkan-flavored prebuilt didn't fit your system
- node-llama-cpp fell back to CPU
- local embeddings should still work, just slower
So if everything else looks clean, I'd go ahead with:
openclaw memory index --force
That warning only becomes important if indexing is painfully slow, but it shouldn't stop the run.
this is a 14-year old thinkpad with 8gb of ram, i think you'd be surprised how slow it might be
Hah, okay, fair, I retract my optimism.
On a 14-year-old ThinkPad with 8 GB RAM, CPU-only local embeddings could absolutely be glacial.
So I'd frame this run as a viability test, not "this will definitely be pleasant".
My honest expectation:
- it might work
- it may take quite a while
- and if the machine starts suffering, FTS-only may still end up being the saner long-term answer
I'd still let index --force try, but with very modest expectations. If it starts looking cursed, we can pivot fast instead of pretending this laptop is a datacenter.
the jet engine has spun up
Yep, that's the authentic local-AI startup sound.
At this point I'd say:
- let it cook
- avoid doing other heavy stuff on the machine
- if it actually finishes, that ThinkPad earns a medal
If it stalls, OOMs, or takes forever, we'll call it and pivot to FTS-only without shame.
my only concern is some baked in timeout. eta is only 25 minutes but didn't you mention 5 or 10 minute timeout the other day?
Yep, but the important distinction is:
- 25 minutes total index time can be totally fine
- the 5 / 10 minute timeout is for an individual embedding operation / batch, not the entire end-to-end index run
So if indexing is marching along chunk by chunk, a 25-minute overall ETA is not automatically a problem.
The timeout only bites if something like:
- one query embed takes over ~5 minutes
- or one batch embed step takes over ~10 minutes
So I'd watch for:
- steady progress = probably okay
- one step hanging forever = that's the danger zone
In other words, long overall runtime is okay, stuck single operation is not.
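To make the per-operation vs. total distinction concrete, here is a toy sketch. The step durations are made up, the 600-second limit stands in for the roughly 10-minute batch figure mentioned earlier, and check_steps is a hypothetical helper, not how OpenClaw tracks timeouts internally.

```shell
# check_steps LIMIT_SECONDS DURATION...
# Fails only if a single step exceeds the limit; the total is just reported.
check_steps() {
  limit=$1; shift
  total=0
  for d in "$@"; do
    total=$((total + d))
    if [ "$d" -gt "$limit" ]; then
      echo "timeout: one step took ${d}s (limit ${limit}s)"
      return 1
    fi
  done
  echo "ok: total ${total}s, no single step over ${limit}s"
}

# Five 5-minute steps: 25 minutes overall, still fine.
check_steps 600 300 300 300 300 300
# Short run overall, but one 700s step trips the limit.
check_steps 600 30 700 30 || true
```

So a 25-minute ETA made of many small steps passes, while a much shorter run with one stuck step does not.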
i mean ollama was like 300 minutes so this is way better, although it IS going up a bit now
Oh, then yeah, this is much more promising.
If local is showing tens of minutes instead of 300 minutes, that's a huge improvement.
And the ETA creeping upward a bit is pretty normal, especially when:
- the early sample was too optimistic
- chunk sizes vary
- the machine starts heat/throttle juggling on CPU
So I'd read this as:
- not fast
- but plausibly usable
- and definitely better than the Ollama path on this machine
Honestly, I'm encouraged by that.
i see it says indexing memory files 7/52, but memory status showed "indexed 30/211 files". so 52 vs 211? is that actual memory files vs session files? does that mean node-llama won't be doing session files?
Yep, that's almost certainly what's happening.
- 52 = the current memory-file phase
- 211 = the broader total from memory status, likely memory + sessions combined
So no, that does not mean node-llama-cpp won't handle session files.
More likely the flow is:
- Indexing memory files (7/52)
- then later Indexing session files
- and memory status is showing the bigger combined picture
So the key thing to watch is whether it later switches to a label like:
Indexing session files…
If it does, that confirms sessions are part of the run too.
If you have sources: ["memory", "sessions"], local embeddings should apply to both.