#Custom embedding target


north fog
#

I want to use my own embedding server, which is an adapter for Pinecone.

node@ed58f547b964:/app$ openclaw memory status

🦞 OpenClaw 2026.4.16 (unknown)
Give me a workspace and I'll give you fewer tabs, fewer toggles, and more oxygen.

[openclaw] Failed to start CLI: Error: Unknown memory embedding provider: pineconeadapter
at getAdapter (file:///app/dist/manager-B8ruQ6F5.js:143:22)
at createEmbeddingProvider (file:///app/dist/manager-B8ruQ6F5.js:195:25)
at MemoryIndexManager.loadProviderResult (file:///app/dist/manager-B8ruQ6F5.js:2577:16)
at file:///app/dist/manager-B8ruQ6F5.js:2682:52
at MemoryIndexManager.ensureProviderInitialized (file:///app/dist/manager-B8ruQ6F5.js:2690:5)
at MemoryIndexManager.probeVectorAvailability (file:///app/dist/manager-B8ruQ6F5.js:3032:14)
at Object.run (file:///app/dist/cli.runtime-DDVmFMoY.js:449:25)
at withManager (file:///app/dist/cli-utils-PbJDKJWK.js:10:16)
at async withMemoryManagerForAgent (file:///app/dist/cli.runtime-DDVmFMoY.js:253:2)
at async Module.runMemoryStatus (file:///app/dist/cli.runtime-DDVmFMoY.js:403:34)

config:
markus@staropramen ~/.openclaw % jq -r .agents.defaults.memorySearch ~/.openclaw/openclaw.json
{
"sources": [
"memory",
"sessions"
],
"experimental": {
"sessionMemory": true
},
"provider": "pineconeadapter",
"fallback": "none",
"model": "llama-text-embed-v2",
"query": {
"hybrid": {
"enabled": true,
"temporalDecay": {
"enabled": true,
"halfLifeDays": 30
}
}
},
"cache": {
"enabled": true,
"maxEntries": 50000
}
}

markus@staropramen ~/.openclaw % jq -r .models.providers.pineconeadapter ~/.openclaw/openclaw.json
{
"baseUrl": "http://pineconeadapter:11434/v1",
"apiKey": "env:PINECONEADAPTER_API_KEY",
"api": "openai-completions",
"models": []
}

elfin aspenBOT
#

Just checking the source path that controls memory embedding providers, then I’ll give you the concrete config fix.

#

Yep, the failure makes sense.

agents.defaults.memorySearch.provider is not looking at arbitrary models.providers.<id> entries. It expects a registered memory embedding adapter ID.

So this:

memorySearch: {
  provider: "pineconeadapter"
}

fails because pineconeadapter is not a known embedding provider in 2026.4.16.
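
To make the failure mode concrete, here is a minimal sketch of an ID-keyed adapter registry. The names `embeddingAdapters` and `getEmbeddingAdapter` are hypothetical and this is not OpenClaw's actual code; only the error text is taken from the stack trace above.

```typescript
// Hypothetical sketch: a registry keyed by adapter ID, separate from models.providers.
type EmbeddingAdapter = { id: string; embed: (texts: string[]) => number[][] };

const embeddingAdapters = new Map<string, EmbeddingAdapter>();

// Assumption: built-in adapters are registered under fixed IDs such as "openai".
embeddingAdapters.set("openai", {
  id: "openai",
  embed: (texts) => texts.map(() => []), // placeholder, no real embedding
});

function getEmbeddingAdapter(id: string): EmbeddingAdapter {
  const adapter = embeddingAdapters.get(id);
  if (!adapter) {
    // Same message as in the trace: the ID was never registered.
    throw new Error(`Unknown memory embedding provider: ${id}`);
  }
  return adapter;
}
```

In a design like this, adding an entry under models.providers never touches the registry, which is why only a registered ID resolves.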

What to do instead

If your server exposes an OpenAI-compatible /v1/embeddings endpoint, use the built-in openai embedding provider and point memory search at your adapter with memorySearch.remote:

#
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "openai",
        model: "llama-text-embed-v2",
        fallback: "none",
        remote: {
          baseUrl: "http://pineconeadapter:11434/v1",
          apiKey: "env:PINECONEADAPTER_API_KEY"
        }
      }
    }
  }
}
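
If you want to sanity-check the adapter independently of OpenClaw, a request like the following should succeed against it. This is a minimal sketch: `buildEmbeddingsRequest` is a hypothetical helper, and the request shape follows the public OpenAI embeddings API (a POST with `model` and `input`), not anything read from OpenClaw's source.

```typescript
interface EmbeddingsHttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildEmbeddingsRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  input: string[],
): EmbeddingsHttpRequest {
  return {
    // baseUrl already ends in /v1, so only the resource path is appended.
    url: `${baseUrl.replace(/\/+$/, "")}/embeddings`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, input }),
  };
}

const probe = buildEmbeddingsRequest(
  "http://pineconeadapter:11434/v1",
  "actual-key-here", // placeholder key
  "llama-text-embed-v2",
  ["hello"],
);
// In Node 18+ this could be sent with:
//   fetch(probe.url, { method: probe.method, headers: probe.headers, body: probe.body })
```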
#

Important bit

Your current:

models.providers.pineconeadapter

does not automatically create a memory embedding provider named pineconeadapter.

That config is for model/provider transport config, but memory embeddings use a separate adapter registry.

If you really want provider: "pineconeadapter"

Then you need a plugin that registers a memory embedding provider adapter for that ID. Config alone is not enough.

Quick verify steps

north fog
#

It's working now with the agents.defaults.memorySearch.remote settings

node@ed58f547b964:/app$ openclaw memory status

🦞 OpenClaw 2026.4.16 (unknown)
   Type the command with confidence—nature will provide the stack trace if needed.

Memory Search (main)
Provider: openai (requested: openai)
Model: llama-text-embed-v2
Sources: memory, sessions
Indexed: 264/553 files · 2763 chunks
Dirty: no
Store: ~/.openclaw/memory/main.sqlite
Workspace: ~/.openclaw/workspace
Dreaming: off
By source:
  memory · 235/461 files · 1443 chunks
  sessions · 29/92 files · 1320 chunks
Vector: ready
Vector dims: 2048
Vector path: /app/node_modules/sqlite-vec-linux-x64/vec0.so
FTS: ready
Embedding cache: enabled (2528 entries)
Cache cap: 50000
Batch: disabled (failures 0/2)
Recall store: 0 entries · 0 promoted · 0 concept-tagged · 0 spaced
Recall path: ~/.openclaw/workspace/memory/.dreams/short-term-recall.json
Dreaming artifacts: diary absent · 0 corpus files · ingestion state absent
Dream corpus: ~/.openclaw/workspace/memory/.dreams/session-corpus
Dream ingestion: ~/.openclaw/workspace/memory/.dreams/session-ingestion.json
elfin aspenBOT
#

Nice, glad that did it 🎉

So the winning setup is:

  • memorySearch.provider: "openai"
  • custom endpoint under agents.defaults.memorySearch.remote
  • your adapter serves an OpenAI-compatible /v1/embeddings

That status output looks healthy: vector ready, 2048 dims, cache working, indexing happening.

Tiny nuance for future readers: models.providers.<custom-id> alone does not make a memory embedding provider. For memory search, the custom endpoint needs to go under memorySearch.remote unless you ship a plugin that registers a real embedding adapter.

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for the next person.

north fog
#

openclaw memory status --deep reported this

Embeddings: unavailable
Embeddings error: openai embeddings failed: 401 {"error":"Unauthorized"}

How do I set the auth-key?

elfin aspenBOT
#

Ah, I think I see it.

A 401 Unauthorized here usually means the endpoint is reachable, but the key being sent is wrong.

And in your config, this is the likely culprit:

apiKey: "env:PINECONEADAPTER_API_KEY"

For memorySearch.remote.apiKey, that is not the env-ref format OpenClaw resolves. It will likely be treated as the literal string env:PINECONEADAPTER_API_KEY, which would explain the 401.

Use one of these instead

Best, explicit SecretRef

#
remote: {
  baseUrl: "http://pineconeadapter:11434/v1",
  apiKey: {
    source: "env",
    provider: "default",
    id: "PINECONEADAPTER_API_KEY"
  }
}

Or env-template shorthand

remote: {
  baseUrl: "http://pineconeadapter:11434/v1",
  apiKey: "${PINECONEADAPTER_API_KEY}"
}

Or just hardcode it temporarily to verify

remote: {
  baseUrl: "http://pineconeadapter:11434/v1",
  apiKey: "actual-key-here"
}
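
The three forms above can be compared with a small resolver sketch. This is not OpenClaw's implementation: `resolveApiKey` and the `SecretRef` type are hypothetical, and the sketch only illustrates why an `env:`-prefixed string would pass through as a literal value.

```typescript
type SecretRef = { source: "env"; provider: string; id: string };

function resolveApiKey(
  value: string | SecretRef,
  env: Record<string, string | undefined>,
): string {
  if (typeof value !== "string") {
    // Explicit SecretRef: look the variable up by its id.
    const fromEnv = env[value.id];
    if (fromEnv === undefined) throw new Error(`Missing env var ${value.id}`);
    return fromEnv;
  }
  // "${NAME}" shorthand: the whole string must be a single template.
  const template = /^\$\{([A-Z0-9_]+)\}$/.exec(value);
  if (template) {
    const varName = template[1] ?? "";
    const fromEnv = env[varName];
    if (fromEnv === undefined) throw new Error(`Missing env var ${varName}`);
    return fromEnv;
  }
  // Anything else, including "env:NAME", is used as-is.
  return value;
}
```

With this logic, `"env:PINECONEADAPTER_API_KEY"` is sent verbatim as the bearer token, which would produce exactly the kind of 401 seen above.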

Important nuance

That env var must exist in the OpenClaw process/container environment, not just your interactive shell.

So if you are running in Docker, make sure the container actually has:

#
PINECONEADAPTER_API_KEY=...

One more thing

OpenClaw sends auth as:

Authorization: Bearer <apiKey>

If your adapter expects a different header, use:

remote: {
north fog
elfin aspenBOT
#

Perfect, nice catch 🎉

So the final fix was:

remote: {
  baseUrl: "http://pineconeadapter:11434/v1",
  apiKey: {
    source: "env",
    provider: "default",
    id: "PINECONEADAPTER_API_KEY"
  }
}

So yeah, the issue was the auth ref format, not Pinecone itself.


north fog
#

I have generous rate limits on my embedding provider, but I occasionally get rate-limited.

It seems that indexing starts from scratch whenever I run "openclaw memory index".

https://pastebin.com/VPKYSmAM

The indexed count seems to remain the same, even though "openclaw memory index" successfully indexed 50 files.

elfin aspenBOT
#

Let me verify whether memory index is supposed to be atomic here, because the unchanged counts after a failed run smell intentional, not random.

#

Yeah, this looks intentional, not random.

I checked the memory-core path, and when a full reindex is needed, openclaw memory index does a safe atomic rebuild:

  • it builds a temporary SQLite index
  • only swaps it into place after the whole run succeeds
  • if embeddings hit 429 mid-run, it keeps the old index and throws away the temp one

So the behavior you saw makes sense:

  • progress can say it indexed ~50 files
  • then a 429 happens
  • memory status still shows the old committed counts
  • next run can look like it is starting over

Important nuance

#

The embedding cache is seeded from the old committed DB, so already-committed embeddings can be reused.

But embeddings fetched during the failed temp rebuild are lost with that temp DB, so there is no checkpoint resume for the failed part right now.
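
The build-then-swap behavior described above can be sketched in memory like this. It is a simplified illustration, not the memory-core code: `rebuildAtomically` and `IndexSnapshot` are hypothetical names, and keying reuse by file path is an assumption made to keep the example short.

```typescript
type IndexSnapshot = Map<string, number[]>; // file path -> embedding

// Hypothetical: embedFile may throw, e.g. on a 429 that outlives the retries.
function rebuildAtomically(
  committed: IndexSnapshot,
  files: string[],
  embedFile: (path: string) => number[],
): IndexSnapshot {
  const temp: IndexSnapshot = new Map();
  try {
    for (const path of files) {
      // Reuse embeddings that were already committed (the cache seeding step).
      const cached = committed.get(path);
      temp.set(path, cached ?? embedFile(path));
    }
  } catch {
    // Failure: discard the temp index and keep serving the old one.
    return committed;
  }
  // Success: the temp index replaces the committed one in a single step.
  return temp;
}
```

Note how a failure on the last file loses every fresh embedding computed in `temp`, which matches the "starts over" feeling between runs.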

Rate limit mitigation options

  1. If your adapter can support OpenAI batch embeddings, try:
    remote: {
      batch: {
        enabled: true,
        concurrency: 1,
        wait: true
      }
    }
    
    Docs say batch indexing exists for openai, gemini, and voyage.
  2. Backfill in smaller chunks

    • temporarily disable "sessions" as a source, or
    • reduce indexed paths / extra paths,
    • let memory finish, then add more
  3. Feature gap
    I do not see a documented knob for lowering the normal non-batch embedding concurrency. The non-batch path is fixed in code, so if resumable reindex or lower non-batch concurrency would help, that is worth a GitHub issue:
    https://github.com/openclaw/openclaw/issues

Docs:

If you want, I can help sketch the best config for your adapter, either “slow and reliable” or “fastest possible without tripping 429s”.

north fog
#

Indexed: 265/619 files · 2942 chunks
I thought OpenClaw would just index the files incrementally until all files are indexed, and then index only new files. That's the most sane implementation.

elfin aspenBOT
#

Yeah, you’re not wrong. For a rate-limited remote embedder, eventual incremental catch-up would be the saner UX.

What OpenClaw does today is a bit different:

  • it wants a consistent full snapshot for a given provider/model/config
  • so when a full reindex is required, it rebuilds in a temp DB
  • partial progress is not committed
  • only after that full build succeeds does it go back to mostly dirty-file incremental updates

So the current behavior is basically:

  • after a successful baseline for the current config, incremental updates work as you’d expect
  • before that baseline succeeds, retries do not march 265 → 266 → 267 ... across runs

That is why it feels wrong here.

Also, a full reindex gets triggered by things like:

#
  • provider change
  • embedding model change
  • source set change (memory / sessions)
  • chunking/scope changes

So if you changed any of those recently, OpenClaw is trying to establish a fresh baseline, not continue a partial one.
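
One way to picture those triggers is a fingerprint comparison. The sketch below is hypothetical: `IndexFingerprint`, `needsFullReindex`, and the `chunkTokens` field are stand-ins I made up for illustration, not names from the OpenClaw source.

```typescript
interface IndexFingerprint {
  provider: string;
  model: string;
  sources: string[];
  chunkTokens: number; // hypothetical stand-in for chunking/scope settings
}

function needsFullReindex(
  stored: IndexFingerprint | null,
  current: IndexFingerprint,
): boolean {
  // No committed baseline yet: a full build is always required.
  if (stored === null) return true;
  // Compare source sets independent of their order in the config.
  const sameSources =
    stored.sources.slice().sort().join(",") ===
    current.sources.slice().sort().join(",");
  return !(
    stored.provider === current.provider &&
    stored.model === current.model &&
    sameSources &&
    stored.chunkTokens === current.chunkTokens
  );
}
```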

Honestly, I think “atomic full rebuild + no resume” is defensible for correctness, but pretty rough for remote 429-prone providers. This is worth a GitHub issue as a feature request for resumable baseline indexing / lower concurrency:
https://github.com/openclaw/openclaw/issues

Practical workaround for now:

  1. index only memory first
  2. let that finish
  3. then re-enable sessions

If you want, I can help you write the exact minimal config for that staged rollout.

north fog
#

I'm using Mistral, and it seems to support batching. How do I trigger the indexing? Is it just "openclaw memory index --force"?

elfin aspenBOT
#

Yep, basically.

If you want batch indexing, first enable it in memorySearch.remote, for example:

remote: {
  baseUrl: "http://pineconeadapter:11434/v1",
  apiKey: {
    source: "env",
    provider: "default",
    id: "PINECONEADAPTER_API_KEY"
  },
  batch: {
    enabled: true,
    concurrency: 1,
    wait: true
  }
}

Then trigger the rebuild with:

openclaw memory index --force --verbose

A couple of nuances:

  • --force is the safest way to make it rebuild using the new batch settings.
  • --verbose helps confirm it is actually taking the batch path.
  • I’d start with concurrency: 1 since you were hitting rate limits.
#

After that, check:

openclaw memory status

and look for the Batch: line.

One important caveat: OpenClaw’s docs say batch indexing is available for the openai / gemini / voyage embedding paths. So this works if your current setup is effectively using the openai provider path against a batch-capable endpoint. If your server only supports Mistral-native batching and not the OpenAI-style batch flow, OpenClaw will not magically know that.

Docs: https://docs.openclaw.ai/reference/memory-config#batch-indexing

If you want, paste your final memorySearch block and I’ll sanity-check the batch config before you kick off the force reindex.

north fog
#

markus@staropramen /data/code/docker/openclaw (git)-[main] % jq -r .agents.defaults.memorySearch ~/.openclaw/openclaw.json
{
"sources": [
"memory",
"sessions"
],
"experimental": {
"sessionMemory": true
},
"provider": "openai",
"remote": {
"baseUrl": "https://omniroute.vhost.harre.dynv6.net/v1",
"apiKey": {
"source": "env",
"provider": "default",
"id": "OMNIROUTE_API_KEY"
},
"batch": {
"enabled": true,
"concurrency": 1,
"wait": true
}
},
"fallback": "none",
"model": "mistral/mistral-embed",
"query": {
"hybrid": {
"enabled": true,
"temporalDecay": {
"enabled": true,
"halfLifeDays": 30
}
}
},
"cache": {
"enabled": true,
"maxEntries": 50000
}
}

Batching seemed to fail too 🙁
https://pastebin.com/vKvNS4vP

elfin aspenBOT
#

Yep, the batch trigger did happen.

The useful line is this one:

openai batch file upload failed: 404 Server action not found

So what happened was:

  1. openclaw memory index --force --verbose correctly tried batch mode
  2. OpenClaw’s openai batch path tried the OpenAI-style batch flow
  3. your endpoint returned 404 on the batch file upload step
  4. OpenClaw disabled batch after 2 failures, then fell back to normal embeddings
  5. normal embeddings still hit 429

So the problem is not the command. The problem is that your endpoint does not support the OpenAI batch API shape OpenClaw expects.

#

What OpenClaw is expecting for openai batch

At minimum, the endpoint needs the OpenAI-style batch flow, including things like:

  • POST /v1/files with purpose=batch
  • POST /v1/batches
  • batch status/output retrieval

Your log shows it already died on /v1/files.
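
For reference, this is roughly what an adapter would need to accept. The shapes follow the public OpenAI Batch API (a JSONL input file with one request per line, then a batch created from the uploaded file ID); whether OpenClaw's payloads match this field for field is an assumption, and `toBatchJsonl` and `toBatchCreateBody` are hypothetical helper names.

```typescript
interface BatchLine {
  custom_id: string;
  method: "POST";
  url: "/v1/embeddings";
  body: { model: string; input: string };
}

// Content of the file uploaded via POST /v1/files with purpose=batch.
function toBatchJsonl(
  model: string,
  chunks: { id: string; text: string }[],
): string {
  return chunks
    .map(
      (chunk): BatchLine => ({
        custom_id: chunk.id,
        method: "POST",
        url: "/v1/embeddings",
        body: { model, input: chunk.text },
      }),
    )
    .map((line) => JSON.stringify(line))
    .join("\n");
}

// Body for POST /v1/batches once the file upload returned an ID.
function toBatchCreateBody(inputFileId: string) {
  return {
    input_file_id: inputFileId,
    endpoint: "/v1/embeddings",
    completion_window: "24h",
  };
}
```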

What this means for your setup

Because you are using:

provider: "openai"
model: "mistral/mistral-embed"

OpenClaw is using the OpenAI provider path, even though the model behind it is Mistral.

So “Mistral supports batching” is only useful here if your adapter exposes that support through the OpenAI-compatible batch endpoints OpenClaw expects.

Recommendation

For now, I’d do one of these:

Option 1, recommended now

Turn batch back off for this endpoint:

#
remote: {
  batch: {
    enabled: false
  }
}

and use the staged rollout workaround (memory first, then sessions).

Option 2

Add OpenAI-compatible batch support to your adapter/proxy.

Option 3

If you want native Mistral batch support in memory indexing, that sounds like a feature request / enhancement:
https://github.com/openclaw/openclaw/issues

#

So short version: yes, you used the right command, batch really was triggered, but your server does not implement the batch API OpenClaw is trying to call.

If you want, I can help you infer the exact missing endpoints your adapter needs from the current OpenClaw batch flow.

north fog
elfin aspenBOT
#

Yep, exactly.

That’s the mismatch.

OpenClaw is currently on the openai memory batch path, so it expects the OpenAI batch contract. Mistral’s native batch API is different, so OpenClaw will not automatically use it just because the underlying model is Mistral.

So right now you have two realistic options:

  1. Adapter/proxy translation layer
    Make your endpoint expose the OpenAI-style batch endpoints OpenClaw expects, and translate those to Mistral’s batch API internally.

  2. Native OpenClaw support for Mistral batch
    That would need a feature/addition in OpenClaw itself, not just config:
    https://github.com/openclaw/openclaw/issues

So your diagnosis is right: Mistral batching exists, but it is not the same API that OpenClaw’s current openai batch mode is calling.

#

If you control the adapter, the pragmatic path is probably:

  • keep non-batch mode for now, or stage indexing
  • later add an OpenAI-batch compatibility shim in front of Mistral batch

If you want, I can help sketch what that shim would need to implement.

north fog
#

I'll think about it later, but now it seems that it doesn't die during indexing.

It just keeps getting rate-limited but then continues the process without a hard fault.

18:18:40 [memory] embeddings: batch start
18:18:40 [memory] embeddings rate limited; retrying in 547ms
18:18:41 [memory] embeddings: batch start
18:18:41 [memory] embeddings: batch start
18:18:41 [memory] embeddings rate limited; retrying in 570ms
18:18:42 [memory] embeddings: batch start
18:18:43 [memory] embeddings: batch start
18:18:43 [memory] embeddings: batch start
18:18:43 [memory] embeddings rate limited; retrying in 578ms
18:18:44 [memory] embeddings: batch start
Indexing memory files (batch)... 153/521 · elapsed 6:23 · eta 15:21 29%

elfin aspenBOT
#

Yeah, that’s a much better sign.

That means the retry/backoff path is actually doing its job now:

  • it hits 429
  • waits a bit
  • retries
  • keeps moving instead of hard-failing immediately

So I’d let that run.

The important milestone is just getting one full successful baseline build for this config. After that, the day-to-day behavior should be a lot less painful because OpenClaw can stay on incremental updates instead of constantly trying to rebuild from zero.

If it does eventually hard-fault again, the interesting bit will be the final error after the last retry, not the intermediate rate limited; retrying ... lines.

north fog
#

18:22:28 [memory] embeddings: batch start
18:22:29 [memory] embeddings: batch start
18:22:29 [memory] embeddings: batch start
18:22:29 [memory] embeddings rate limited; retrying in 520ms
18:22:30 [memory] embeddings: batch start
18:22:30 [memory] embeddings: batch start
18:22:31 [memory] embeddings rate limited; retrying in 533ms
18:22:31 [memory] embeddings: batch start
18:22:33 [memory] embeddings: batch start
18:22:33 [memory] embeddings: batch start
18:22:33 [memory] embeddings rate limited; retrying in 500ms
18:22:34 [memory] embeddings: batch start
18:22:34 [memory] embeddings rate limited; retrying in 1151ms
18:22:35 [memory] embeddings: batch start
18:22:36 [memory] embeddings rate limited; retrying in 2348ms
18:22:38 [memory] embeddings: batch start
Memory index failed (main): openai embeddings failed: 429 {"error":{"object":"error","message":"Rate limit exceeded","type":"rate_limited","param":null,"code":"1300","raw_status_code":429}}
node@f2d0dedcbf4a:/app$

elfin aspenBOT
#

Ah, damn, yeah, that means it eventually exhausted the retry budget and then hard-failed the whole reindex.

So the behavior is:

  • short bursts of 429 are tolerated
  • OpenClaw backs off and retries
  • but if one embedding batch keeps getting 429 after the built-in retries, the full index run dies

Given your setup, I think the pragmatic options are now:

  1. Throttle/queue in your adapter

    • probably the best fix, honestly
    • smooth the burst from OpenClaw so your upstream/provider stops returning hard 429s
  2. Stage the rollout

    • index only memory first
    • then re-enable sessions
  3. Feature gap in OpenClaw
    • resumable full reindex
    • lower non-batch embedding concurrency
    • native Mistral batch support

That combination is worth a GitHub issue:
https://github.com/openclaw/openclaw/issues

Since you control the adapter, I’d lean toward adapter-side queueing/backpressure first. Right now OpenClaw is basically saying, “I can retry a bit, but I won’t babysit a long series of 429s forever.”
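
As one possible shape for that adapter-side throttling, here is a minimal token bucket sketch. It is a generic illustration and makes no claim about OmniRoute's or OpenClaw's internals; the class name and parameters are hypothetical, and the clock is passed in so the logic stays easy to test.

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full so short bursts pass through
    this.lastRefillMs = nowMs;
  }

  // Returns 0 if a request may proceed now, else the ms to wait before retrying.
  take(nowMs: number): number {
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond,
    );
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) / this.refillPerSecond) * 1000);
  }
}
```

The adapter would hold the incoming request for the returned number of milliseconds instead of forwarding it upstream and relaying a 429.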

If you want, I can help you think through the cleanest adapter behavior here, like queue vs token bucket vs delayed 200 response.

north fog
#

It seems when the retry goes over 2348ms it fails, can I set a higher max time?

elfin aspenBOT
#

Not via config, at least not currently.

I checked the memory-core source, and those retry values are hardcoded right now:

  • base delay: 500ms
  • max attempts: 3
  • max backoff cap: 8000ms

So the 2348ms is not the failure threshold by itself. It is just the last jittered backoff you happened to see before the request finally gave up.

What is actually happening is:

  • request gets 429
  • retry loop backs off a few times
  • if it is still 429 after the allowed attempts, the index run fails
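
A retry loop with those values could look roughly like the sketch below. The constant names mirror the ones in memory-core, but the formula is an assumption: exponential doubling with up to about 20% jitter is inferred from delays such as 520ms, 1151ms, and 2348ms in your logs, and the exact attempt accounting in the real loop may differ by one.

```typescript
const EMBEDDING_RETRY_MAX_ATTEMPTS = 3;
const EMBEDDING_RETRY_BASE_DELAY_MS = 500;
const EMBEDDING_RETRY_MAX_DELAY_MS = 8000;

// Delay before retry number `retryIndex` (0-based); jitter is a value in [0, 1).
function backoffDelayMs(retryIndex: number, jitter: number): number {
  const exponential = EMBEDDING_RETRY_BASE_DELAY_MS * Math.pow(2, retryIndex);
  const capped = Math.min(exponential, EMBEDDING_RETRY_MAX_DELAY_MS);
  // Assumption: up to 20% random jitter on top of the capped delay.
  return Math.round(capped * (1 + 0.2 * jitter));
}

async function withRetry<T>(
  operation: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const lastAttempt = attempt + 1 >= EMBEDDING_RETRY_MAX_ATTEMPTS;
      // Non-retryable errors and exhausted budgets surface to the caller.
      if (lastAttempt || !isRetryable(error)) throw error;
      await sleep(backoffDelayMs(attempt, Math.random()));
    }
  }
}
```

Seen this way, 2348ms is simply the third step of the doubling sequence plus jitter, not a limit in itself.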

So if you want more retries / longer waits, I do not see a user-facing config knob for that in current OpenClaw.

#

Also, remote.batch.timeoutMinutes is a different thing. That is for waiting on async batch jobs, not these per-request 429 retries.

So the practical answer is:

  • No, not from config today
  • if you need that behavior, it needs either
    • adapter-side throttling/queueing, or
    • a code change / feature request in OpenClaw

GitHub issue path: https://github.com/openclaw/openclaw/issues

If you want, I can help you phrase the exact feature request, something like: “make memory embedding retry/backoff configurable for remote providers.”

north fog
#

yeah, help me write the issue

elfin aspenBOT
#

Yep, here’s a clean draft you can paste into a GitHub issue:

Title
Configurable retry/backoff and concurrency for remote memory embedding indexing

Body

## Summary

When `memorySearch` uses a remote embedding endpoint, `openclaw memory index` can fail on `429` rate limits even though the endpoint would succeed if OpenClaw backed off longer and/or used lower concurrency.

I could not find a user-facing way to tune the retry/backoff behavior for memory embedding requests, and I also could not find a knob for lowering the normal non-batch embedding concurrency.

This is especially painful because full reindexes are atomic, so if the run eventually fails, the previous index is kept and the temp rebuild progress is discarded.

## Version
#

Observed on:

- OpenClaw `2026.4.19-beta.2`

## Config

Sanitized config:

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        sources: ["memory", "sessions"],
        experimental: {
          sessionMemory: true
        },
        provider: "openai",
        model: "mistral/mistral-embed",
        remote: {
          baseUrl: "https://<custom-endpoint>/v1",
          apiKey: {
            source: "env",
            provider: "default",
            id: "OMNIROUTE_API_KEY"
          }
        },
        fallback: "none",
        cache: {
          enabled: true,
          maxEntries: 50000
        }
      }
    }
  }
}
```


## Reproduction

Run:

```bash
openclaw memory index --force --verbose
```

Against a remote embedding endpoint that occasionally returns 429 under bursty indexing load.

#
## Actual behavior

OpenClaw retries a few times, but eventually aborts the whole index run with `429`.

Example logs:

```text
18:22:29 [memory] embeddings rate limited; retrying in 520ms
18:22:31 [memory] embeddings rate limited; retrying in 533ms
18:22:33 [memory] embeddings rate limited; retrying in 500ms
18:22:34 [memory] embeddings rate limited; retrying in 1151ms
18:22:36 [memory] embeddings rate limited; retrying in 2348ms
18:22:38 [memory] embeddings: batch start
Memory index failed (main): openai embeddings failed: 429 {"error":{"object":"error","message":"Rate limit exceeded","type":"rate_limited","param":null,"code":"1300","raw_status_code":429}}
```
#

## Expected behavior

At least one of these:

1. Expose configurable retry/backoff for remote memory embedding requests
2. Expose configurable non-batch embedding concurrency
3. Offer a slower / conservative indexing mode for rate-limited remote providers

## Why this matters

Custom/self-hosted embedding adapters often have stricter upstream limits than the default hosted providers.

The current fixed retry budget is sometimes too small, and because full reindex is atomic, a single exhausted retry can kill the whole run.
#
## Possible config ideas

Something like:

```json5
memorySearch: {
  remote: {
    retry: {
      maxAttempts: 8,
      baseDelayMs: 1000,
      maxDelayMs: 30000
    },
    concurrency: 1
  }
}
```
#

The exact shape is not important, but some user-facing control here would make remote embedding backfills much more practical.

If you want, I can also make this a bit more “bug report” or a bit more “feature request”, depending on how you want to frame it.

elfin aspenBOT
#

Nice, that’s the right move.

Issue and PR, even better. If maintainers ask for follow-up, the most useful extra context is probably:

  • the memorySearch config shape
  • the 429 retry/fail log snippet
  • the note that full reindex is atomic, so failed runs discard temp progress

Looks like you’ve already done the hard part though. Nice work.

north fog
#

Give me a git patch to change to 10 maxAttempts and maxDelay 60000

elfin aspenBOT
#
diff --git a/extensions/memory-core/src/memory/manager-embedding-ops.ts b/extensions/memory-core/src/memory/manager-embedding-ops.ts
index 5e4a8c7..9a2b4f1 100644
--- a/extensions/memory-core/src/memory/manager-embedding-ops.ts
+++ b/extensions/memory-core/src/memory/manager-embedding-ops.ts
@@ -44,9 +44,9 @@ const VECTOR_TABLE = "chunks_vec";
 const FTS_TABLE = "chunks_fts";
 const EMBEDDING_CACHE_TABLE = "embedding_cache";
 const EMBEDDING_BATCH_MAX_TOKENS = 8000;
 const EMBEDDING_INDEX_CONCURRENCY = 4;
-const EMBEDDING_RETRY_MAX_ATTEMPTS = 3;
+const EMBEDDING_RETRY_MAX_ATTEMPTS = 10;
 const EMBEDDING_RETRY_BASE_DELAY_MS = 500;
-const EMBEDDING_RETRY_MAX_DELAY_MS = 8000;
+const EMBEDDING_RETRY_MAX_DELAY_MS = 60000;
 const EMBEDDING_QUERY_TIMEOUT_REMOTE_MS = 60_000;
 const EMBEDDING_QUERY_TIMEOUT_LOCAL_MS = 5 * 60_000;
 const EMBEDDING_BATCH_TIMEOUT_REMOTE_MS = 2 * 60_000;
north fog
#

Ah, it's that easy. I'm rebuilding my container now and will try it again.

elfin aspenBOT
#

Yep, pretty small change.

Just one caution, it should make the indexer more patient, but it won’t fix the underlying rate limit behavior by itself. If the provider keeps returning 429 long enough, it can still eventually fail, just later.

Still, I’m glad this is easy to test. Hope the rebuild does the trick 🤞

north fog
#

It almost succeeded, on the last few files I got this error

19:57:49 [memory] embeddings: batch start
19:57:49 [memory] embeddings rate limited; retrying in 2276ms
19:57:52 [memory] embeddings: batch start
Memory index failed (main): openai embeddings failed: 400 {"error":{"object":"error","message":"Invalid JSON payload","type":"invalid_request_error","param":null,"code":null,"raw_status_code":400}}

elfin aspenBOT
#

Oof, that’s frustrating, but this is actually a different class of failure now.

429 was retry/backoff territory.
This 400 Invalid JSON payload means one request body got rejected outright, so OpenClaw will not retry it.

Most likely possibilities:

  • one specific file/chunk has content your adapter/upstream parser chokes on
  • your adapter has a JSON parsing bug on some escaped content
  • less likely, one request shape/size edge case is being misreported as “invalid JSON”

What I’d do next

  1. Run with --verbose and watch the last file/path before the 400
    openclaw memory index --force --verbose
    
  2. Split sources to isolate it

    • try memory only
    • then sessions only

    If one succeeds and the other fails, you’ve narrowed it fast.

  3. Log the raw request body in your adapter on 400
    Since this is your custom endpoint, that is probably the fastest truth source.

Practical workaround

Since you were almost done, I’d probably finish the easier baseline first:

  • temporarily disable sessions
  • get memory indexed successfully
  • then debug the bad session/file later
#

So yeah, good news and bad news:

  • good news, the longer retry patch clearly helped
  • bad news, there is probably one offending payload/file left to isolate

If you want, I can help you turn your current config into a quick “memory only first, sessions later” rollout.

north fog
#

I did --verbose already, how to try memory only

elfin aspenBOT
#

Easiest way is to temporarily change:

agents.defaults.memorySearch.sources

from:

["memory", "sessions"]

to:

["memory"]
#

Then run:

openclaw memory index --force --verbose

You can leave this as-is:

experimental: {
  sessionMemory: true
}

The important part is just that sources only contains "memory" for this test.

Then later, to test sessions separately, switch it to:

#
["sessions"]

or back to:

["memory", "sessions"]

So for this exact check, you want:

memorySearch: {
  sources: ["memory"],
  experimental: {
    sessionMemory: true
  },
  provider: "openai",
  remote: { ... },
  model: "mistral/mistral-embed",
  ...
}

If you want, paste the whole updated memorySearch block and I’ll sanity-check it before you rerun.

north fog
#

Is there any background job that does the indexing? I now see this output in the ordinary OpenClaw log in Docker:
claw | 2026-04-19T20:21:38.760+00:00 [memory] embeddings rate limited; retrying in 551ms
claw | 2026-04-19T20:21:40.251+00:00 [memory] embeddings rate limited; retrying in 552ms
claw | 2026-04-19T20:21:43.040+00:00 [memory] embeddings rate limited; retrying in 578ms
claw | 2026-04-19T20:21:44.720+00:00 [memory] embeddings rate limited; retrying in 520ms

elfin aspenBOT
#

Yes, there can be background indexing activity, but it is not a separate “cron job” in the simple sense.

For builtin memory:

  • memory file changes trigger a debounced reindex
  • session indexing is async when enabled
  • and that work runs inside the normal long-running OpenClaw process, so you’ll see it in the regular Docker logs

In your case, this is the likely reason:

sources: ["memory", "sessions"],
experimental: {
  sessionMemory: true
}
#

So with sessions enabled, OpenClaw can be indexing transcripts in the background.

Since your status showed Dreaming: off, it is probably not dreaming-related background work. More likely it is normal memory/session sync.

If you want to quiet this down for testing, the cleanest temporary setup is:

sources: ["memory"]

and rerun after restart / reload. Then if those background memory retry lines stop, you’ve confirmed it was session indexing.

north fog
#

This started when I removed sessions from sources.

elfin aspenBOT
#

Ah, yep, that fits.

Changing:

sources: ["memory", "sessions"]

to:

sources: ["memory"]

changes the memory index scope, so OpenClaw will auto-reindex in the running process.

So those background log lines are probably the gateway noticing:

#
  • source set changed
  • old index contains session data
  • it needs to rebuild the index for the new scope

So in this case, the background indexing is very likely because of the config change itself, not because sessions are still being indexed.

That also means your “memory only” test can get polluted by the auto-rebuild from the running gateway if you are editing config while the container is live.

If you want a cleaner test, the least confusing flow is:

  1. change config to sources: ["memory"]
  2. restart the container / gateway
  3. then run openclaw memory index --force --verbose

That way you know exactly which process is doing the indexing.

north fog
#

When I revert back to
sources: ["memory", "sessions"]

will it invalidate all the indexing done for memory and start from scratch?

elfin aspenBOT
#

Sort of, but not in the worst possible way.

Switching back to:

sources: ["memory", "sessions"]

does trigger a full reindex, because the index scope changed.

So yes, OpenClaw will rebuild the index.

But if:

  • the provider stays the same
  • the model stays the same
  • and cache.enabled: true
#

then the already-committed memory embeddings can be reused from the embedding cache during that rebuild.

So the practical behavior should be:

  • index gets rebuilt
  • memory rows are not necessarily re-embedded from zero
  • session chunks still need fresh embeddings

So I’d think of it as:

  • logical full rebuild: yes
  • throw away all prior memory embedding work and redo it all: not necessarily, assuming the memory-only run completed successfully and got committed first

That successful commit part matters. If the memory-only run fails before commit, the temp work is lost.
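
A sketch of why provider and model have to stay the same for reuse: if cache entries are keyed by provider, model, and a content hash (an assumption on my part, the thread only confirms that both must be unchanged), any change to either produces different keys and therefore cache misses. `embeddingCacheKey` is a hypothetical name, and FNV-1a is used only as a dependency-free stand-in for a real content hash.

```typescript
// FNV-1a (32-bit), a stand-in for whatever content hash the real cache uses.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical cache key: unchanged text hits only if provider and model match.
function embeddingCacheKey(
  provider: string,
  model: string,
  chunkText: string,
): string {
  return [provider, model, fnv1a(chunkText)].join(":");
}
```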

north fog
#

Alright, I'll let it try to embed everything then, and work on getting batch support implemented in omniroute.

elfin aspenBOT
#

That sounds like the right call.

Let the current run squeeze through if it can, then proper batch support in omniroute is probably the real long-term fix here.

You’ve already done the annoying detective work, which is honestly the hard part. If you get stuck on the OpenAI-batch compatibility layer later, feel free to post the endpoint shape you’re planning and I can sanity-check it.

north fog
#

I've made OmniRoute implement OpenAI batching. How do I control how many items OpenClaw sends per batch?

elfin aspenBOT
#

Not via config today.

What OpenClaw currently controls in config is only:

  • remote.batch.enabled
  • remote.batch.concurrency
  • remote.batch.wait
  • remote.batch.pollIntervalMs
  • remote.batch.timeoutMinutes

But items per batch is not exposed.

Right now the batching logic is:

  • pack chunks until a hardcoded budget is reached
  • the hardcoded budget is EMBEDDING_BATCH_MAX_TOKENS = 8000
  • so it is effectively size-based, not N-items-based
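
As a rough illustration of size-based packing, here is a minimal sketch. Only the 8000 budget comes from the description above; `estimate_tokens` (a crude 4-characters-per-token guess) and `pack_batches` are hypothetical and are not OpenClaw's real functions.

```python
EMBEDDING_BATCH_MAX_TOKENS = 8000  # budget quoted above


def estimate_tokens(text):
    # Assumption: crude estimate of roughly 4 characters per token.
    return max(1, len(text) // 4)


def pack_batches(chunks, max_tokens=EMBEDDING_BATCH_MAX_TOKENS):
    """Greedily fill a batch until adding the next chunk would exceed the budget."""
    batches, current, used = [], [], 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if current and used + cost > max_tokens:
            # Budget would overflow: close this batch and start a new one.
            batches.append(current)
            current, used = [], 0
        # An oversized chunk still gets a batch of its own.
        current.append(chunk)
        used += cost
    if current:
        batches.append(current)
    return batches
```

The number of items per batch falls out of the chunk sizes, which is why lowering the budget is the only lever for getting smaller batches.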
#

So the short answer is:

  • Can you directly say “send 10 items per batch”? No
  • Can you reduce parallel jobs? Yes, with remote.batch.concurrency
  • Can you change the batch packing limit? Only with a code patch right now

If you want, I can give you a tiny patch to lower the batch packing limit from 8000 to something smaller for your OmniRoute testing.

north fog
#

Is it possible to disable non-batched fallback if the batching fails?

elfin aspenBOT
#

No, I don’t see a config switch to disable that today.

memorySearch.fallback: "none" only disables provider fallback. It does not disable the batch → non-batch embeddings fallback. If batch embedding fails, memory-core currently falls back to inline/non-batch embeddings; after 2 batch failures it disables batch for that manager and keeps indexing inline.
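
A minimal sketch of that fallback behavior, assuming the two-failure threshold described above. `EmbeddingManager`, `batch_embed`, and `inline_embed` are hypothetical names for illustration, not memory-core's real API.

```python
BATCH_FAILURE_LIMIT = 2  # threshold described above


class EmbeddingManager:
    """Tries batch embedding first, falls back to inline on failure."""

    def __init__(self, batch_embed, inline_embed):
        self.batch_embed = batch_embed
        self.inline_embed = inline_embed
        self.batch_enabled = True
        self.batch_failures = 0

    def embed(self, chunks):
        if self.batch_enabled:
            try:
                return self.batch_embed(chunks)
            except Exception:
                self.batch_failures += 1
                if self.batch_failures >= BATCH_FAILURE_LIMIT:
                    # Stop attempting batch for the lifetime of this manager.
                    self.batch_enabled = False
        # Either batch failed just now or it was disabled earlier.
        return self.inline_embed(chunks)
```

A strict "batch or fail" mode would amount to re-raising in the `except` branch instead of falling through to the inline call.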

For your config, two related notes:

memorySearch: {
  provider: "openai",
  fallback: "none",
  model: "llama-text-embed-v2",
  remote: {
    baseUrl: "http://pineconeadapter:11434/v1",
    apiKey: "env:PINECONEADAPTER_API_KEY",
    batch: {
      enabled: true
    },
    nonBatchConcurrency: 1
  }
}
  • provider: "pineconeadapter" errors because pineconeadapter isn’t a registered memory embedding provider. If you keep a custom models.providers.pineconeadapter, its api needs to map to a real embedding adapter like "openai", not "openai-completions".
  • Only enable remote.batch.enabled if your adapter implements the OpenAI Batch API (/files, /batches, etc.). If it only implements OpenAI-compatible /v1/embeddings, leave remote.batch.enabled: false and tune remote.nonBatchConcurrency.

Strict “batch or fail, never inline fallback” would be a feature request today: https://github.com/openclaw/openclaw/issues

north fog
#

I've done a Wireshark capture and my endpoint returns this when OpenClaw tries to create the batch. Are there any issues?

GET /v1/batches/batch_c890031c5405491b93c43198 HTTP/1.1
host: 192.168.1.244:20128
connection: keep-alive
Content-Type: application/json
Authorization: Bearer XXXX
accept: */*
accept-language: *
sec-fetch-mode: cors
user-agent: undici
accept-encoding: gzip, deflate

HTTP/1.1 200 OK
vary: rsc, next-router-state-tree, next-router-prefetch, next-router-segment-prefetch
access-control-allow-headers: Content-Type, Authorization, x-api-key, anthropic-version, x-omniroute-connection, x-internal-test, accept
access-control-allow-methods: GET, POST, PUT, DELETE, OPTIONS
access-control-allow-origin: *
content-type: application/json
Date: Tue, 28 Apr 2026 22:00:18 GMT
Connection: keep-alive
Keep-Alive: timeout=5
Transfer-Encoding: chunked

{"id":"batch_c890031c5405491b93c43198","object":"batch","endpoint":"/v1/embeddings","errors":null,"input_file_id":"file-451596acfec94211894ff533","completion_window":"24h","status":"in_progress","output_file_id":null,"error_file_id":null,"created_at":1777413610,"in_progress_at":1777413618,"expires_at":null,"finalizing_at":null,"completed_at":null,"failed_at":null,"expired_at":null,"cancelling_at":null,"cancelled_at":null,"request_counts":{"total":1,"completed":0,"failed":0},"metadata":{"source":"openclaw-memory","agent":"main"},"model":null,"usage":null}

elfin aspenBOT
#

That capture looks fine as an intermediate batch-status response.

OpenClaw treats:

"status": "in_progress",
"output_file_id": null

as “keep polling” as long as remote.batch.wait is true. The important part is that it eventually returns:

{
  "status": "completed",
  "output_file_id": "file-..."
}
#

Then OpenClaw will call:

GET /v1/files/{output_file_id}/content

and expects OpenAI Batch-style JSONL, roughly:

{"custom_id":"0","response":{"status_code":200,"body":{"data":[{"embedding":[0.1,0.2]}]}}}
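
If you want to sanity-check your adapter's output file before pointing OpenClaw at it, here is a minimal sketch of a validator for that JSONL shape. `parse_batch_output` is a hypothetical helper, and it assumes one embedding per `custom_id`, as in the example line above.

```python
import json


def parse_batch_output(jsonl_text, expected_ids):
    """Map custom_id -> embedding, raising if any expected row is unusable."""
    embeddings = {}
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        row = json.loads(line)
        custom_id = row["custom_id"]
        response = row.get("response") or {}
        if response.get("status_code") != 200:
            raise ValueError(f"request {custom_id} failed: {response.get('status_code')}")
        data = (response.get("body") or {}).get("data") or []
        if not data or "embedding" not in data[0]:
            raise ValueError(f"request {custom_id} has no response.body.data[0].embedding")
        embeddings[custom_id] = data[0]["embedding"]
    missing = set(expected_ids) - set(embeddings)
    if missing:
        raise ValueError(f"missing custom_ids: {sorted(missing)}")
    return embeddings
```

Feed it the body of the `/content` response plus the `custom_id` values from the input file you received.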

Things that would break:

  • batch stays in_progress until remote.batch.timeoutMinutes
  • final completed response has no output_file_id
  • output JSONL is missing the same custom_ids OpenClaw submitted
  • output body doesn’t contain response.body.data[0].embedding
  • remote.batch.wait is false
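
The polling-related failure cases can be sketched as a small loop. This is a minimal illustration, not OpenClaw's actual poller: `poll_batch` and `get_status` are hypothetical, and the default interval and timeout values are placeholders rather than OpenClaw's defaults.

```python
import time


def poll_batch(get_status, wait=True, poll_interval_ms=2000,
               timeout_minutes=60, sleep=time.sleep, now=time.monotonic):
    """Return the output_file_id once the batch completes, or raise."""
    deadline = now() + timeout_minutes * 60
    while True:
        status = get_status()  # parsed JSON body of GET /v1/batches/{id}
        state = status.get("status")
        if state == "completed":
            if not status.get("output_file_id"):
                raise RuntimeError("batch completed without output_file_id")
            return status["output_file_id"]
        if state in ("failed", "expired", "cancelling", "cancelled"):
            raise RuntimeError(f"batch ended with status {state}")
        if not wait:
            raise RuntimeError("batch still in progress and waiting is disabled")
        if now() >= deadline:
            raise TimeoutError("batch did not complete before the timeout")
        sleep(poll_interval_ms / 1000)
```

Your captured response would land in the "keep polling" path here, so the thing to verify next is that a later poll returns `completed` with a non-null `output_file_id`.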

Also: your earlier config still needs the provider mapping fixed. For a custom OpenAI-compatible embedding/batch endpoint, use the OpenAI memory adapter owner, not openai-completions:

models: {
  providers: {
    pineconeadapter: {
      api: "openai",
      baseUrl: "http://pineconeadapter:11434/v1",
      apiKey: "env:PINECONEADAPTER_API_KEY",
      models: [{ id: "llama-text-embed-v2" }],
    },
  },
},