#whatsapp media


tame notch
#

I have connected a dedicated WhatsApp channel. I have one API to upload an image and another API to call after that. I have built MCP servers for both APIs with StdioServerTransport. How do I add these and call them when a user uploads media on WhatsApp?

obtuse dirgeBOT
#

Let me check the hook/trigger path too, because that's the deciding bit here: whether this can be truly automatic on inbound WhatsApp media, or only agent-driven during a normal turn.

#

Short version: add both MCP servers under mcp.servers, restart the gateway, then choose one of these flows:

1. Agent-driven flow (simplest if the LLM can decide)

Register your stdio MCP servers:

openclaw mcp set upload-image '{"command":"node","args":["/absolute/path/to/upload-image-mcp.js"]}'
openclaw mcp set second-api '{"command":"node","args":["/absolute/path/to/second-api-mcp.js"]}'
openclaw mcp list
openclaw mcp show upload-image --json

That only saves config. On embedded Pi runs, OpenClaw loads configured mcp.servers and exposes those MCP tools to the agent.

Then add an instruction in your agent prompt / AGENTS.md, for example:

When a WhatsApp message contains image media, first call the upload-image MCP tool, then call the second-api MCP tool with the returned id/url, then reply.

#

For WhatsApp, inbound media is saved and passed into the turn. Media-only bodies become placeholders like <media:image>, and the inbound file path is carried in the media context (MediaPath / MediaPaths, sandboxed as media/inbound/...).

2. Fully automatic flow on every media upload (recommended if this must always happen)

MCP config by itself is not the trigger.

If you want "user uploads image on WhatsApp -> always run API 1 -> then API 2", use a hook/plugin, typically message:preprocessed for media. That is OpenClaw's event path.

Important bit:

  • MCP servers = tools available to the agent
  • Hooks = automatic event trigger

So for deterministic processing, make a message:preprocessed hook and call your APIs directly there (cleanest), or build a plugin that invokes your MCP logic.

Recommendation

  • If "usually do this" is fine, use MCP + prompt instructions.
  • If "must run every single time media arrives" is required, use a hook/plugin.
tame notch
#

I want below configuration to be added

{
"mcpServers": {
"aayuvyaa-product-draft": {
"command": "node",
"args": ["/opt/store-ops/product-draft-mcp/src/server.js"],
"cwd": "/opt/store-ops/product-draft-mcp",
"env": {
"API_BASE_URL": "http://127.0.0.1:8080",
"REQUEST_TIMEOUT_MS": "15000"
}
}
}
}

where to add?

obtuse dirgeBOT
#

Add it to your OpenClaw config as mcp.servers, not as top-level mcpServers.

Correct OpenClaw shape

Usually in ~/.openclaw/openclaw.json:

{
  mcp: {
    servers: {
      "aayuvyaa-product-draft": {
        command: "node",
        args: ["/opt/store-ops/product-draft-mcp/src/server.js"],
        cwd: "/opt/store-ops/product-draft-mcp",
        env: {
          API_BASE_URL: "http://127.0.0.1:8080",
          REQUEST_TIMEOUT_MS: "15000",
        },
      },
    },
  },
}

Easier way, via CLI

openclaw mcp set aayuvyaa-product-draft '{"command":"node","args":["/opt/store-ops/product-draft-mcp/src/server.js"],"cwd":"/opt/store-ops/product-draft-mcp","env":{"API_BASE_URL":"http://127.0.0.1:8080","REQUEST_TIMEOUT_MS":"15000"}}'

Then verify:

openclaw mcp list
openclaw mcp show aayuvyaa-product-draft --json

Then restart the gateway.

Important

Your snippet with:

{
  "mcpServers": { ... }
}

is the bundle / MCP-client style, not the normal OpenClaw root config style.

#

For normal OpenClaw config, use:

  • mcp.servers

Docs: https://docs.openclaw.ai/cli/mcp
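
The shape change described above (top-level `mcpServers` moved under `mcp.servers`) can be sketched as a small conversion. This is only an illustration: the type names and the `toOpenClawShape` helper are hypothetical and not part of OpenClaw; the server entry itself is copied from the question.

```typescript
// Hypothetical types for illustration; only the key names come from the thread.
type McpServerSpec = {
  command: string;
  args?: string[];
  cwd?: string;
  env?: Record<string, string>;
};

type BundleConfig = { mcpServers: Record<string, McpServerSpec> };
type OpenClawMcpConfig = { mcp: { servers: Record<string, McpServerSpec> } };

// Move every entry from top-level `mcpServers` to `mcp.servers`, unchanged.
function toOpenClawShape(bundle: BundleConfig): OpenClawMcpConfig {
  return { mcp: { servers: { ...bundle.mcpServers } } };
}

const bundle: BundleConfig = {
  mcpServers: {
    "aayuvyaa-product-draft": {
      command: "node",
      args: ["/opt/store-ops/product-draft-mcp/src/server.js"],
      cwd: "/opt/store-ops/product-draft-mcp",
      env: { API_BASE_URL: "http://127.0.0.1:8080", REQUEST_TIMEOUT_MS: "15000" },
    },
  },
};

const converted = toOpenClawShape(bundle);
console.log(JSON.stringify(converted, null, 2));
```

Each server entry stays exactly as it was; only the path to it in the config changes.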

If you want, I can also rewrite your second MCP server into the exact same OpenClaw format.

tame notch
#

added the mcp

openclaw mcp show aayuvyaa-product-draft --json
{
"command": "node",
"args": [
"/home/atish/product-draft-mcp/src/server.js"
],
"env": {
"API_BASE_URL": "https://services.aayuvyaa.in",
"REQUEST_TIMEOUT_MS": "15000"
},
"cwd": "/home/atish/product-draft-mcp"
}

now How OpenClaw should call it

Tool name:

create_product_draft
Typical payload:

{
"storeId": 1,
"sourceMessageId": "wa-12345",
"senderPhone": "+919999999999",
"receivedAt": "2026-04-17T12:45:00Z",
"name": "Silver Ring",
"description": "Adjustable ring",
"sellingPrice": 2499,
"purchasePrice": 1800,
"quantity": 2,
"categoryHint": "Rings",
"confidence": 0.91,
"images": [
{
"mediaUrl": "https://.../image.jpg",
"sortOrder": 0,
"primary": true,
"approved": true
}
]
}
Or with inline image bytes:

{
"storeId": 1,
"sourceMessageId": "wa-12345",
"senderPhone": "+919999999999",
"receivedAt": "2026-04-17T12:45:00Z",
"name": "Silver Ring",
"images": [
{
"base64": "<base64-image>",
"mimeType": "image/jpeg",
"fileName": "ring.jpg",
"sortOrder": 0,
"primary": true,
"approved": true
}
]
}

on whatsapp image upload

obtuse dirgeBOT
#

OpenClaw will not call create_product_draft automatically just because the MCP is registered.

You need to choose one of these:

Recommended for your case

Use 2 tools in sequence inside the agent turn:

  1. upload-image MCP

    • input: local inbound image path (MediaPath)
    • output: hosted mediaUrl
  2. create_product_draft MCP

    • input: your JSON payload
    • use the mediaUrl from step 1

I'd strongly recommend the mediaUrl flow, not inline base64. It's simpler, cheaper, and more reliable.

#

How OpenClaw should call it

Add an instruction to your agent prompt / AGENTS.md like this:

When a WhatsApp message includes image media:

1. Read the inbound WhatsApp image from the provided media path.
2. Call the image-upload MCP tool first and get a public mediaUrl.
3. Then call `create_product_draft`.
4. Build the payload like this:
   - `storeId`: 1
   - `sourceMessageId`: inbound WhatsApp message id if available
   - `senderPhone`: inbound sender phone if available
   - `receivedAt`: inbound message timestamp in ISO-8601 UTC if available
   - `name`, `description`, `categoryHint`, `confidence`: infer from the image and caption
   - `images`: [{ mediaUrl, sortOrder: 0, primary: true, approved: true }]
5. If some metadata is missing, still create the draft with the fields you do have.

Important limitation

If you need this to happen every single time on WhatsApp image upload, with exact fields like:

  • sourceMessageId
  • senderPhone
  • receivedAt

then prompting the agent is not the best path.

#

For that, use a message:preprocessed hook or plugin and build the payload deterministically there. Hooks are the automatic trigger path. MCP tools are just tools the agent can use during a turn.
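
To make "build the payload deterministically" concrete, here is a minimal sketch of a pure builder function. The `InboundContext` field names (`from`, `messageId`) and the `whatsapp:` prefix on the sender are assumptions about what the hook receives, not confirmed OpenClaw types; the payload keys come from the tool contract in the question.

```typescript
// Assumed shape of the inbound message context; verify against your real hook event.
type InboundContext = {
  from?: string;
  messageId?: string;
};

type DraftPayload = {
  storeId: number;
  sourceMessageId: string;
  senderPhone: string;
  receivedAt: string;
  images: { mediaUrl: string; sortOrder: number; primary: boolean; approved: boolean }[];
};

// Build the payload from message metadata only, with no LLM involved.
function buildDraftPayload(ctx: InboundContext, receivedAt: Date, mediaUrl: string): DraftPayload {
  return {
    storeId: 1,
    sourceMessageId: ctx.messageId ?? "",
    // Assumption: the sender may arrive as "whatsapp:+91..."; strip that prefix if present.
    senderPhone: (ctx.from ?? "").replace(/^whatsapp:/, "").trim(),
    receivedAt: receivedAt.toISOString(),
    images: [{ mediaUrl, sortOrder: 0, primary: true, approved: true }],
  };
}

const payload = buildDraftPayload(
  { from: "whatsapp:+919999999999", messageId: "wa-12345" },
  new Date("2026-04-17T12:45:00Z"),
  "https://example.invalid/image.jpg", // placeholder URL
);
console.log(payload);
```

Because the function takes only metadata and returns plain data, the same input always yields the same payload, which is the property the agent-prompt path cannot guarantee.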


Practical flow

Your best setup is:

  • WhatsApp image arrives
  • OpenClaw receives media
  • agent sees the image / media path
  • agent calls upload-image
  • agent gets mediaUrl
  • agent calls create_product_draft with:
{
  "storeId": 1,
  "sourceMessageId": "...",
  "senderPhone": "...",
  "receivedAt": "...",
  "name": "...",
  "description": "...",
  "categoryHint": "...",
  "confidence": 0.91,
  "images": [
    {
      "mediaUrl": "https://.../image.jpg",
      "sortOrder": 0,
      "primary": true,
      "approved": true
    }
  ]
}

My recommendation

  • If you want agentic behavior, use prompt instructions.
  • If you want guaranteed automation on every upload, use a hook/plugin.

If you want, I can write the exact:

  1. AGENTS.md instruction block, and
  2. minimal message:preprocessed hook skeleton

for this create_product_draft flow.

tame notch
#

I want guaranteed automation on every upload, so I will use a hook/plugin. Also, only the create_product_draft tool should be called, with inline image bytes like:

{
"storeId": 1,
"sourceMessageId": "wa-12345",
"senderPhone": "+919999999999",
"receivedAt": "2026-04-17T12:45:00Z",
"name": "Silver Ring",
"images": [
{
"base64": "<base64-image>",
"mimeType": "image/jpeg",
"fileName": "ring.jpg",
"sortOrder": 0,
"primary": true,
"approved": true
}
]
}

obtuse dirgeBOT
#

Yep, for guaranteed automation on every WhatsApp image upload, use a message:preprocessed hook and call your MCP server from the hook.

That gives you deterministic behavior and bypasses "maybe the LLM calls the tool, maybe not".

Best shape

  • trigger: message:preprocessed
  • filter: channelId === "whatsapp"
  • filter: media exists and mediaType starts with image/
  • read context.mediaPath
  • base64 encode the image
  • call MCP tool create_product_draft once
  • pass:
    • sourceMessageId
    • senderPhone
    • receivedAt
    • images[0].base64
#

Hook files

hooks/whatsapp-product-draft/HOOK.md

---
name: whatsapp-product-draft
description: "Create product drafts automatically for inbound WhatsApp images"
metadata:
  openclaw:
    emoji: "🛍️"
    events: ["message:preprocessed"]
---

# whatsapp-product-draft

hooks/whatsapp-product-draft/handler.ts

import fs from "node:fs/promises";
import path from "node:path";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

type MsgCtx = {
  from?: string;
  body?: string;
  timestamp?: number;
  channelId?: string;
  messageId?: string;
  mediaPath?: string;
  mediaType?: string;
};
function cleanSenderPhone(value: string | undefined): string {
  if (!value) return "";
  return value.replace(/^whatsapp:/, "").trim();
}

function safeNameFromBody(body: string | undefined): string {
  const text = (body ?? "").trim();
  if (!text) return "Untitled product";
  if (text.startsWith("<media:")) return "Untitled product";
  if (text === "[Image]") return "Untitled product";
  return text;
}

export default async function handler(event: {
  type: string;
  action: string;
  timestamp: Date;
  context: MsgCtx;
  messages: string[];
}) {
  if (event.type !== "message" || event.action !== "preprocessed") return;

  const ctx = event.context;
  if (ctx.channelId !== "whatsapp") return;
  if (!ctx.mediaPath) return;
  if (!ctx.mediaType?.startsWith("image/")) return;

  const file = await fs.readFile(ctx.mediaPath);
  const base64 = file.toString("base64");
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/home/atish/product-draft-mcp/src/server.js"],
    cwd: "/home/atish/product-draft-mcp",
    env: {
      ...process.env,
      API_BASE_URL: "https://services.aayuvyaa.in",
      REQUEST_TIMEOUT_MS: "15000",
    },
  });

  const client = new Client(
    { name: "whatsapp-product-draft-hook", version: "1.0.0" },
    { capabilities: {} },
  );

  await client.connect(transport);

  try {
    await client.callTool({
      name: "create_product_draft",
      arguments: {
        storeId: 1,
        sourceMessageId: ctx.messageId ?? "",
        senderPhone: cleanSenderPhone(ctx.from),
        receivedAt: event.timestamp.toISOString(),
        name: safeNameFromBody(ctx.body),
        images: [
          {
            base64,
            mimeType: ctx.mediaType,
            fileName: path.basename(ctx.mediaPath),
            sortOrder: 0,
            primary: true,
            approved: true,
          },
        ],
      },
    });
  } finally {
    await client.close();
  }
}

Enable it

openclaw hooks enable whatsapp-product-draft

Then restart the gateway.

Important note

If your backend expects name: "Silver Ring" to be inferred from the image, that inference must happen inside your backend/tool, because this hook is deterministic and not using the LLM.

If you want, you can also use the caption as name, which the example above already does when present.

One more practical thing

Make your backend idempotent on sourceMessageId. That way if WhatsApp ever redelivers or you restart during processing, you do not create duplicate drafts.
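
A minimal sketch of what idempotency on sourceMessageId could look like, using an in-memory map as a stand-in for your real storage. The `DraftStore` class and its method names are hypothetical; in a real backend the same effect usually comes from a unique constraint on the source message id.

```typescript
type Draft = { id: number; sourceMessageId: string; name: string };

// In-memory stand-in for the backend's draft table (illustration only).
class DraftStore {
  private bySource = new Map<string, Draft>();
  private nextId = 1;

  // Return the existing draft when the same message id is seen again.
  createOnce(sourceMessageId: string, name: string): { draft: Draft; created: boolean } {
    const existing = sourceMessageId ? this.bySource.get(sourceMessageId) : undefined;
    if (existing) return { draft: existing, created: false };

    const draft: Draft = { id: this.nextId++, sourceMessageId, name };
    // An empty id cannot identify a message, so it is never used as a dedupe key.
    if (sourceMessageId) this.bySource.set(sourceMessageId, draft);
    return { draft, created: true };
  }
}

const store = new DraftStore();
const first = store.createOnce("wa-12345", "Silver Ring");
const redelivered = store.createOnce("wa-12345", "Silver Ring");
console.log(first.created, redelivered.created, first.draft.id === redelivered.draft.id);
```

Note that the hook above falls back to an empty string when no message id is available, so the sketch deliberately skips deduplication for empty ids rather than collapsing all such uploads into one draft.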

Docs: https://docs.openclaw.ai/automation/hooks

If you want, I can turn this into the exact final folder structure plus a safer version with retries and logging.

tame notch
#

yes. also

openclaw hooks enable whatsapp-product-draft

🦞 OpenClaw 2026.4.15 (041266a) - Your AI assistant, now without the $3,499 headset.

Error: Hook "whatsapp-product-draft" not found
atish@DESKTOP-P0NQSPA:~/.openclaw/workspace/hooks$ openclaw hooks disable block-all-whatsapp-outbound

🦞 OpenClaw 2026.4.15 (041266a) - Less clicking, more shipping, fewer "where did that file go" moments.

Error: Hook "block-all-whatsapp-outbound" is managed by plugin "invoice-suppress" and cannot be enabled/disabled.
atish@DESKTOP-P0NQSPA:~/.openclaw/workspace/hooks$ ls
invoice-whatsapp whatsapp-product-draft
atish@DESKTOP-P0NQSPA:~/.openclaw/workspace/hooks$ openclaw hooks enable whatsapp-product-draft

🦞 OpenClaw 2026.4.15 (041266a) - Say "stop" and I'll stop - say "ship" and we'll both learn a lesson.

Error: Hook "whatsapp-product-draft" not found

obtuse dirgeBOT
#

Let me check the hook discovery rules, because this smells like either "wrong workspace" or the hook metadata file is not being recognized.

#

That error means OpenClaw is not discovering your workspace hook yet.

Most likely one of these is true:

  1. the hook is not under the real workspace.dir
  2. HOOK.md is not being parsed
  3. the folder is missing a valid handler file

Important

openclaw hooks enable ... does not look at your current shell directory.
It looks at <workspace.dir>/hooks/.

So first, make sure your hook is in the actual configured workspace, not just in whatever folder you cd'd into.


What the folder must look like

<workspace.dir>/hooks/whatsapp-product-draft/
  HOOK.md
  handler.ts

Accepted handler filenames are:

  • handler.ts
  • handler.js
  • index.ts
  • index.js

And the metadata file must be exactly HOOK.md.
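
The folder rules above can be checked with a few lines of Node before running `openclaw hooks list`. This is a rough sketch: `checkHookDir` is a hypothetical helper, it only tests file names from the list above, and it does not parse the HOOK.md frontmatter.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Accepted handler filenames, as listed above.
const HANDLER_NAMES = ["handler.ts", "handler.js", "index.ts", "index.js"];

// Return a list of problems; an empty list means the folder layout looks right.
function checkHookDir(hookDir: string): string[] {
  if (!fs.existsSync(hookDir)) return [`missing directory: ${hookDir}`];

  const entries = fs.readdirSync(hookDir);
  const problems: string[] = [];
  if (!entries.includes("HOOK.md")) {
    problems.push("missing HOOK.md (the name must match exactly)");
  }
  if (!HANDLER_NAMES.some((name) => entries.includes(name))) {
    problems.push(`missing handler file (expected one of: ${HANDLER_NAMES.join(", ")})`);
  }
  return problems;
}

// Demo in a throwaway temp directory instead of a real workspace.
const demo = fs.mkdtempSync(path.join(os.tmpdir(), "hook-check-"));
fs.writeFileSync(path.join(demo, "HOOK.md"), "---\nname: demo\n---\n");
console.log(checkHookDir(demo)); // reports the missing handler file
fs.writeFileSync(path.join(demo, "handler.ts"), "export default async function handler() {}\n");
console.log(checkHookDir(demo)); // no problems left
```

To use it on a real hook, pass `<workspace.dir>/hooks/whatsapp-product-draft` as the argument.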


#

Use this exact HOOK.md

I'd make it super strict to avoid frontmatter parsing issues:

---
name: whatsapp-product-draft
description: "Create product drafts automatically for inbound WhatsApp images"
metadata: {"openclaw":{"events":["message:preprocessed"]}}
---

# whatsapp-product-draft

That JSON-style metadata format is the safest one.


#

Then check discovery first

Run:

openclaw hooks list --verbose

If your hook is not listed there, enable will never work.

If it is listed, then run:

openclaw hooks enable whatsapp-product-draft

and restart the gateway.

#

If it still does not show up

Then the problem is almost certainly that your real workspace is not ~/.openclaw/workspace.

Check the workspace.dir value in your OpenClaw config, then put the hook under:

<that-workspace-dir>/hooks/whatsapp-product-draft/

Fast sanity check

Inside the hook directory, this should be true:

ls
# should show:
# HOOK.md
# handler.ts

Then:

openclaw hooks list --verbose

If you want, paste:

  • your HOOK.md
  • your handler.ts
  • the workspace.dir part of your config
#

and I'll spot the exact issue quickly.

tame notch
#

this one is enabled.

I want to disable block-all-whatsapp-outbound. but getting

openclaw hooks list --verbose

🦞 OpenClaw 2026.4.15 (041266a) - Somewhere between 'hello world' and 'oh god what have I built.'

Hooks (7/8 ready)
┌────────────┬────────────────────────────────────────┬──────────────────────────────────────────────────────────────────────┬─────────────────────────┬─────────┐
│ Status     │ Hook                                   │ Description                                                          │ Source                  │ Missing │
├────────────┼────────────────────────────────────────┼──────────────────────────────────────────────────────────────────────┼─────────────────────────┼─────────┤
│ ✓ ready    │ 🚀 boot-md                              │ Run BOOT.md on gateway startup                                       │ openclaw-bundled        │         │
│ ✓ ready    │ 📎 bootstrap-extra-files                │ Inject additional workspace bootstrap files via glob/path patterns   │ openclaw-bundled        │         │
│ ✓ ready    │ 📝 command-logger                       │ Log all command events to a centralized audit file                   │ openclaw-bundled        │         │
│ ✓ ready    │ 💾 session-memory                       │ Save session context to memory when /new or /reset command is issued │ openclaw-bundled        │         │
│ ✓ ready    │ 🔗 block-all-whatsapp-outbound          │                                                                      │ plugin:invoice-suppress │         │
│ ✓ ready    │ 🔗 memory-core-short-term-dreaming-cron │                                                                      │ plugin:memory-core      │         │
│ ⏸ disabled │ 🔗 invoice-whatsapp                     │ Process WhatsApp invoice images                                      │ openclaw-workspace      │         │
│ ✓ ready    │ 🛍️ whatsapp-product-draft               │ Create product drafts automatically for inbound WhatsApp images      │ openclaw-workspace      │         │
└────────────┴────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────┴─────────────────────────┴─────────┘
atish@DESKTOP-P0NQSPA:~/.openclaw/workspace/hooks$ openclaw hooks disable block-all-whatsapp-outbound

🦞 OpenClaw 2026.4.15 (041266a) - I don't judge, but your missing API keys are absolutely judging you.

Error: Hook "block-all-whatsapp-outbound" is managed by plugin "invoice-suppress" and cannot be enabled/disabled.

obtuse dirgeBOT
#

Yep, that part is expected.

block-all-whatsapp-outbound is coming from:

  • Source: plugin:invoice-suppress

So it is a plugin-managed hook, not a workspace hook. Those cannot be toggled with:

openclaw hooks disable ...

What to do instead

Disable the owning plugin:

openclaw plugins disable invoice-suppress

Or in config:

{
  plugins: {
    entries: {
      "invoice-suppress": {
        enabled: false,
      },
    },
  },
}
#

Then restart the gateway.

Important

That disables the whole plugin, not just that one hook.

If you want to keep the plugin but disable only block-all-whatsapp-outbound, that is only possible if the plugin itself exposes a config flag for that behavior. Otherwise OpenClaw does not let you toggle plugin hooks one by one.

So the short answer is:

  • workspace hook → openclaw hooks enable/disable ...
  • plugin hook → openclaw plugins enable/disable ...
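
The rule above can be written down as a tiny helper that reads the Source column from `openclaw hooks list --verbose` and picks the matching command. `toggleCommand` is a hypothetical helper for illustration; treating every non-plugin source (for example `openclaw-bundled`) like a workspace hook is an assumption, since only workspace and plugin hooks are discussed here.

```typescript
type HookAction = "enable" | "disable";

// Pick the CLI command based on the hook's Source column.
function toggleCommand(hookName: string, source: string, action: HookAction): string {
  const pluginPrefix = "plugin:";
  if (source.startsWith(pluginPrefix)) {
    // Plugin-managed hooks are toggled through their owning plugin.
    const pluginId = source.slice(pluginPrefix.length);
    return `openclaw plugins ${action} ${pluginId}`;
  }
  // Assumption: anything else is toggled directly as a hook.
  return `openclaw hooks ${action} ${hookName}`;
}

console.log(toggleCommand("block-all-whatsapp-outbound", "plugin:invoice-suppress", "disable"));
console.log(toggleCommand("whatsapp-product-draft", "openclaw-workspace", "enable"));
```

Keep in mind that the plugin command affects the whole plugin, not just the one hook.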

Docs: https://docs.openclaw.ai/cli/hooks and https://docs.openclaw.ai/tools/plugin

If you want, I can help you keep whatsapp-product-draft enabled while removing only the outbound blocking behavior cleanly.

tame notch
#

enabled whatsapp-product-draft but getting
I see you've attached an image. Here it is:

If you need anything else or have another request, feel free to let me know!

obtuse dirgeBOT
#

We hit the next gotcha. I'm checking the plugin hook path, because the behavior you pasted looks like the normal agent replied after your workspace hook ran, which means the hook is side-effecting but not owning the turn.

#

Yep, that output means the normal agent reply path is still running.

So what's happening now is:

  1. your whatsapp-product-draft workspace hook may run
  2. then the regular assistant still sees the image
  3. it sends the default image-style reply

That reply:

![](./media/inbound/...)

is the giveaway. It's the normal media reply pipeline, not your MCP flow.

Important correction

#

A workspace HOOK.md hook is good for side effects, but it does not automatically suppress the normal assistant reply.

So if you want:

  • guaranteed automation
  • call your backend/tool
  • no generic chat reply

then you need a plugin-level interception, not just a workspace hook.

What to use instead

Use a plugin hook that claims the turn and stops the normal model reply.

The relevant plugin hooks are:

  • before_dispatch
  • before_agent_reply
#

Those can return handled: true, which short-circuits the normal assistant response.

Practical recommendation

For your dedicated WhatsApp image channel, the clean setup is:

  • keep the processing logic
  • move it into a plugin
  • after calling create_product_draft, return handled with no reply
  • that prevents the "I see you've attached an image…" message

Short version

  • workspace hook = runs side effects, does not own the reply
  • plugin claim hook = can fully intercept and silence the turn

Temporary workaround

You can add an agent instruction like:

#

For this WhatsApp channel, when an image upload is processed by automation, reply with NO_REPLY.

But that is still model-driven, so I would not call it guaranteed.

Best next step

You've basically proven the hook discovery works. Now the next move is to convert this from a workspace hook into a plugin that claims the turn.

If you want, I can give you the minimal plugin skeleton for:

  • detect WhatsApp image
  • read inbound file
  • base64 encode it
  • call create_product_draft
  • return handled / silent reply
tame notch
obtuse dirgeBOT
#

You can use TOOLS.md for instructions/context, but not for guaranteed automation.

What TOOLS.md is good for

Use it to document things like:

  • tool name: create_product_draft
  • expected payload shape
  • store id
  • "for WhatsApp image uploads, use inline base64 image bytes"
  • channel-specific notes

That helps the agent know what to do.

What it will not do

TOOLS.md does not:

  • auto-trigger on upload
  • force a tool call
  • suppress the normal assistant reply
  • replace a hook/plugin

So for your case:

  • TOOLS.md = guidance / documentation
  • AGENTS.md = behavioral instruction
  • plugin / hook = actual guaranteed execution

My recommendation

Yes, put the payload contract in TOOLS.md, but for the real "on every WhatsApp image upload, call create_product_draft and don't chat back" behavior, use a plugin.

If you want, I can give you a good TOOLS.md snippet for this flow plus the matching plugin skeleton.

tame notch
#

yes please

obtuse dirgeBOT
#

I'm grabbing the minimal native plugin shape so I don't hand you a fake manifest.

#

Yep, here's the clean combo I'd use:

  • TOOLS.md documents the contract
  • your existing whatsapp-product-draft workspace hook does the actual MCP call
  • a tiny plugin suppresses the normal WhatsApp image chat reply

That avoids rewriting your processor logic.


1. TOOLS.md snippet

Put something like this in your global TOOLS.md:

## WhatsApp product draft automation

- Dedicated WhatsApp image-ingest flow uses the workspace hook `whatsapp-product-draft`.
- On inbound WhatsApp image uploads, automation sends the image to the MCP tool `create_product_draft`.
- The MCP payload must use inline image bytes, not a URL.

### MCP tool
- Tool name: `create_product_draft`

### Payload contract
```json
{
  "storeId": 1,
  "sourceMessageId": "wa-12345",
  "senderPhone": "+919999999999",
  "receivedAt": "2026-04-17T12:45:00Z",
  "name": "Silver Ring",
  "images": [
    {
      "base64": "<base64-image>",
      "mimeType": "image/jpeg",
      "fileName": "ring.jpg",
      "sortOrder": 0,
      "primary": true,
      "approved": true
    }
  ]
}
```

### Behavior
- This flow is automation-first.
- For the dedicated WhatsApp ingest channel, normal assistant chatter should be suppressed.
- The reply path should be swallowed when the inbound message is an image upload intended for product-draft creation.

That's good for documentation, but it does not enforce behavior by itself.


2. Minimal plugin to suppress the normal reply

This plugin does not do the MCP work.
It just stops the "I see you've attached an image…" response.

package.json

{
  "name": "@local/whatsapp-no-image-chat",
  "version": "1.0.0",
  "type": "module",
  "openclaw": {
    "extensions": ["./index.ts"],
    "compat": {
      "pluginApi": ">=2026.4.15",
      "minGatewayVersion": "2026.4.15"
    }
  }
}

openclaw.plugin.json

{
  "id": "whatsapp-no-image-chat",
  "name": "WhatsApp No Image Chat",
  "description": "Suppress normal assistant replies for dedicated WhatsApp image-ingest turns",
  "configSchema": {
    "type": "object",
    "additionalProperties": false,
    "properties": {}
  }
}

index.ts

import { definePluginEntry } from "openclaw/plugin-sdk/plugin-entry";

function looksLikeInboundImage(body?: string, content?: string): boolean {
  const text = `${body ?? ""}\n${content ?? ""}`;
  return (
    text.includes("<media:image>") ||
    text.includes("[Image]") ||
    text.includes("media/inbound/") ||
    /!\[\]\((\.\/)?media\/inbound\/.+\.(png|jpg|jpeg|webp|gif)\)/i.test(text)
  );
}

export default definePluginEntry({
  id: "whatsapp-no-image-chat",
  name: "WhatsApp No Image Chat",
  description: "Suppress normal assistant replies for dedicated WhatsApp image-ingest turns",
  register(api) {
    api.registerHook(
      "before_dispatch",
      async (event, ctx) => {
        if (ctx.channelId !== "whatsapp") return;

        // Optional: tighten this to only your dedicated route/session.
        // Example:
        // if (ctx.conversationId !== "YOUR_DEDICATED_CONVERSATION_ID") return;

        if (!looksLikeInboundImage(event.body, event.content)) return;

        // Swallow the normal assistant reply.
        return { handled: true };
      },
      { name: "suppress-whatsapp-image-chat" },
    );
  },
});

3. Why this combo works

  • your workspace hook keeps doing the real deterministic work
  • this plugin claims the turn before the model replies
  • result: no generic assistant image response

So you get:

#
  • guaranteed processing
  • no chatty reply
  • no need to move all the MCP logic into the plugin immediately

4. Install / enable

From the plugin directory:

openclaw plugins install .
openclaw plugins enable whatsapp-no-image-chat
openclaw gateway restart

If it's already linked/installed, just enable it and restart.

#

5. One important warning

The sample plugin above suppresses all WhatsApp image-like turns unless you add a tighter guard.

So I'd strongly recommend adding one of these checks:

  • ctx.conversationId
  • ctx.sessionKey if available elsewhere in your logic
  • a dedicated sender / route / account guard

Otherwise it may suppress replies for other WhatsApp image conversations too.
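
A tighter guard could look like the sketch below: suppress only when the turn is on WhatsApp, belongs to an explicitly listed conversation, and looks like an image upload. The `TurnContext` type, the `shouldSuppress` helper, and the conversation id value are hypothetical; check what your plugin hook actually receives before relying on these field names.

```typescript
// Assumed subset of the plugin hook context (field names taken from the sample above).
type TurnContext = { channelId?: string; conversationId?: string };

// Suppress only dedicated-ingest conversations, never WhatsApp as a whole.
function shouldSuppress(
  ctx: TurnContext,
  text: string,
  dedicatedConversationIds: ReadonlySet<string>,
): boolean {
  if (ctx.channelId !== "whatsapp") return false;
  if (!ctx.conversationId || !dedicatedConversationIds.has(ctx.conversationId)) return false;
  return text.includes("<media:image>") || text.includes("media/inbound/");
}

// Placeholder id; replace with the real id of your dedicated ingest conversation.
const dedicated = new Set(["YOUR_DEDICATED_CONVERSATION_ID"]);

console.log(shouldSuppress({ channelId: "whatsapp", conversationId: "YOUR_DEDICATED_CONVERSATION_ID" }, "<media:image>", dedicated));
console.log(shouldSuppress({ channelId: "whatsapp", conversationId: "some-other-chat" }, "<media:image>", dedicated));
```

An allow-list fails closed: a conversation that is not listed keeps its normal assistant replies.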


If you want, next I can give you a tightened version of the plugin that only suppresses one dedicated WhatsApp route, plus a cleaned-up handler.ts for your existing whatsapp-product-draft hook.

tame notch
#

15:02:49+00:00 info gateway/channels/whatsapp/inbound {"subsystem":"gateway/channels/whatsapp/inbound"} Inbound message +919111292909 -> +919109411221 (direct, image/jpeg, 87 chars)
15:02:53+00:00 warn bundle-mcp {"subsystem":"bundle-mcp"} failed to start server "aayuvyaa-product-draft" (node /home/atish/product-draft-mcp/src/server.js (cwd=/home/atish/product-draft-mcp)): McpError: MCP error -32000: Connection closed

obtuse dirgeBOT
#

That log is the key:

failed to start server "aayuvyaa-product-draft" ... McpError: MCP error -32000: Connection closed

So this is not the WhatsApp part failing.
Your MCP server process is starting, then closing before OpenClaw can finish the stdio MCP handshake.

What it usually means

One of these is happening:

  1. your server crashes on startup
  2. it exits immediately after launch
  3. it writes normal logs to stdout instead of only MCP frames
  4. the path/env/cwd is wrong
  5. the file runs, but it is not actually staying alive as a stdio MCP server

First test

Run it manually with the same cwd and env OpenClaw uses:

cd /home/atish/product-draft-mcp
API_BASE_URL="https://services.aayuvyaa.in" REQUEST_TIMEOUT_MS="15000" node /home/atish/product-draft-mcp/src/server.js

If it exits immediately or prints a stack trace, that is the problem.


Very common MCP stdio mistakes

#

1. Logging to stdout

For stdio MCP, stdout must be protocol-only.

Bad:

console.log("server started");

Good:

console.error("server started");

Use stderr for logs, not stdout.
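
If you capture what the server writes to stdout during the manual run, a quick check like this can flag stray log lines. It is a rough sketch: it assumes the stdio transport carries one JSON-RPC 2.0 message per line, and `findNonProtocolLines` is a hypothetical helper, not part of the MCP SDK.

```typescript
// Return every non-empty stdout line that is not a JSON-RPC 2.0 message.
function findNonProtocolLines(stdout: string): string[] {
  return stdout
    .split(/\r?\n/)
    .filter((line) => line.trim() !== "")
    .filter((line) => {
      try {
        const msg = JSON.parse(line);
        return !(typeof msg === "object" && msg !== null && msg.jsonrpc === "2.0");
      } catch {
        return true; // not JSON at all, e.g. a console.log banner
      }
    });
}

// Example capture: one stray banner followed by one valid protocol message.
const captured = 'server started\n{"jsonrpc":"2.0","id":1,"result":{}}\n';
console.log(findNonProtocolLines(captured)); // only the banner line is reported
```

Any line this reports is a candidate for moving to console.error.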


#

2. Server not actually connecting stdio transport

You need the equivalent of:

const transport = new StdioServerTransport();
await server.connect(transport);

and then let the process stay alive.

Do not process.exit() after startup.


3. Wrong runtime file

If src/server.js is not the real built output, OpenClaw may be launching the wrong file.

#

If you have a built file, prefer something like:

{
  "command": "node",
  "args": ["/home/atish/product-draft-mcp/dist/server.js"]
}

instead of src/server.js.


4. Crash from imports / env / ESM

A lot of "connection closed" cases are just startup exceptions:

  • missing package
  • bad import
  • bad ESM/CommonJS config
  • required env missing
  • syntax/runtime error

What to check in your server

Make sure your MCP server:

  • uses StdioServerTransport
  • connects it with await server.connect(...)
  • does not print to stdout
  • does not exit after startup
  • does not throw before connect()

#

Best next step

Please run the manual command above and paste either:

  • the stderr output, or
  • the first 30 lines of the server file

Because right now OpenClaw is successfully trying to launch it, but your MCP process is dying before handshake. That's the real bug.

tame notch
#

Failed to load hook whatsapp-product-draft: Cannot find package '@modelcontextprotocol/sdk' imported from /home/atish/.openclaw/workspace/hooks/whatsapp-product-draft/handler.ts

(node:460622) [MODULE_TYPELESS_PACKAGE_JSON] Warning: Module type of file:///home/atish/.openclaw/workspace/hooks/whatsapp-product-draft/handler.ts?t=1776609344942.0754&s=2409 is not specified and it doesn't parse as CommonJS.
Reparsing as ES module because module syntax was detected. This incurs a performance overhead.
To eliminate this warning, add "type": "module" to /home/atish/.openclaw/package.json.

obtuse dirgeBOT
#

Nice, we found the actual blocker.

What this means

This is the fatal error:

Cannot find package '@modelcontextprotocol/sdk'
imported from /home/atish/.openclaw/workspace/hooks/whatsapp-product-draft/handler.ts

So the hook is loading from:

/home/atish/.openclaw/workspace/hooks/...
#

and Node is resolving imports from there, not from your MCP project at:

/home/atish/product-draft-mcp

So even if @modelcontextprotocol/sdk is installed in your MCP project, the hook still cannot see it.
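
Why the hook cannot see the package: for a bare import, Node looks for a `node_modules/<package>` folder in the importing file's directory and then in each parent directory. The sketch below imitates only that upward walk (it ignores `exports` maps, `NODE_PATH`, and other details), and the directory layout and package name in the demo are made up for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Walk from `fromDir` up to the filesystem root looking for node_modules/<pkg>.
function findPackageDir(fromDir: string, pkg: string): string | undefined {
  let dir = path.resolve(fromDir);
  for (;;) {
    const candidate = path.join(dir, "node_modules", pkg);
    if (fs.existsSync(candidate)) return candidate;
    const parent = path.dirname(dir);
    if (parent === dir) return undefined; // reached the filesystem root
    dir = parent;
  }
}

// Demo layout in a temp dir: the package is installed in the MCP project only.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "resolve-demo-"));
const project = path.join(root, "product-draft-mcp");
const hookDir = path.join(root, "workspace", "hooks", "whatsapp-product-draft");
fs.mkdirSync(path.join(project, "node_modules", "@example-scope", "sdk-demo-pkg"), { recursive: true });
fs.mkdirSync(hookDir, { recursive: true });

console.log(findPackageDir(project, "@example-scope/sdk-demo-pkg") !== undefined); // visible from the project
console.log(findPackageDir(hookDir, "@example-scope/sdk-demo-pkg") !== undefined); // not visible from the hook
```

The two folders are siblings, so the walk up from the hook directory never passes through the project's node_modules.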

The warning

This part is only a warning:

[MODULE_TYPELESS_PACKAGE_JSON]

Not the main failure.

#

Best fix, recommended

For a workspace hook, do not call the MCP server via the SDK.

Just call your backend API directly from the hook with fetch.
That is much simpler and avoids dependency/module-resolution pain.

So:

  • workspace hook reads image
  • base64 encodes it
  • sends JSON directly to your API
  • done

That is the cleanest path for guaranteed automation.
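
A minimal sketch of the request the hook would send, assuming your backend accepts the same JSON the MCP tool takes. The `/product-drafts` path is a placeholder I made up; use whatever endpoint your API really exposes. `buildDraftRequest` is a hypothetical helper that only builds the request, so it can be checked without network access.

```typescript
type DraftImage = {
  base64: string;
  mimeType: string;
  fileName: string;
  sortOrder: number;
  primary: boolean;
  approved: boolean;
};

type DraftPayload = {
  storeId: number;
  sourceMessageId: string;
  senderPhone: string;
  receivedAt: string;
  name: string;
  images: DraftImage[];
};

type DraftRequest = {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
};

// Build the HTTP request for the backend; the endpoint path is an assumption.
function buildDraftRequest(apiBaseUrl: string, payload: DraftPayload): DraftRequest {
  const url = new URL("/product-drafts", apiBaseUrl).toString();
  return {
    url,
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(payload),
    },
  };
}

const request = buildDraftRequest("https://services.aayuvyaa.in", {
  storeId: 1,
  sourceMessageId: "wa-12345",
  senderPhone: "+919999999999",
  receivedAt: "2026-04-17T12:45:00Z",
  name: "Silver Ring",
  images: [
    { base64: "aGVsbG8=", mimeType: "image/jpeg", fileName: "ring.jpg", sortOrder: 0, primary: true, approved: true },
  ],
});
console.log(request.url);

// In the hook you would then send it, for example:
//   const res = await fetch(request.url, request.init);
//   if (!res.ok) throw new Error(`draft API returned ${res.status}`);
```

No extra packages are needed for this, because fetch is built into current Node versions.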

#

If you really want to keep MCP inside the hook

Then you need to install the SDK where the hook can resolve it, basically under ~/.openclaw.

Use pnpm, for example:

cd ~/.openclaw
pnpm add @modelcontextprotocol/sdk

And to remove the module warning, add a package.json there with:

{
  "type": "module"
}

Then restart the gateway.

But honestly, for this use case, I would not do that. Direct HTTP from the hook is better.


My recommendation

  • workspace hook → call your API directly
  • plugin → suppress the normal assistant reply

That is the least fragile setup.

#

If you want, I can rewrite your current whatsapp-product-draft/handler.ts to use plain fetch instead of @modelcontextprotocol/sdk.

tame notch
#

(node:465262) [DEP0040] DeprecationWarning: The punycode module is deprecated. Please use a userland alternative instead.
(Use node --trace-deprecation ... to show where the warning was created)

obtuse dirgeBOT
#

That one is just a Node deprecation warning, not the core failure.

Meaning

A dependency somewhere in your hook / plugin / MCP stack is still importing:

punycode

Node is warning that it's deprecated.

Important

By itself, this usually does not break anything.

So if your process is still failing, the real issue is probably elsewhere. This warning is usually just noise.

If you want to trace it

#

You can run the failing process with:

node --trace-deprecation ...

and it will show which package is importing punycode.

Practical advice

  • if the hook/plugin/server still runs, you can ignore this for now
  • if you want to clean it up, update the dependency that is using punycode

So, short version: not the main bug.

tame notch
#

openclaw mcp set aayuvyaa-product-draft '{"command":"node","args":["C:\Users\SK.openclaw\mcp\product-draft-mcp\src\server.js"],"cwd":"C:\Users\SK.openclaw\mcp\product-draft-mcp","env":{"API_BASE_URL":"https://services.aayuvyaa.in","REQUEST_TIMEOUT_MS":"15000"}}'

🦞 OpenClaw 2026.4.15 (041266a) — More integrations than your therapist's intake form.

Invalid JSON: SyntaxError: Expected property name or '}' in JSON at position 1 (line 1 column 2)

obtuse dirgeBOT
#

This is a Windows quoting + path escaping issue.

Two problems in your command:

  1. C:\... inside JSON needs escaping, or use /
  2. depending on shell, the outer '...' may not be handled the way you expect

Easiest fix

Use forward slashes in the path.

PowerShell

openclaw mcp set aayuvyaa-product-draft '{"command":"node","args":["C:/Users/SK/.openclaw/mcp/product-draft-mcp/src/server.js"],"cwd":"C:/Users/SK/.openclaw/mcp/product-draft-mcp","env":{"API_BASE_URL":"https://services.aayuvyaa.in","REQUEST_TIMEOUT_MS":"15000"}}'

CMD

openclaw mcp set aayuvyaa-product-draft "{\"command\":\"node\",\"args\":[\"C:/Users/SK/.openclaw/mcp/product-draft-mcp/src/server.js\"],\"cwd\":\"C:/Users/SK/.openclaw/mcp/product-draft-mcp\",\"env\":{\"API_BASE_URL\":\"https://services.aayuvyaa.in\",\"REQUEST_TIMEOUT_MS\":\"15000\"}}"

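Why your version failed

This part is invalid JSON as written:

"C:\Users\SK\..."

because \U / \S etc. are not valid JSON escapes.

The error at position 1 also suggests the shell stripped the double quotes before OpenClaw saw the JSON, so the parser found command where it expected a quoted property name. Older PowerShell versions are known to drop embedded double quotes when calling native programs. If the PowerShell form above still fails at position 1, keep the outer single quotes and write each inner quote as \".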
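
You can check the escaping rule in isolation. A minimal sketch, independent of OpenClaw (isValidJson is just a throwaway helper):

```typescript
// Returns true when the text parses as JSON, false otherwise.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// In TypeScript source, "\\" is one literal backslash in the resulting string.
const singleBackslashes = '{"cwd":"C:\\Users\\SK"}'; // text: {"cwd":"C:\Users\SK"}
const doubledBackslashes = '{"cwd":"C:\\\\Users\\\\SK"}'; // text: {"cwd":"C:\\Users\\SK"}
const forwardSlashes = '{"cwd":"C:/Users/SK"}';

console.log(isValidJson(singleBackslashes)); // false: \U and \S are not JSON escapes
console.log(isValidJson(doubledBackslashes)); // true
console.log(isValidJson(forwardSlashes)); // true

// JSON.stringify does the escaping for you.
console.log(JSON.stringify({ cwd: "C:\\Users\\SK" })); // {"cwd":"C:\\Users\\SK"}
```

Forward slashes avoid the problem entirely, which is why they are the easiest fix.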

Alternative

You can also add it directly in your OpenClaw config under:

{
  mcp: {
    servers: {
      "aayuvyaa-product-draft": {
        command: "node",
        args: ["C:/Users/SK/.openclaw/mcp/product-draft-mcp/src/server.js"],
        cwd: "C:/Users/SK/.openclaw/mcp/product-draft-mcp",
        env: {
          API_BASE_URL: "https://services.aayuvyaa.in",
          REQUEST_TIMEOUT_MS: "15000"
        }
      }
    }
  }
}
#

Then verify with:

openclaw mcp show aayuvyaa-product-draft --json

If you want, I can also give you the exact version for Git Bash / WSL.

tame notch
#

import fs from "node:fs/promises";
import path from "node:path";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

type MsgCtx = {
  from?: string;
  body?: string;
  timestamp?: number;
  channelId?: string;
  messageId?: string;
  mediaPath?: string;
  mediaType?: string;
};

function cleanSenderPhone(value: string | undefined): string {
  if (!value) return "";
  return value.replace(/^whatsapp:/, "").trim();
}

function safeNameFromBody(body: string | undefined): string {
  const text = (body ?? "").trim();
  if (!text) return "Untitled product";
  if (text.startsWith("<media:")) return "Untitled product";
  if (text === "[Image]") return "Untitled product";
  return text;
}

export default async function handler(event: {
  type: string;
  action: string;
  timestamp: Date;
  context: MsgCtx;
  messages: string[];
}) {
  if (event.type !== "message" || event.action !== "preprocessed") return;

  const ctx = event.context;
  if (ctx.channelId !== "whatsapp") return;
  if (!ctx.mediaPath) return;
  if (!ctx.mediaType?.startsWith("image/")) return;

  const file = await fs.readFile(ctx.mediaPath);
  const base64 = file.toString("base64");
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/home/atish/product-draft-mcp/src/server.js"],
    cwd: "/home/atish/product-draft-mcp",
    env: {
      ...process.env,
      API_BASE_URL: "https://services.aayuvyaa.in",
      REQUEST_TIMEOUT_MS: "15000",
    },
  });

  const client = new Client(
    { name: "whatsapp-product-draft-hook", version: "1.0.0" },
    { capabilities: {} },
  );
  await client.connect(transport);

  try {
    await client.callTool({
      name: "create_product_draft",
      arguments: {
        storeId: 1,
        sourceMessageId: ctx.messageId ?? "",
        senderPhone: cleanSenderPhone(ctx.from),
        receivedAt: event.timestamp.toISOString(),
        name: safeNameFromBody(ctx.body),
        images: [
          {
            base64,
            mimeType: ctx.mediaType,
            fileName: path.basename(ctx.mediaPath),
            sortOrder: 0,
            primary: true,
            approved: true,
          },
        ],
      },
    });
  } finally {
    await client.close();
  }
}

Extract SP/price, purchase/PP, and qty/QTY/quantity from the image caption and pass them to client.callTool.

obtuse dirgeBOT
#

Yep. Add a small caption parser and then spread the parsed fields into client.callTool.

Use this version:

import fs from "node:fs/promises";
import path from "node:path";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

type MsgCtx = {
  from?: string;
  body?: string;
  timestamp?: number;
  channelId?: string;
  messageId?: string;
  mediaPath?: string;
  mediaType?: string;
};

function cleanSenderPhone(value: string | undefined): string {
  if (!value) return "";
  return value.replace(/^whatsapp:/, "").trim();
}

function isPlaceholderBody(body: string | undefined): boolean {
  const text = (body ?? "").trim();
  return !text || text.startsWith("<media:") || text === "[Image]";
}

function parseNumber(value: string | undefined): number | undefined {
  if (!value) return undefined;
  const normalized = value.replace(/[,\s₹]/g, "");
  const num = Number(normalized);
  return Number.isFinite(num) ? num : undefined;
}

function extractCaptionFields(body: string | undefined): {
  name: string;
  sellingPrice?: number;
  purchasePrice?: number;
  quantity?: number;
} {
  const raw = (body ?? "").trim();

  if (isPlaceholderBody(raw)) {
    return { name: "Untitled product" };
  }

  const purchaseRegex =
    /\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*([0-9][0-9,]*)\b/i;

  const qtyRegex =
    /\b(?:qty|quantity)\b\s*[:=\-]?\s*([0-9][0-9,]*)\b/i;

  const purchaseMatch = raw.match(purchaseRegex);
  const purchasePrice = parseNumber(purchaseMatch?.[1]);

  // Remove purchase fragment first so generic "price" does not capture it.
  const withoutPurchase = raw.replace(
    /\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*[0-9][0-9,]*\b/gi,
    " ",
  );

  const sellingRegex =
    /\b(?:sp|selling(?:\s*price)?|sale(?:\s*price)?|price)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*([0-9][0-9,]*)\b/i;

  const sellingMatch = withoutPurchase.match(sellingRegex);
  const sellingPrice = parseNumber(sellingMatch?.[1]);

  const qtyMatch = raw.match(qtyRegex);
  const quantity = parseNumber(qtyMatch?.[1]);

  const cleanedName = raw
    .replace(/\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*[0-9][0-9,]*\b/gi, " ")
    .replace(/\b(?:sp|selling(?:\s*price)?|sale(?:\s*price)?|price)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*[0-9][0-9,]*\b/gi, " ")
    .replace(/\b(?:qty|quantity)\b\s*[:=\-]?\s*[0-9][0-9,]*\b/gi, " ")
    .replace(/[|,;]+/g, " ")
    .replace(/\s{2,}/g, " ")
    .trim();

  return {
    name: cleanedName || "Untitled product",
    ...(sellingPrice !== undefined ? { sellingPrice } : {}),
    ...(purchasePrice !== undefined ? { purchasePrice } : {}),
    ...(quantity !== undefined ? { quantity } : {}),
  };
}

export default async function handler(event: {
  type: string;
  action: string;
  timestamp: Date;
  context: MsgCtx;
  messages: string[];
}) {
  if (event.type !== "message" || event.action !== "preprocessed") return;

  const ctx = event.context;
  if (ctx.channelId !== "whatsapp") return;
  if (!ctx.mediaPath) return;
  if (!ctx.mediaType?.startsWith("image/")) return;

  const parsed = extractCaptionFields(ctx.body);

  const file = await fs.readFile(ctx.mediaPath);
  const base64 = file.toString("base64");

  const transport = new StdioClientTransport({
    command: "node",
    args: ["/home/atish/product-draft-mcp/src/server.js"],
    cwd: "/home/atish/product-draft-mcp",
    env: {
      ...process.env,
      API_BASE_URL: "https://services.aayuvyaa.in",
      REQUEST_TIMEOUT_MS: "15000",
    },
  });

  const client = new Client(
    { name: "whatsapp-product-draft-hook", version: "1.0.0" },
    { capabilities: {} },
  );

  await client.connect(transport);

  try {
    await client.callTool({
      name: "create_product_draft",
      arguments: {
        storeId: 1,
        sourceMessageId: ctx.messageId ?? "",
        senderPhone: cleanSenderPhone(ctx.from),
        receivedAt: event.timestamp.toISOString(),
        name: parsed.name,
        ...(parsed.sellingPrice !== undefined ? { sellingPrice: parsed.sellingPrice } : {}),
        ...(parsed.purchasePrice !== undefined ? { purchasePrice: parsed.purchasePrice } : {}),
        ...(parsed.quantity !== undefined ? { quantity: parsed.quantity } : {}),
        images: [
          {
            base64,
            mimeType: ctx.mediaType,
            fileName: path.basename(ctx.mediaPath),
            sortOrder: 0,
            primary: true,
            approved: true,
          },
        ],
      },
    });
  } finally {
    await client.close();
  }
}

Examples it will catch

  • Silver Ring SP 2499 PP 1800 Qty 2
  • Silver Ring price: 2499 purchase: 1800 quantity: 2
  • Silver Ring | sp 2499 | pp 1800 | qty 2

Note

This extracts only from the caption/body text, not OCR from the image itself. If you want OCR too, that's a separate step.
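
One detail worth seeing in isolation: the parser removes the purchase fragment before matching the generic price keyword. A small sketch of what happens otherwise, using the same two patterns on a made-up caption:

```typescript
const sellingRegex =
  /\b(?:sp|selling(?:\s*price)?|sale(?:\s*price)?|price)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*([0-9][0-9,]*)\b/i;
const purchaseFragment =
  /\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*[0-9][0-9,]*\b/gi;

const caption = "Silver Ring purchase price: 1800 price: 2499"; // made-up example

// Matching directly: the word "price" inside "purchase price" is found first.
const naive = caption.match(sellingRegex)?.[1];

// Removing the purchase fragment first leaves only the real selling price.
const careful = caption.replace(purchaseFragment, " ").match(sellingRegex)?.[1];

console.log(naive); // "1800"
console.log(careful); // "2499"
```

Without the removal step, a caption that spells out "purchase price" would have its purchase amount stored as the selling price.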

tame notch
#

I am getting this in the logs:
12:17:37 [agents/tool-images] Image resized to fit limits: 1600x1200px 174.0KB -> 130.8KB (-24.8%)
12:17:41 [agents/tool-images] Image resized to fit limits: \Users\SK.openclaw\media\inbound\c5eec234-1501-4eef-9fae-c5ccf4e6d619.jpg 1600x1200px 176.3KB -> 131.0KB (-25.7%)
and the uploaded image is a grey-filled square, not the original one.

obtuse dirgeBOT
#

I'm checking whether OpenClaw is mutating the inbound image in place or just making a model-side resized copy, because that decides whether the bug is in OpenClaw's image prep or in your hook/backend path.

#

That log is coming from OpenClaw's model-side image sanitizer, not from your hook's raw file read.

What it means

OpenClaw is shrinking the image before sending it to the LLM:

  • default max side is 1200px
  • oversize images get recompressed
  • the sanitizer converts to JPEG

So this log:

[agents/tool-images] Image resized to fit limits ...

means "image prepared for model input", not "original inbound file overwritten".

#

Important

Your hook does:

const file = await fs.readFile(ctx.mediaPath);

That should still read the original inbound file from disk.

So if your backend is getting a gray square, there are two likely cases:

Case 1, normal assistant/media path

The square gray image is from the assistant/model rendering path, not your hook upload.

Case 2, your downstream API/storage path

Your hook is sending raw base64 correctly, but your API/storage layer is flattening / transforming it.

#

Why square gray backgrounds happen

A very common reason is image conversion / flattening, especially when transparency is involved.

OpenClaw's sanitizer converts to JPEG for the model path, and JPEG has no alpha.

That can make an image look like a square filled background instead of the original transparent/soft-edged version.


What I'd do

1. Keep using ctx.mediaPath for your hook

That is the right path if you want original bytes.

2. Verify what your hook is actually reading

Quick test: save a copy of the exact file your hook reads, before callTool.

#
await fs.copyFile(ctx.mediaPath, `/tmp/debug-${path.basename(ctx.mediaPath)}`);

Then inspect that copied file.

  • If the copied file looks correct, OpenClaw is fine and the problem is after the hook
  • If the copied file is already wrong, then the issue is before your API call

3. Do not rely on the assistant-rendered image

That path can be resized/recompressed for model safety.


Short version

  • agents/tool-images log = LLM input resize
  • it does not necessarily mean your original inbound file was changed
  • if your final uploaded image is wrong, verify the bytes from ctx.mediaPath first

If you want, I can give you a tiny debug patch for your hook that logs:

  • original file size
  • copied debug image path
  • base64 length

so you can pinpoint whether the corruption is in OpenClaw or in your backend.

tame notch
#

I think const base64 = file.toString("base64"); is not generating correct base64 for the image

obtuse dirgeBOT
#

file.toString("base64") is the standard correct way to generate base64 from an image buffer.

So this line is almost certainly not the problem by itself:

const base64 = file.toString("base64");

Quick proof

Add this round-trip check:

const file = await fs.readFile(ctx.mediaPath);
const base64 = file.toString("base64");
const decoded = Buffer.from(base64, "base64");
console.log("same bytes?", file.equals(decoded));
console.log("orig bytes", file.length, "decoded bytes", decoded.length);

await fs.writeFile("/tmp/roundtrip-check.jpg", decoded);

If:

file.equals(decoded) === true

then the base64 is correct.

Most likely real issue

Usually one of these is wrong instead:

#
  1. your API expects a data URL instead of raw base64, for example:

    `data:${ctx.mediaType};base64,${base64}`

  2. your API expects raw base64 only, but you're sending a data URL

  3. your downstream service is transforming the image after decode

  4. the wrong mime type / filename is being used
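
If you are not sure which of the first two formats the receiver gets, one defensive option is to normalize on the receiving side. A minimal sketch, assuming the input is either raw base64 or a data URL of the form data:<mime>;base64,<payload>; toRawBase64 is a hypothetical helper, not part of any library:

```typescript
// Accepts either raw base64 or a "data:<mime>;base64,<payload>" URL and
// returns the raw base64 payload plus the MIME type when one was embedded.
function toRawBase64(input: string): { base64: string; mimeType: string | undefined } {
  const text = input.trim();
  const match = /^data:([^;,]+);base64,([\s\S]*)$/.exec(text);
  if (match) {
    return { base64: match[2] ?? "", mimeType: match[1] };
  }
  return { base64: text, mimeType: undefined };
}

const raw = "/9j/4A=="; // base64 of the first four bytes of a JPEG
const fromDataUrl = toRawBase64(`data:image/jpeg;base64,${raw}`);
const fromRaw = toRawBase64(raw);

console.log(fromDataUrl.base64 === fromRaw.base64); // true
console.log(fromDataUrl.mimeType); // "image/jpeg"
```

Decoding the normalized value with Buffer.from(base64, "base64") then behaves the same for both input shapes.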

In your current payload

Since you send:

#
{
  base64,
  mimeType: ctx.mediaType,
  fileName: path.basename(ctx.mediaPath)
}

raw base64 is usually correct.

So I'd bet the problem is downstream decode/storage, not the toString("base64") call.

Best next test

Do this:

  • write /tmp/roundtrip-check.jpg
  • open it
  • if it looks correct, your base64 is fine

If you want, I can give you a tiny debug block that also prints:

  • mime type
  • first 40 chars of base64
  • image hash

so you can prove exactly where the corruption starts.
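
Something along these lines would do as that debug block. It is only a sketch: describeImage is a hypothetical helper, and it assumes a Node runtime for Buffer and node:crypto. Log the result in the hook and again wherever the bytes are decoded, then compare the two.

```typescript
import { createHash } from "node:crypto";

// Summarizes an image buffer so the same fingerprint can be compared at each
// hop (hook, MCP tool, backend). Hypothetical helper for debugging only.
function describeImage(bytes: Uint8Array, mimeType: string | undefined) {
  const base64 = Buffer.from(bytes).toString("base64");
  return {
    mimeType: mimeType ?? "(missing)",
    byteLength: bytes.length,
    base64Length: base64.length,
    base64Head: base64.slice(0, 40),
    sha256: createHash("sha256").update(bytes).digest("hex"),
    // JPEG files start with the bytes FF D8 FF.
    looksLikeJpeg: bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff,
  };
}

// stderr keeps the output away from stdout, which a stdio MCP server uses
// for protocol traffic.
console.error(describeImage(new Uint8Array([0xff, 0xd8, 0xff, 0xe0]), "image/jpeg"));
```

If sha256 matches at two hops, the bytes are identical there. As a quick sanity check, base64Length should equal 4 * ceil(byteLength / 3).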

tame notch
#

I need to read the new image file to convert it to base64 for the product draft.

12:50:25 [whatsapp] Inbound message +919111292909 -> +919109411221 (direct, image/jpeg, 119 chars)
12:50:25 same bytes? true
12:50:25 orig bytes 180536 decoded bytes 180536
12:50:59 [agents/tool-images] Image resized to fit limits: 1600x1200px 174.0KB -> 130.8KB (-24.8%)
12:51:03 [agents/tool-images] Image resized to fit limits: \Users\SK.openclaw\media\inbound\efd55d44-fca4-42a0-811b-ca55c4ba637e.jpg 1600x1200px 176.3KB -> 131.0KB (-25.7%)

obtuse dirgeBOT
#

You're already reading the new inbound image file correctly.

This proves it:

same bytes? true
orig bytes 180536 decoded bytes 180536

So for the product draft, this is the right path:

const file = await fs.readFile(ctx.mediaPath);
const base64 = file.toString("base64");

What the later resize logs mean

#

These lines:

[agents/tool-images] Image resized to fit limits ...

are a separate OpenClaw model-prep step.

They happen because the normal assistant/image pipeline is also touching the inbound image later for LLM use.

That is not your hook's base64 conversion.

Short version

  • ctx.mediaPath = the new inbound image file you want
  • your base64 conversion is correct
  • the resize logs are unrelated to your payload generation
#

Best practice

If you want to be extra safe, read or copy the file immediately in the hook:

const srcPath = ctx.mediaPath;
const file = await fs.readFile(srcPath);
const base64 = file.toString("base64");

or preserve a copy:

const debugCopy = path.join("/tmp", path.basename(ctx.mediaPath));
await fs.copyFile(ctx.mediaPath, debugCopy);

If you want those resize logs to stop

#

That means the regular assistant/image path is still running.

So you need the reply-suppression plugin (or equivalent turn-claiming plugin) so OpenClaw does not also process the image for chat reply.

So the conclusion here is actually good news:
your base64 path is fine, the extra logs are from the normal assistant pipeline.

tame notch
#

Then why is the uploaded image only a grey-filled square?

obtuse dirgeBOT
#

Because if the base64 round-trip is correct, the gray square is happening after that step.

So the problem is almost certainly in one of these downstream places:

  1. your API decode step
  2. your upload/storage step
  3. your thumbnail/rendering step

Why I'm confident

You already proved:

same bytes? true
orig bytes 180536 decoded bytes 180536

That means:

#
  • the file read is correct
  • toString("base64") is correct
  • Buffer.from(base64, "base64") gives the same bytes back

So the corruption is not in the base64 generation.


Most likely causes now

1. Receiver-side decode is wrong

Make sure your API decodes like this:

const buffer = Buffer.from(base64, "base64");
#

and not via some other transform.

2. Your backend/upload service is generating a bad thumbnail

A lot of systems create a square preview/thumb and that can end up as a gray box while the original file is actually fine.

3. The frontend is rendering a preview incorrectly

For example:

  • square thumbnail crop
  • gray placeholder background
  • failed image decode but fallback box shown

4. The API expects a different base64 format

Some APIs want:

data:image/jpeg;base64,....
#

instead of raw base64.

Your current payload sends raw base64 plus separate mimeType, which is fine only if the receiver expects that.


Best next test

On the receiving side, before upload/storage, write the received base64 back to a file:

const buffer = Buffer.from(payload.images[0].base64, "base64");
await fs.writeFile("server-debug.jpg", buffer);

Then inspect server-debug.jpg.

If server-debug.jpg looks correct

#

Your hook + base64 are fine.
The bug is in upload/storage/thumbnail/UI.

If server-debug.jpg is already gray

Then the receiver is interpreting the payload wrongly.


Short version

At this point, the gray square is not from:

file.toString("base64")

It is almost certainly from what happens after your API receives it.

#

If you want, paste the receiver-side decode/upload code and I'll point to the exact spot.

tame notch
#

API:

@PostMapping(
    "/media",
    consumes = [MediaType.MULTIPART_FORM_DATA_VALUE],
)
@ResponseStatus(HttpStatus.CREATED)
fun uploadMedia(
    @RequestParam storeId: Long,
    @RequestParam("file") file: MultipartFile,
): BridgeProductDraftMediaUploadResponse {
    val result = bridgeMediaStorageService.uploadMedia(file, storeId)
    return BridgeProductDraftMediaUploadResponse(
        mediaUrl = result.mediaUrl,
        mimeType = result.mimeType,
    )
}

js:

  const uploaded = await uploadProductDraftMedia({
    storeId: input.storeId,
    fileName: image.fileName || `draft-image-${index + 1}.jpg`,
    mimeType: image.mimeType || "image/jpeg",
    buffer: Buffer.from(image.base64, "base64")
  });

  normalizedImages.push({
    imageUrl: uploaded.mediaUrl,
    sortOrder: image.sortOrder ?? index,
    primaryImage: image.primary ?? index === 0,
    approved: image.approved ?? true
  });
}

obtuse dirgeBOT
#

Your Spring controller looks fine, and this line also looks fine:

buffer: Buffer.from(image.base64, "base64")

So the suspicious part is now uploadProductDraftMedia(...), not the controller and not the base64 decode.

Why

Your API expects:

@RequestParam("file") file: MultipartFile

That means the JS side must send a real multipart binary file part, not:

#
  • a base64 string field
  • a UTF-8 stringified buffer
  • a malformed FormData append

If that helper wraps the buffer incorrectly, Spring still gets a MultipartFile, but the bytes can be wrong.


Most likely bug

Inside uploadProductDraftMedia, check whether you are doing something like:

form.append("file", buffer.toString())

or

#
form.append("file", image.base64)

That would corrupt the file.

You want the multipart field to carry the raw binary buffer.
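
Here is why the stringified variants corrupt the file: binary image bytes are not valid UTF-8, so decoding them to text and encoding back is lossy. A minimal sketch using the standard TextDecoder/TextEncoder (Buffer's toString() also defaults to UTF-8):

```typescript
// The first four bytes of a typical JPEG file.
const original = new Uint8Array([0xff, 0xd8, 0xff, 0xe0]);

// Roughly what buffer.toString() does: interpret the bytes as UTF-8 text.
const asText = new TextDecoder("utf-8").decode(original);

// What arrives if that text is then sent as the "file" content.
const roundTripped = new TextEncoder().encode(asText);

function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((value, i) => value === b[i]);
}

console.log(sameBytes(original, roundTripped)); // false: invalid sequences were replaced
console.log(original.length, roundTripped.length);
```

The server would still receive a well-formed multipart part in that case, just with different bytes inside it.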


Correct Node upload pattern

If using built-in fetch / FormData

const form = new FormData();
form.set("storeId", String(storeId));
form.set(
  "file",
  new Blob([buffer], { type: mimeType }),
  fileName,
);

await fetch(`${baseUrl}/media`, {
  method: "POST",
  body: form,
});

If using form-data package

const form = new FormData();
form.append("storeId", String(storeId));
form.append("file", buffer, {
  filename: fileName,
  contentType: mimeType,
});

That is the important part.


Best place to debug now

Please inspect the implementation of:

#
uploadProductDraftMedia(...)

That is where I'd expect the bug.


Server-side sanity check

On the Spring side, log these before storage:

println("name=${file.originalFilename}")
println("type=${file.contentType}")
println("size=${file.size}")
#

For your example, file.size should be very close to the original bytes, around:

180536

If Spring receives a much larger or much smaller file, or obviously wrong size, the multipart construction is wrong.


My read right now

  • base64 generation: looks correct
  • controller: looks fine
  • likely bug: multipart upload helper is not sending the binary file part correctly

If you paste uploadProductDraftMedia(...), I can probably point to the exact broken line.

tame notch
#

export async function uploadProductDraftMedia({ storeId, fileName, mimeType, buffer }) {
  const form = new FormData();
  form.append("storeId", String(storeId));
  form.append("file", buffer, {
    filename: fileName || "draft-image.bin",
    contentType: mimeType || "application/octet-stream"
  });

  const response = await client.post("/api/v1/bridge/product-drafts/media", form, {
    headers: form.getHeaders()
  });
  return response.data;
}

obtuse dirgeBOT
#

This upload helper looks basically correct.

form.append("file", buffer, {
  filename: fileName || "draft-image.bin",
  contentType: mimeType || "application/octet-stream"
});

So at this point I would stop blaming the Node side.

My read now

You've already shown:

  • base64 round-trip is correct
  • multipart upload code looks correct
  • Spring controller is normal
#

So the gray square is now most likely happening in:

  1. bridgeMediaStorageService.uploadMedia(file, storeId)
  2. whatever storage/CDN layer serves mediaUrl
  3. the UI/thumbnail renderer that displays the uploaded image

Next decisive test

On the Spring side, before calling storage, write the exact uploaded bytes to disk:

@PostMapping(
    "/media",
    consumes = [MediaType.MULTIPART_FORM_DATA_VALUE],
)
@ResponseStatus(HttpStatus.CREATED)
fun uploadMedia(
    @RequestParam storeId: Long,
    @RequestParam("file") file: MultipartFile,
): BridgeProductDraftMediaUploadResponse {
    println("name=${file.originalFilename}")
    println("type=${file.contentType}")
    println("size=${file.size}")

    java.nio.file.Files.write(
        java.nio.file.Path.of("/tmp/server-debug.jpg"),
        file.bytes
    )

    val result = bridgeMediaStorageService.uploadMedia(file, storeId)
    return BridgeProductDraftMediaUploadResponse(
        mediaUrl = result.mediaUrl,
        mimeType = result.mimeType,
    )
}

How to interpret it

If /tmp/server-debug.jpg looks correct

Then:

  • upload helper is fine
  • Spring multipart is fine
  • problem is in bridgeMediaStorageService or after storage
#

If /tmp/server-debug.jpg is already gray/wrong

Then we look again at transport/upload, but honestly your JS looks okay.


One more strong clue

If the file size on the server is about the same as the original, around:

180536 bytes

that's another sign the upload is arriving intact.


So the likely bug is here

#

I would inspect this next:

bridgeMediaStorageService.uploadMedia(file, storeId)

Especially if it:

  • converts format
  • generates thumbnails
  • pads to square
  • runs image compression
  • uses Sharp/ImageMagick/thumbnailator/etc.
  • uploads to a service that auto-processes images

If you paste bridgeMediaStorageService.uploadMedia(...), I can probably tell you where the gray square is coming from.

tame notch
#

fun uploadMedia(file: MultipartFile, storeId: Long): BridgeMediaUploadResponse {
    require(!file.isEmpty) { "Media file is required" }

    val contentType = (file.contentType ?: "").trim().lowercase(Locale.ENGLISH)
    val isSupported = contentType == "application/pdf" || contentType.startsWith("image/")
    require(isSupported) { "Only PDF or image uploads are allowed" }
    require(file.size <= MAX_MEDIA_FILE_SIZE_BYTES) { "Media file size cannot exceed 15 MB" }

    val objectKey = buildObjectKey(storeId, file.originalFilename)
    val putRequest = PutObjectRequest.builder()
        .bucket(spacesProperties.bucket)
        .key(objectKey)
        .contentType(contentType.ifBlank { "application/octet-stream" })
        .acl(ObjectCannedACL.PUBLIC_READ)
        .build()

    try {
        file.inputStream.use { input ->
            spacesS3Client.putObject(putRequest, RequestBody.fromInputStream(input, file.size))
        }
    } catch (ex: Exception) {
        throw ResponseStatusException(HttpStatus.INTERNAL_SERVER_ERROR, "Unable to upload bridge media", ex)
    }

I checked my storage.

An hour ago, when I uploaded one image, it was stored correctly at 176.3 KB, but uploads now produce a 407-byte image.
I have not changed anything except handler.ts, which now has:

try {
  await client.callTool({
    name: "create_product_draft",
    arguments: {
      storeId: 1,
      sourceMessageId: ctx.messageId ?? "",
      senderPhone: cleanSenderPhone(ctx.from),
      receivedAt: event.timestamp.toISOString(),
      name: parsed.name,
      sellingPrice: (parsed.sellingPrice !== undefined ? { sellingPrice: parsed.sellingPrice } : {}),
      purchasePrice: (parsed.purchasePrice !== undefined ? { purchasePrice: parsed.purchasePrice } : {}),
      quantity: (parsed.quantity !== undefined ? { quantity: parsed.quantity } : {}),
      images: [
        {
          base64,
          mimeType: ctx.mediaType,
          fileName: path.basename(ctx.mediaPath),
          sortOrder: 0,
          primary: true,
          approved: true,
        },
      ],
    },
  });
} finally {
  await client.close();
}

obtuse dirgeBOT
#

I think I see the bug.

You changed the payload to this:

sellingPrice: (parsed.sellingPrice !== undefined ? { sellingPrice: parsed.sellingPrice } : {}),
purchasePrice: (parsed.purchasePrice !== undefined ? { purchasePrice: parsed.purchasePrice } : {}),
quantity: (parsed.quantity !== undefined ? { quantity: parsed.quantity } : {}),

That is wrong.

You are now sending:

  • sellingPrice = { sellingPrice: 2499 } instead of 2499
  • purchasePrice = { purchasePrice: 1800 } instead of 1800
  • quantity = { quantity: 2 } instead of 2
#

Correct version

It must be:

await client.callTool({
  name: "create_product_draft",
  arguments: {
    storeId: 1,
    sourceMessageId: ctx.messageId ?? "",
    senderPhone: cleanSenderPhone(ctx.from),
    receivedAt: event.timestamp.toISOString(),
    name: parsed.name,
    ...(parsed.sellingPrice !== undefined ? { sellingPrice: parsed.sellingPrice } : {}),
    ...(parsed.purchasePrice !== undefined ? { purchasePrice: parsed.purchasePrice } : {}),
    ...(parsed.quantity !== undefined ? { quantity: parsed.quantity } : {}),
    images: [
      {
        base64,
        mimeType: ctx.mediaType,
        fileName: path.basename(ctx.mediaPath),
        sortOrder: 0,
        primary: true,
        approved: true,
      },
    ],
  },
});

The ... matters.
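
To see the difference side by side, here is a tiny standalone sketch (the function names and the sample product name are only for illustration):

```typescript
// Conditional spread: the key is added only when the value exists.
function withSpread(sellingPrice?: number) {
  return {
    name: "Silver Ring",
    ...(sellingPrice !== undefined ? { sellingPrice } : {}),
  };
}

// Same expression without the spread: the key always exists and holds an object.
function withoutSpread(sellingPrice?: number) {
  return {
    name: "Silver Ring",
    sellingPrice: sellingPrice !== undefined ? { sellingPrice } : {},
  };
}

console.log(JSON.stringify(withSpread(2499))); // {"name":"Silver Ring","sellingPrice":2499}
console.log(JSON.stringify(withSpread())); // {"name":"Silver Ring"}
console.log(JSON.stringify(withoutSpread(2499))); // {"name":"Silver Ring","sellingPrice":{"sellingPrice":2499}}
console.log(JSON.stringify(withoutSpread())); // {"name":"Silver Ring","sellingPrice":{}}
```

A tool schema that expects a number for sellingPrice will see an object in the second form, even when no price was parsed at all.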

Why this is the prime suspect

#

You said:

  • upload used to create 176.3KB
  • after changing handler.ts, it started creating 407 bytes

That strongly points to the new payload shape breaking something in your tool / normalization path.

Next check

After fixing the spread, also log inside your MCP tool before upload:

console.log("base64 length", input.images?.[0]?.base64?.length);
console.log("decoded bytes", Buffer.from(input.images[0].base64, "base64").length);
console.log("sellingPrice", input.sellingPrice, typeof input.sellingPrice);
console.log("purchasePrice", input.purchasePrice, typeof input.purchasePrice);
console.log("quantity", input.quantity, typeof input.quantity);
#

You want:

  • decoded bytes around 180536
  • sellingPrice as number
  • purchasePrice as number
  • quantity as number

So first fix the ... spread bug. I'm pretty sure that's the bad change.

tame notch
#

After changing handler.ts, the uploaded file is the original image, but sellingPrice and purchasePrice are not saved by the API. The product name is stored as:

[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30] Dual heart enamel with pearl earrings

also

console.log("base64 length", input.images?.[0]?.base64?.length);
console.log("decoded bytes", Buffer.from(input.images[0].base64, "base64").length);
console.log("sellingPrice", input.sellingPrice, typeof input.sellingPrice);
console.log("purchasePrice", input.purchasePrice, typeof input.purchasePrice);
console.log("quantity", input.quantity, typeof input.quantity);

in server.js are not printed in the OpenClaw logs.

#

my bad. just checked carefully, everything is correct except product name.

[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30] Dual heart enamel with pearl earrings

#

are you there?

obtuse dirgeBOT
#

Good, that means the image path is fixed 🎉

Now for the missing prices:

1. Your sample caption has no price fields

You showed this body:

[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30] Dual heart enamel with pearl earrings

That contains only the product name.
It does not contain:

  • SP
  • price
  • PP
  • purchase
  • qty

So the parser will correctly return:

  • name = product name
  • sellingPrice = undefined
  • purchasePrice = undefined
  • quantity = undefined

If the prices are written inside the image, that is OCR/vision, not caption parsing.


2. Strip the WhatsApp prefix from the name

Right now the body includes the metadata prefix. Clean it first:

#
function normalizeCaption(body: string | undefined): string {
  const text = (body ?? "").trim();
  if (!text) return "";
  return text.replace(/^\[WhatsApp[^\]]*\]\s*/i, "").trim();
}

Then use that inside your parser.

Example

function extractCaptionFields(body: string | undefined) {
  const raw = normalizeCaption(body);

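  // Keep the placeholder check from the earlier version so media-only bodies
  // such as "<media:image>" do not end up as the product name.
  if (isPlaceholderBody(raw)) {
    return { name: "Untitled product" };
  }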

  const purchaseRegex =
    /\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|β‚Ή)?\s*([0-9][0-9,]*)\b/i;

  const qtyRegex =
    /\b(?:qty|quantity)\b\s*[:=\-]?\s*([0-9][0-9,]*)\b/i;

  const purchaseMatch = raw.match(purchaseRegex);
  const purchasePrice = parseNumber(purchaseMatch?.[1]);

  const withoutPurchase = raw.replace(
    /\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|β‚Ή)?\s*[0-9][0-9,]*\b/gi,
    " ",
#
  );

  const sellingRegex =
    /\b(?:sp|selling(?:\s*price)?|sale(?:\s*price)?|price)\b\s*[:=\-]?\s*(?:rs\.?|β‚Ή)?\s*([0-9][0-9,]*)\b/i;

  const sellingMatch = withoutPurchase.match(sellingRegex);
  const sellingPrice = parseNumber(sellingMatch?.[1]);

  const qtyMatch = raw.match(qtyRegex);
  const quantity = parseNumber(qtyMatch?.[1]);

  const name = raw
    .replace(/\b(?:pp|purchase(?:\s*price)?)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*[0-9][0-9,]*\b/gi, " ")
    .replace(/\b(?:sp|selling(?:\s*price)?|sale(?:\s*price)?|price)\b\s*[:=\-]?\s*(?:rs\.?|₹)?\s*[0-9][0-9,]*\b/gi, " ")
    .replace(/\b(?:qty|quantity)\b\s*[:=\-]?\s*[0-9][0-9,]*\b/gi, " ")
    .replace(/[|,;]+/g, " ")
    .replace(/\s{2,}/g, " ")
    .trim();

  return {
    name: name || "Untitled product",
    ...(sellingPrice !== undefined ? { sellingPrice } : {}),
    ...(purchasePrice !== undefined ? { purchasePrice } : {}),
    ...(quantity !== undefined ? { quantity } : {}),
  };
}
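
The parser above calls parseNumber, whose definition is not shown in this part of the thread. A minimal sketch of what such a helper might look like, assuming it only needs to handle digit strings with optional thousands separators (the caption text and the simplified SP regex below are illustrative, not taken from your handler):

```typescript
// Hypothetical helper: converts a captured digit string such as "1,200" to a number.
// Returns undefined when there is nothing usable, so optional fields can be omitted.
function parseNumber(value: string | undefined): number | undefined {
  if (value === undefined) return undefined;
  // Drop thousands separators before converting.
  const cleaned = value.replace(/,/g, "").trim();
  if (cleaned === "") return undefined;
  const n = Number(cleaned);
  return Number.isFinite(n) ? n : undefined;
}

// Illustrative caption (assumed format, not from the thread).
const sampleCaption = "Dual heart enamel with pearl earrings SP: 1,200 PP 800 qty 5";

// Simplified selling-price pattern, used here only to demonstrate the helper.
const spMatch = sampleCaption.match(/\bsp\b\s*[:=\-]?\s*([0-9][0-9,]*)\b/i);
const samplePrice = parseNumber(spMatch?.[1]);
```

Returning undefined rather than 0 or NaN keeps the conditional spreads in the return statement working: a field with no match is simply left out of the result.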

3. Why console.log from server.js is not showing

#

Because your MCP server is using stdio.

With stdio MCP:

  • stdout is reserved for the MCP protocol
  • console.log() writes to stdout
  • that is the wrong place for debugging

Use:

console.error("sellingPrice", input.sellingPrice, typeof input.sellingPrice);

instead of console.log(...)

or log to a file.

#

Use this in server.js

console.error("base64 length", input.images?.[0]?.base64?.length);
console.error("decoded bytes", Buffer.from(input.images?.[0]?.base64 ?? "", "base64").length);
console.error("sellingPrice", input.sellingPrice, typeof input.sellingPrice);
console.error("purchasePrice", input.purchasePrice, typeof input.purchasePrice);
console.error("quantity", input.quantity, typeof input.quantity);
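
If you end up adding many of these lines, a small formatter can keep them consistent. This is only a sketch: debugLine and the [mcp-debug] tag are hypothetical names, not part of any MCP SDK. It prints the value together with its runtime type, which makes it easy to spot a price arriving as the string "1200" instead of the number 1200.

```typescript
// Hypothetical helper: formats one debug line with the value and its runtime type.
function debugLine(label: string, value: unknown): string {
  return `[mcp-debug] ${label}=${JSON.stringify(value)} (${typeof value})`;
}

// console.error writes to stderr, so stdout stays free for the stdio MCP protocol.
console.error(debugLine("sellingPrice", 1200));
console.error(debugLine("sellingPrice", "1200"));
console.error(debugLine("purchasePrice", undefined));
```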

4. Important distinction

If you want SP/PP/qty from:

  • caption text: current regex approach is fine
  • text printed inside the image: you need OCR / vision in backend or another tool step
#

If you want, send me one real caption example that includes SP/PP/qty, and I'll tighten the regex exactly to your format.

#

Nice, glad we caught it.

Then the only fix you need is to strip the WhatsApp transcript prefix before building name.

Replace name extraction with this

function normalizeCaption(body: string | undefined): string {
  const text = (body ?? "").trim();
  if (!text) return "";
  return text.replace(/^\[WhatsApp[^\]]*\]\s*/i, "").trim();
}

function safeNameFromBody(body: string | undefined): string {
  const text = normalizeCaption(body);
  if (!text) return "Untitled product";
  if (text.startsWith("<media:")) return "Untitled product";
  if (text === "[Image]") return "Untitled product";
  return text;
}

And in your parser, use:

const raw = normalizeCaption(body);

instead of raw body.

Why

Your current body is something like:

#
[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30] Dual heart enamel with pearl earrings

You only want:

Dual heart enamel with pearl earrings

If you want the quickest fix

Even just this is enough:

name: safeNameFromBody(ctx.body),
#

with the updated safeNameFromBody.

If you want, I can send you the final cleaned handler.ts block with this fix merged in.

bold karmaBOT
obtuse dirgeBOT
#

Yep, then just strip this prefix:

[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30]

Use this:

function normalizeCaption(body: string | undefined): string {
  const text = (body ?? "").trim();
  if (!text) return "";
  return text.replace(/^\[WhatsApp[^\]]*\]\s*/i, "").trim();
}

Then:

#
function safeNameFromBody(body: string | undefined): string {
  const text = normalizeCaption(body);
  if (!text) return "Untitled product";
  if (text.startsWith("<media:")) return "Untitled product";
  if (text === "[Image]") return "Untitled product";
  return text;
}

So this:

[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30] Dual heart enamel with pearl earrings

becomes:

#
Dual heart enamel with pearl earrings

If you want the full final handler.ts, I can paste the cleaned version.

#

Yep, I'm here. 🙂

For your case, the fix is just this prefix-strip:

function normalizeCaption(body: string | undefined): string {
  const text = (body ?? "").trim();
  if (!text) return "";
  return text.replace(/^\[WhatsApp[^\]]*\]\s*/i, "").trim();
}

That will turn:

[WhatsApp +919111292909 +22m Mon 2026-04-20 13:13 GMT+5:30] Dual heart enamel with pearl earrings
#

into:

Dual heart enamel with pearl earrings

If you want, I can paste the final full handler.ts now.

tame notch
#

working now thanks

obtuse dirgeBOT
#

Nice, glad it's working now 🦐

If this fixed it, please right-click the solution message → Apps → ✅ Mark Solution so Answer Overflow can index it for the next person.

If you want later, I can still help you clean up the final handler.ts and the reply-suppression plugin so the flow is production-ready.