#custom skill


winged briar
#

I am on version 2026.4.20 and want to create a custom skill in OpenClaw for a repetitive operational task: upgrading OpenClaw itself (pull latest image, rebuild container, pin source clone, verify plugins and cron jobs). Two consumers will use it: me via Claude Code interactively, and a Ditto agent (HostOps role) autonomously later.
A few things I'd like to understand before I draft it:

What is the structure of a custom skill — is it a markdown file, a config block in openclaw.json, or something else? Where does it live in the workspace?
Can a skill include both a checklist/procedure (steps to follow) and verification criteria (what a successful upgrade looks like)?
Can an agent reference a skill explicitly by name in its instructions, or does the skill get injected automatically based on context?
Are there any constraints I should know about — file size, frontmatter format, tool access declarations, etc.?

sonic deltaBOT
#

Let me verify the docs path and the current skill docs so I don't make up any keys or formats.

#

I’ve got the main docs. One more pass through the skill reference and maybe an example skill, then I’ll answer directly.

#

Short version: it’s a folder with a SKILL.md file, not primarily an openclaw.json block.

  • Structure / location

    • Required: SKILL.md
    • Optional: scripts/, references/, assets/
    • Default workspace path: ~/.openclaw/workspace/skills/<skill>/SKILL.md
    • If you want it shared across agents on the same machine, ~/.openclaw/skills/<skill>/SKILL.md is usually the better home.
    • openclaw.json is for loading/visibility/env overrides, not the main skill content.
  • Checklist + verification

    • Yes, absolutely. That’s a good fit.
    • For your upgrade skill, I’d explicitly include:
      1. procedure
      2. verification / success criteria
      3. failure / rollback behavior
      4. report format
    • For fragile ops work, I’d put the exact deterministic commands in scripts/ and keep SKILL.md as the runbook/orchestrator.
  • How agents use it

    • Skills are injected automatically when they’re eligible and visible to that agent.
    • You can also make visibility explicit with:
      • agents.defaults.skills
      • agents.list[].skills
    • So for Ditto/HostOps, I’d allowlist the skill by name there.
    • If you’re using Claude Code through OpenClaw, OpenClaw passes the eligible skill set through to Claude as well.
    • If a skill has disable-model-invocation: true, it won’t be auto-injected into the model prompt.
  • Constraints / gotchas

    • Required frontmatter: name, description
    • Use a simple lowercase id, like openclaw-upgrade
    • Frontmatter keys should be single-line
    • metadata should be a single-line JSON object
    • There is no separate per-skill permission manifest for core tools. Skills mostly teach the model what to do. Actual tool access still comes from the agent/runtime/tool allowlists.
    • There is a loader size limit for SKILL.md (default 256 KB), but I’d keep it much smaller. The bundled skill guidance recommends keeping SKILL.md lean and moving bulky details into references/.
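
To make those constraints concrete, here is a rough Python sketch of a pre-flight check you could run on a draft SKILL.md. It is not OpenClaw's loader: the function name `check_skill_md`, the id regex, and the very simple `key: value` frontmatter parsing are all assumptions for illustration. It only checks the points listed above (required `name` and `description`, a lowercase id, single-line `metadata` JSON, and the default 256 KB size).

```python
from __future__ import annotations

import json
import re

MAX_SKILL_BYTES = 256 * 1024  # default loader limit mentioned above


def check_skill_md(text: str) -> list[str]:
    """Return a list of problems found in a SKILL.md string (empty list = looks fine)."""
    problems: list[str] = []
    if len(text.encode("utf-8")) > MAX_SKILL_BYTES:
        problems.append("SKILL.md exceeds 256 KB")

    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return problems + ["missing frontmatter block"]
    try:
        end = lines.index("---", 1)  # closing delimiter of the frontmatter
    except ValueError:
        return problems + ["frontmatter block is not closed"]

    # Assumption: every frontmatter line is a single-line `key: value` pair.
    fields: dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if not sep:
            problems.append(f"not a single-line key: {line!r}")
            continue
        fields[key.strip()] = value.strip()

    for required in ("name", "description"):
        if not fields.get(required):
            problems.append(f"missing required key: {required}")

    name = fields.get("name", "")
    # Assumption: "simple lowercase id" means lowercase words joined by hyphens.
    if name and not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name):
        problems.append("name should be a simple lowercase id")

    if "metadata" in fields:
        try:
            if not isinstance(json.loads(fields["metadata"]), dict):
                problems.append("metadata is not a JSON object")
        except json.JSONDecodeError:
            problems.append("metadata is not valid single-line JSON")
    return problems
```

An empty list means the draft passes these checks; it says nothing about whether the skill is actually eligible or visible to a given agent.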

Minimal skeleton:

---
name: openclaw-upgrade
description: Upgrade OpenClaw on this host, verify plugins and cron jobs, and report success/failure.
metadata: {"openclaw":{"os":["linux"],"requires":{"bins":["docker","git","openclaw"]}}}
user-invocable: true
---

# OpenClaw Upgrade

## When to use
Use for routine OpenClaw upgrades on this host.

## Procedure
1. Pull the target image/tag.
2. Rebuild or restart the container.
3. Pin/update the source clone.
4. Verify plugin state.
5. Verify cron jobs.
6. Summarize result.

## Success criteria
- target image/tag is present
- container is healthy
- expected source revision is pinned
- expected plugins are enabled
- expected cron jobs still exist and are healthy

## Failure behavior
winged briar
#

yes...

sonic deltaBOT
#

Yep, here’s a solid first pass.

Recommended location for your use case

  • If both you and Ditto should see the same skill on the same machine, put it in:
    ~/.openclaw/skills/openclaw-upgrade/SKILL.md
  • Then allowlist it for your HostOps agent if you want it explicit.

Suggested layout

~/.openclaw/skills/openclaw-upgrade/
├── SKILL.md
├── scripts/
│   ├── run-upgrade.sh
│   └── verify-upgrade.sh
└── references/
    └── expected-state.md

SKILL.md

---
name: openclaw-upgrade
description: Upgrade an OpenClaw deployment on this host, pin the source clone, verify plugins and cron jobs, and report success or failure. Use for repetitive OpenClaw host operations and post-upgrade validation.
metadata: {"openclaw":{"os":["linux"],"requires":{"bins":["docker","git","openclaw"]}}}
user-invocable: true
---

# OpenClaw Upgrade

Use this skill when the task is to upgrade OpenClaw on a managed host and validate that the system is still healthy afterward.

Prefer deterministic scripts in `{baseDir}/scripts/` when they exist instead of re-deriving shell commands each time.

## Inputs to confirm
Before changing anything, identify:
- target image tag, digest, or version
- target source branch / commit to pin
- container or compose service name
- expected plugin set
- expected cron jobs
- acceptable maintenance window or downtime expectation

If the user did not specify the target version, ask once before proceeding.

## Procedure
1. Record current state before changes.
   - current running image / container state
   - current OpenClaw version
   - current source clone branch and commit
   - current enabled plugins
   - current cron job inventory

2. Pull the requested OpenClaw image or fetch the requested source revision.

3. Rebuild or recreate the container using the local deployment method.
   - prefer the repo's documented local deployment command for this host
   - do not improvise a different deployment flow unless the existing one is broken

4. Pin the source clone to the requested branch / commit.
   - ensure the pinned revision is reported back in the final summary

5. Wait for the service to become healthy.

6. Run post-upgrade verification.

## Verification criteria
Do not declare success unless all of these pass:
- the intended image tag or digest is present
- the running container matches the intended deployment
- `openclaw status` succeeds
- the expected OpenClaw version is reported
- the source clone is pinned to the expected branch / commit
- required plugins are still present and enabled
- expected cron jobs still exist
- cron jobs do not show obvious failures or missing dependencies
- any configured gateway or agent health checks return healthy results

## Failure handling
If any verification fails:
1. stop and report the first failing check clearly
2. include relevant diagnostics
3. say whether the failure happened during pull, rebuild, startup, plugin validation, or cron validation
4. do not claim the upgrade succeeded partially
5. if a rollback procedure exists in local references, follow it only when explicitly instructed or when the runbook says rollback is automatic

## Output format
Return a concise ops summary with:
- target version
- resulting version
- image tag / digest
- pinned source commit
- plugin verification result
- cron verification result
- final status: success or failed
- next action if failed

Optional HostOps agent allowlist

{
  agents: {
    list: [
      {
        id: "hostops",
        skills: ["openclaw-upgrade"]
      }
    ]
  }
}

My recommendation
For this kind of operational task, keep the policy + checklist in SKILL.md, but move the exact commands into scripts/run-upgrade.sh and scripts/verify-upgrade.sh. That makes autonomous runs much less flaky.

If you want, next I can turn this into a Docker Compose flavored version with concrete placeholders for:

  • image pull
  • container rebuild
  • source pinning
  • plugin checks
  • cron checks
winged briar
#

yes

sonic deltaBOT
#

I’m going to make that Compose-flavored version concrete and verify the plugin/cron CLI names first, so I don’t hand you fake commands.

#

Nice. Here’s a Docker Compose flavored version with the OpenClaw checks using verified CLI surfaces.

Suggested layout

~/.openclaw/skills/openclaw-upgrade/
├── SKILL.md
└── references/
    └── expected-state.md

SKILL.md

---
name: openclaw-upgrade
description: Upgrade an OpenClaw Docker Compose deployment, pin the source clone, verify plugin and cron health, and report the final state. Use for repeatable HostOps-style OpenClaw upgrades.
metadata: {"openclaw":{"os":["linux"],"requires":{"bins":["docker","git","openclaw"]}}}
user-invocable: true
---

# OpenClaw Upgrade (Docker Compose)

Use this skill when the task is to upgrade an OpenClaw deployment managed with Docker Compose.

Read `{baseDir}/references/expected-state.md` before making changes.

## Inputs to confirm
Collect or confirm:
- compose project directory
- compose service name for OpenClaw
- source clone directory
- target image tag or digest, if image-based
- target git ref, branch, tag, or commit
- expected plugin IDs
- expected cron job IDs

If the target version or ref is missing, ask once before proceeding.

## Preflight
Before changing anything:
1. Record current git branch and commit for the source clone.
2. Record current container image and running container state.
3. Record current OpenClaw version with `openclaw --version`.
4. Record current health with:
   - `openclaw status`
   - `openclaw gateway status`
5. Record plugin inventory with:
   - `openclaw plugins list --json`
6. Record cron inventory with:
   - `openclaw cron status`
   - `openclaw cron list`

## Upgrade procedure
1. Change to the compose project directory.
2. Inspect the Compose service definition.
   - If it uses `image:`, pull the requested image and recreate the service.
   - If it uses `build:`, pin the source clone first, then rebuild and recreate the service.
3. Pin the source clone to the requested ref.
   - Prefer an exact commit when one is provided.
   - Report the final resolved commit SHA.
4. Apply the Compose update.
5. Wait until the service is running again.
6. Run post-upgrade verification.

## Compose execution rules
For image-based services:
- use `docker compose pull <service>`
- then `docker compose up -d --force-recreate <service>`

For build-based services:
- update the source clone first
- then use `docker compose up -d --build <service>`

Do not modify unrelated services unless the Compose stack requires it.

## Verification criteria
Do not declare success unless all of these pass:
1. `openclaw --version` returns successfully.
2. `openclaw status` returns successfully.
3. `openclaw gateway status` returns successfully.
4. `openclaw plugins list --json` shows the expected plugins from `expected-state.md`.
5. `openclaw cron status` returns successfully.
6. `openclaw cron list` still contains the expected cron jobs from `expected-state.md`.
7. The source clone is pinned to the requested ref or resolved commit.
8. The running Compose service matches the intended deployment method and target image or build result.

## Cron validation
For each expected cron job:
- confirm the job still exists in `openclaw cron list`
- if a known job ID is available, inspect recent history with:
  - `openclaw cron runs --id <job-id> --limit 20`

If cron job history shows obvious breakage after the upgrade, treat the upgrade as failed verification.

## Plugin validation
Compare the enabled or discovered plugin inventory against the expected plugin IDs in `expected-state.md`.

If a required plugin is missing or unexpectedly disabled, treat the upgrade as failed verification.

## Failure handling
If any verification step fails:
1. stop and identify the first failed check
2. include relevant command output
3. state whether failure happened during pull, build, recreate, startup, plugin validation, or cron validation
4. do not claim success
5. suggest the smallest safe next action
## Final report format
Return:
- target image tag or digest
- target git ref
- resulting git commit
- resulting OpenClaw version
- plugin verification result
- cron verification result
- final status: success or failed
- next action, only if failed

references/expected-state.md

# Expected State

## Compose
- project_dir: /opt/openclaw
- service_name: openclaw

## Source
- repo_dir: /opt/openclaw-src
- default_ref: main

## Required plugins
- browser
- openai-codex
- discord

## Required cron jobs
- daily-healthcheck
- nightly-summary
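
If a script needs to read that file later, a small parser is enough. This is only a sketch under the assumption that expected-state.md keeps the shape shown above (`## ` headings followed by `- ` bullets); the function name `parse_expected_state` is hypothetical.

```python
from __future__ import annotations


def parse_expected_state(text: str) -> dict[str, list[str]]:
    """Map each `## Heading` to the bullet items listed under it."""
    sections: dict[str, list[str]] = {}
    current: str | None = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # Start a new section; bullets that follow belong to it.
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections
```

With the example file above, `parse_expected_state(text)["Required plugins"]` would give the three plugin IDs, and the `Compose` and `Source` sections come back as raw `key: value` strings that you would still need to split yourself.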

Good pattern for your case

I’d keep:

  • SKILL.md = workflow, safety rules, success criteria
  • expected-state.md = host-specific facts
  • optional later: a tiny scripts/verify.sh if you want super deterministic checks

If you want Ditto/HostOps to always see it

{
  agents: {
    list: [
      {
        id: "hostops",
        skills: ["openclaw-upgrade"]
      }
    ]
  }
}

If you want, I can do one more pass and make this fully concrete for your actual layout if you paste:

  • compose project dir
  • service name
  • repo dir
  • expected plugin IDs
  • expected cron job IDs
winged briar
#

I'm building an openclaw-upgrade skill with a scripts/upgrade-driver.py entry point that calls verify.py for post-upgrade validation. A few things I want to confirm before writing it:

When a skill references a script, how does the agent invoke it — does it use the exec tool, or is there a dedicated skill execution mechanism?
Inside verify.py, when calling openclaw plugins list, openclaw cron list etc. — does the script run inside the container (so CLI is natively available), or from the host (requiring docker exec)? What's the right way to call OpenClaw CLI from a skill script?
Does the os field in skill metadata affect visibility or injection — and what should it be for a Windows host running a Linux Docker container?
Are skills placed in ~/.openclaw/skills/ auto-discovered, or do they need an entry in openclaw.json?

sonic deltaBOT
#

I’m checking the exact runtime behavior for skill scripts and OS gating so I don’t steer you wrong.

#

Yep, a few important clarifications here:

  1. How does a skill script get invoked?
    There is no separate “skill runtime” for scripts/*.py. In practice, the agent runs them with the exec tool by following the instructions in SKILL.md, usually with a command like:

    python {baseDir}/scripts/upgrade-driver.py ...
    

    The one exception is if you make the skill a slash command with command-dispatch: tool, but that dispatches to a tool, not directly to a script.

  2. Does verify.py run on the host or inside a container?
    It runs wherever the agent’s exec environment is:

    • Not sandboxed → script runs on the host
    • Sandboxed → script runs in the sandbox container
    • It does not automatically run inside your OpenClaw app container

So for openclaw plugins list, openclaw cron list, etc. the right approach is:

  • If the host has a working openclaw CLI that can reach the target gateway, call openclaw ... directly from the script
  • If the CLI only exists inside your deployment container, then your script should explicitly use docker exec <container> openclaw ...

In other words, the skill script does not magically inherit the app container context.

For a HostOps skill, I’d usually keep this explicit:

  • host-level checks via openclaw ... when available
  • container-level checks via docker exec ... openclaw ... when that’s the real source of truth
  3. What does metadata.openclaw.os actually gate?
    It affects eligibility / injection visibility. It is checked against the runtime host OS, not the OS inside your managed Docker container.

    So:

    • Windows host running Linux containers → host OS is win32
    • Linux host running Docker → linux
    • macOS host → darwin

    For your case, if the Gateway/exec runtime is on Windows but it manages a Linux Docker container, os: ["linux"] would be wrong.
    I’d either:

    • set it to the actual host runtime OS, or
    • omit os entirely unless you really need gating

    That’s honestly my recommendation here. Omit os unless the skill is truly platform-specific.

  4. Are ~/.openclaw/skills/ skills auto-discovered?
    Yes. ~/.openclaw/skills/ is a built-in skill root and is auto-discovered. You do not need an openclaw.json entry just to load a skill from there.

    You only need config if you want things like:

    • skills.load.extraDirs
    • skills.entries.<name>.*
    • agent allowlists via agents.defaults.skills / agents.list[].skills

Practical recommendation for your skill

For openclaw-upgrade, I’d structure it like this:

  • SKILL.md tells the agent to run:
    • python {baseDir}/scripts/upgrade-driver.py ...
  • upgrade-driver.py decides whether verification should use:
    • direct host CLI calls, or
    • docker exec <container> openclaw ...
  • verify.py should not assume it is already inside the OpenClaw container

Also, for gating:

  • use requires.bins: ["python3", "docker"]
  • only add openclaw to requires.bins if you truly require the host CLI to exist

That last bit matters a lot. If your plan is docker exec ... openclaw ..., then gating on host openclaw would just hide the skill unnecessarily.
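
To see why, here is a rough approximation of the gating in Python. It is not OpenClaw's real eligibility code, just a sketch of the rules described above; `looks_eligible` and its parameters are made-up names. It relies on the fact that Python's `sys.platform` happens to use the same `win32` / `linux` / `darwin` identifiers.

```python
from __future__ import annotations

import json
import shutil
import sys


def looks_eligible(metadata_json: str, *, host_os: str = sys.platform, which=shutil.which) -> tuple[bool, list[str]]:
    """Approximate the os / requires.bins gating; returns (eligible, reasons)."""
    meta = json.loads(metadata_json).get("openclaw", {})
    reasons: list[str] = []

    # An os list gates on the runtime host, not on the managed container.
    allowed_os = meta.get("os")
    if allowed_os and host_os not in allowed_os:
        reasons.append(f"host os {host_os!r} not in {allowed_os}")

    # Every required binary must be found on the host PATH.
    for binary in meta.get("requires", {}).get("bins", []):
        if which(binary) is None:
            reasons.append(f"missing binary on PATH: {binary}")

    return (not reasons, reasons)
```

Run against the earlier metadata with `os: ["linux"]` and `openclaw` in `requires.bins`, a Windows host without a host-side `openclaw` would fail on both counts, which is exactly the case where the skill would silently disappear.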

winged briar
#

yes

sonic deltaBOT
#

Yep, here’s a clean script-driven SKILL.md that supports both host CLI and docker exec modes:

---
name: openclaw-upgrade
description: Upgrade an OpenClaw deployment using a script-driven workflow, then verify plugin and cron health. Use when updating the deployed image or source clone, recreating the service, and validating post-upgrade state.
metadata: {"openclaw":{"requires":{"bins":["python3","docker"]}}}
user-invocable: true
---

# OpenClaw Upgrade

Use this skill for repeatable OpenClaw upgrade operations where deterministic scripts are preferred over ad hoc shell steps.

The primary entry point is:

```bash
python3 {baseDir}/scripts/upgrade-driver.py \
  --mode auto \
  --compose-dir <compose-dir> \
  --service <compose-service> \
  --repo-dir <repo-dir> \
  --target-ref <git-ref-or-tag-or-commit> \
  [--container <container-name>] \
  [--expected-plugins <csv-or-file>] \
  [--expected-crons <csv-or-file>]
```

`upgrade-driver.py` is responsible for calling `verify.py` after the upgrade completes.

## Execution model

There is no dedicated skill runtime for scripts. Run the script via normal command execution.

Prefer the script entry point over re-deriving the upgrade procedure manually unless the user explicitly asks for a manual run.

## Verification mode selection

`upgrade-driver.py` and `verify.py` must support these verification modes:

- `host-cli`
  - use when the host environment has a working `openclaw` CLI and it can reach the target gateway
  - call commands like:
    - `openclaw --version`
    - `openclaw status`
    - `openclaw gateway status`
    - `openclaw plugins list --json`
    - `openclaw cron status`
    - `openclaw cron list`

- `docker-exec`
  - use when the target OpenClaw CLI is available inside the deployment container instead of the host
  - call commands like:
    - `docker exec <container> openclaw --version`
    - `docker exec <container> openclaw status`
    - `docker exec <container> openclaw gateway status`
    - `docker exec <container> openclaw plugins list --json`
    - `docker exec <container> openclaw cron status`
    - `docker exec <container> openclaw cron list`

- `auto`
  - prefer `host-cli` when the host has a working `openclaw` CLI for the target deployment
  - otherwise fall back to `docker-exec`

Do not assume the script is already running inside the OpenClaw app container.

## Inputs to gather before running

Confirm or infer:
- compose project directory
- compose service name
- source repository directory
- target git ref or image tag
- deployment container name if `docker-exec` mode is needed
- expected plugin IDs
- expected cron job IDs

Ask once if any critical target details are missing.

## Required behavior

1. Record pre-upgrade state.
2. Perform the upgrade using the deployment’s existing Docker Compose workflow.
3. Pin the source clone to the requested ref.
4. Recreate or rebuild the target service as required.
5. Call `verify.py` for post-upgrade validation.
6. Return a concise final report.

## Verification requirements

Do not declare success unless all of the following pass:
- OpenClaw CLI responds successfully in the selected verification mode
- reported OpenClaw version is readable
- gateway status is healthy
- expected plugins are present
- cron subsystem is healthy
- expected cron jobs still exist
- source clone is pinned to the requested ref or resolved commit
- the running deployment matches the intended target state

## Failure handling

If verification fails:
- stop at the first failing validation group
- report whether failure occurred in upgrade or verification
- include the exact failing command and concise output
- do not claim partial success as success
- recommend the smallest safe next action

## Final report format

Return:
- verification mode used
- compose dir and service
- repo dir
- target ref
- resolved commit
- resulting OpenClaw version
- plugin check result
- cron check result
- final status: success or failed

A couple of practical notes:

  • I left out openclaw from requires.bins on purpose. That keeps the skill visible even if you plan to verify via docker exec ... openclaw ....
  • If this skill is host CLI only, then add openclaw to requires.bins.
  • For Windows host + Linux Docker container, I’d still omit os unless you really need gating.

If you want, I can do the next piece too: a matching CLI contract for:

  • scripts/upgrade-driver.py
  • scripts/verify.py

with concrete flags and return codes.

winged briar
#

yes

sonic deltaBOT
#

Perfect. Here’s a good CLI contract for the two scripts.

upgrade-driver.py

Suggested flags

python3 upgrade-driver.py \
  --mode auto|host-cli|docker-exec \
  --compose-dir /path/to/compose \
  --service openclaw \
  --repo-dir /path/to/repo \
  --target-ref main|v2026.4.20|<commit> \
  [--container openclaw] \
  [--compose-file docker-compose.yml] \
  [--expected-plugins browser,discord,openai-codex] \
  [--expected-crons nightly-summary,daily-healthcheck] \
  [--json]

Responsibilities

  1. validate inputs
  2. resolve verification mode
  3. capture pre-upgrade state
  4. pin repo to target ref
  5. run the Docker Compose upgrade flow
  6. call verify.py
  7. emit one final result object
  8. exit non-zero on upgrade or verification failure

Recommended behavior

  • auto
    • use host-cli if host openclaw works for the target deployment
    • otherwise require --container and use docker-exec
  • if Compose service uses image:
    • docker compose pull <service>
    • docker compose up -d --force-recreate <service>
  • if Compose service uses build:
    • pin repo first
    • docker compose up -d --build <service>

verify.py

Suggested flags

python3 verify.py \
  --mode host-cli|docker-exec \
  [--container openclaw] \
  [--expected-plugins browser,discord,openai-codex] \
  [--expected-crons nightly-summary,daily-healthcheck] \
  [--json]

Responsibilities

  1. run OpenClaw health commands in the selected mode
  2. collect:
    • openclaw --version
    • openclaw status
    • openclaw gateway status
    • openclaw plugins list --json
    • openclaw cron status
    • openclaw cron list
  3. compare actual plugins/crons with expected sets
  4. emit a structured verification result
  5. exit non-zero on verification failure

Output contract

I’d make both scripts print JSON to stdout when --json is set.

verify.py output

{
  "ok": true,
  "mode": "docker-exec",
  "version": "2026.4.20",
  "plugins": {
    "ok": true,
    "expected": ["browser", "discord", "openai-codex"],
    "missing": []
  },
  "cron": {
    "ok": true,
    "expected": ["nightly-summary", "daily-healthcheck"],
    "missing": []
  },
  "checks": {
    "status": true,
    "gatewayStatus": true,
    "pluginsList": true,
    "cronStatus": true,
    "cronList": true
  }
}
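
One way to assemble that object, as a sketch only: the helper names `compare` and `build_result` are hypothetical, and it assumes you have already reduced the plugin and cron inventories to sets of IDs.

```python
from __future__ import annotations


def compare(expected: set[str], actual: set[str]) -> dict:
    """Report which expected IDs are absent from the actual inventory."""
    missing = sorted(expected - actual)
    return {"ok": not missing, "expected": sorted(expected), "missing": missing}


def build_result(
    mode: str,
    version: str,
    checks: dict[str, bool],
    expected_plugins: set[str],
    actual_plugins: set[str],
    expected_crons: set[str],
    actual_crons: set[str],
) -> dict:
    """Combine per-command checks and inventory comparisons into one result."""
    plugins = compare(expected_plugins, actual_plugins)
    cron = compare(expected_crons, actual_crons)
    # Overall success requires every command check and both comparisons to pass.
    ok = all(checks.values()) and plugins["ok"] and cron["ok"]
    return {
        "ok": ok,
        "mode": mode,
        "version": version,
        "plugins": plugins,
        "cron": cron,
        "checks": checks,
    }
```

Extra plugins or cron jobs that are present but not expected do not fail the comparison here; only missing ones do. Whether that is the right policy for your host is a judgment call.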

upgrade-driver.py output

{
  "ok": true,
  "phase": "complete",
  "mode": "docker-exec",
  "targetRef": "main",
  "resolvedCommit": "abc1234",
  "composeDir": "/opt/openclaw",
  "service": "openclaw",
  "verification": {
    "ok": true
  }
}

Exit codes

I’d keep them simple:

0  success
10 bad input / missing args
20 preflight failure
30 upgrade failure
40 verification failure
50 internal script error

And specifically:

  • if verify.py fails validation, return 40
  • if upgrade-driver.py calls verify.py and it fails, propagate 40
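
As a sketch of how that propagation could look, assuming the codes above: the `Exit` enum and `driver_exit_code` are hypothetical names, and treating any verify.py code other than 0 or 40 as an internal error is my own choice here, not part of the contract.

```python
from __future__ import annotations

from enum import IntEnum


class Exit(IntEnum):
    OK = 0
    BAD_INPUT = 10
    PREFLIGHT = 20
    UPGRADE = 30
    VERIFY = 40
    INTERNAL = 50


def driver_exit_code(upgrade_ok: bool, verify_returncode: int | None) -> Exit:
    """Pick the driver's exit code from the upgrade outcome and verify.py's return code."""
    if not upgrade_ok:
        return Exit.UPGRADE
    if verify_returncode is None:
        # The upgrade succeeded but verify.py never ran.
        return Exit.INTERNAL
    if verify_returncode == Exit.OK:
        return Exit.OK
    if verify_returncode == Exit.VERIFY:
        return Exit.VERIFY  # propagate 40 unchanged
    return Exit.INTERNAL  # assumption: any other code is a script error
```

Because `IntEnum` members are plain integers, the result can be passed straight to `sys.exit()`.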

One practical recommendation

Make upgrade-driver.py the only script the skill calls.
Let it invoke verify.py internally.


That gives you:

  • one stable skill entry point
  • cleaner agent instructions
  • one JSON result for both Claude Code and Ditto

If you want, I can also sketch a small argparse skeleton for both files so you can paste it directly into scripts/upgrade-driver.py and scripts/verify.py.

winged briar
#

sure.. that would be useful

sonic deltaBOT
#

I’m going to sanity-check whether openclaw cron list has JSON output before I hand you a Python stub that depends on it.

#

Yep. I’d start with something like this.

One small tweak I’d add: --deploy-kind auto|image|build
That makes the upgrade path a lot less guessy.

scripts/upgrade-driver.py

#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from pathlib import Path
EXIT_BAD_INPUT = 10
EXIT_PREFLIGHT = 20
EXIT_UPGRADE = 30
EXIT_VERIFY = 40
EXIT_INTERNAL = 50

BASE_DIR = Path(__file__).resolve().parent
VERIFY_PY = BASE_DIR / "verify.py"


def run(cmd: list[str], *, cwd: str | None = None, check: bool = True) -> subprocess.CompletedProcess[str]:
    proc = subprocess.run(cmd, cwd=cwd, text=True, capture_output=True)
    if check and proc.returncode != 0:
        raise RuntimeError(
            f"command failed ({proc.returncode}): {' '.join(cmd)}\nSTDOUT:\n{proc.stdout}\nSTDERR:\n{proc.stderr}"
        )
    return proc


def run_json(cmd: list[str], *, cwd: str | None = None) -> object:
    proc = run(cmd, cwd=cwd)
    return json.loads(proc.stdout)


def emit(payload: dict, as_json: bool) -> None:
    if as_json:
        print(json.dumps(payload, indent=2, sort_keys=True))
    else:
        print(payload)

def resolve_mode(args: argparse.Namespace) -> str:
    if args.mode != "auto":
        if args.mode == "docker-exec" and not args.container:
            raise SystemExit("--container is required for docker-exec mode")
        return args.mode

    # auto mode
    try:
        run(["openclaw", "--version"])
        run(["openclaw", "status", "--json"])
        return "host-cli"
    except Exception:
        if args.container:
            return "docker-exec"
        raise SystemExit("auto mode could not use host-cli and no --container was provided")


def resolve_deploy_kind(args: argparse.Namespace) -> str:
    if args.deploy_kind != "auto":
        return args.deploy_kind

    # Keep auto conservative. You can replace this with compose config inspection later.
    compose_text = run(
        ["docker", "compose", "-f", args.compose_file, "config"],
        cwd=args.compose_dir,
    ).stdout
    service_block_hint = f"{args.service}:"
    if service_block_hint in compose_text and "build:" in compose_text:
        return "build"
    return "image"


def pin_repo(repo_dir: str, target_ref: str) -> str:
    run(["git", "fetch", "--all", "--tags"], cwd=repo_dir)
    run(["git", "checkout", target_ref], cwd=repo_dir)
    commit = run(["git", "rev-parse", "HEAD"], cwd=repo_dir).stdout.strip()
    return commit


def upgrade_with_compose(args: argparse.Namespace, deploy_kind: str) -> None:
    if deploy_kind == "image":
        run(["docker", "compose", "-f", args.compose_file, "pull", args.service], cwd=args.compose_dir)
        run(
            ["docker", "compose", "-f", args.compose_file, "up", "-d", "--force-recreate", args.service],
            cwd=args.compose_dir,
        )
    elif deploy_kind == "build":
        run(
            ["docker", "compose", "-f", args.compose_file, "up", "-d", "--build", args.service],
            cwd=args.compose_dir,
        )
    else:
        raise RuntimeError(f"unknown deploy kind: {deploy_kind}")


def call_verify(args: argparse.Namespace, mode: str) -> dict:
    cmd = [
        sys.executable,
        str(VERIFY_PY),
        "--mode", mode,
        "--expected-plugins", args.expected_plugins or "",
        "--expected-crons", args.expected_crons or "",
        "--json",
    ]
    if args.container:
        cmd += ["--container", args.container]

    proc = subprocess.run(cmd, text=True, capture_output=True)
    if proc.returncode != 0:
        try:
            payload = json.loads(proc.stdout)
        except Exception:
            payload = {
                "ok": False,
                "error": "verify.py failed",
                "stdout": proc.stdout,
                "stderr": proc.stderr,
            }
        payload["_verify_exit_code"] = proc.returncode
        raise VerifyFailed(payload)

    return json.loads(proc.stdout)


class VerifyFailed(Exception):
    def __init__(self, payload: dict):
        super().__init__("verification failed")
        self.payload = payload

def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser()
    p.add_argument("--mode", choices=["auto", "host-cli", "docker-exec"], default="auto")
    p.add_argument("--deploy-kind", choices=["auto", "image", "build"], default="auto")
    p.add_argument("--compose-dir", required=True)
    p.add_argument("--compose-file", default="docker-compose.yml")
    p.add_argument("--service", required=True)
    p.add_argument("--repo-dir", required=True)
    p.add_argument("--target-ref", required=True)
    p.add_argument("--container")
    p.add_argument("--expected-plugins")
    p.add_argument("--expected-crons")
    p.add_argument("--json", action="store_true")
    return p.parse_args()


def main() -> int:
    args = parse_args()

    result: dict = {
        "ok": False,
        "phase": "starting",
        "targetRef": args.target_ref,
        "composeDir": args.compose_dir,
        "composeFile": args.compose_file,
        "service": args.service,
        "repoDir": args.repo_dir,
    }
    try:
        mode = resolve_mode(args)
        deploy_kind = resolve_deploy_kind(args)
        result["mode"] = mode
        result["deployKind"] = deploy_kind

        result["phase"] = "pinning-repo"
        resolved_commit = pin_repo(args.repo_dir, args.target_ref)
        result["resolvedCommit"] = resolved_commit

        result["phase"] = "upgrading"
        upgrade_with_compose(args, deploy_kind)

        result["phase"] = "verifying"
        verification = call_verify(args, mode)
        result["verification"] = verification

        result["ok"] = bool(verification.get("ok"))
        result["phase"] = "complete"
        emit(result, args.json)
        return 0 if result["ok"] else EXIT_VERIFY

    except SystemExit as e:
        result["phase"] = "bad-input"
        result["error"] = str(e)
        emit(result, args.json)
        return EXIT_BAD_INPUT
    except VerifyFailed as e:
        result["phase"] = "verification"
        result["verification"] = e.payload
        emit(result, args.json)
        return EXIT_VERIFY
    except RuntimeError as e:
        # result["phase"] still holds the phase that was active when the command failed.
        result["error"] = str(e)
        emit(result, args.json)
        return EXIT_UPGRADE if result["phase"] in {"upgrading", "pinning-repo"} else EXIT_PREFLIGHT
    except Exception as e:
        result["phase"] = "internal-error"
        result["error"] = repr(e)
        emit(result, args.json)
        return EXIT_INTERNAL


if __name__ == "__main__":
    raise SystemExit(main())

scripts/verify.py

#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from typing import Any

EXIT_VERIFY = 40
EXIT_INTERNAL = 50


def run(cmd: list[str], *, check: bool = True) -> subprocess.CompletedProcess[str]:
    proc = subprocess.run(cmd, text=True, capture_output=True)
    if check and proc.returncode != 0:
        raise RuntimeError(
            f"command failed ({proc.returncode}): {' '.join(cmd)}\nSTDOUT:\n{proc.stdout}\nSTDERR:\n{proc.stderr}"
        )
    return proc


def run_json(cmd: list[str]) -> Any:
    return json.loads(run(cmd).stdout)

def emit(payload: dict, as_json: bool) -> None:
    if as_json:
        print(json.dumps(payload, indent=2, sort_keys=True))
    else:
        print(payload)


def csv_set(raw: str | None) -> set[str]:
    if not raw:
        return set()
    return {item.strip() for item in raw.split(",") if item.strip()}


def openclaw_prefix(mode: str, container: str | None) -> list[str]:
    if mode == "host-cli":
        return ["openclaw"]
    if mode == "docker-exec":
        if not container:
            raise SystemExit("--container is required for docker-exec mode")
        return ["docker", "exec", container, "openclaw"]
    raise SystemExit(f"unsupported mode: {mode}")


def normalize_plugin_ids(payload: Any) -> set[str]:
    # Adjust this once you've seen your local JSON shape.
    if isinstance(payload, dict):
        if "plugins" in payload:
            return normalize_plugin_ids(payload["plugins"])
        return {
winged briar
sonic deltaBOT
#

Yep, here’s a complete verify.py skeleton you can drop in as a starting point:

#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from typing import Any

EXIT_VERIFY = 40
EXIT_INTERNAL = 50

def run(cmd: list[str], *, check: bool = True) -> subprocess.CompletedProcess[str]:
    proc = subprocess.run(cmd, text=True, capture_output=True)
    if check and proc.returncode != 0:
        raise RuntimeError(
            f"command failed ({proc.returncode}): {' '.join(cmd)}\n"
            f"STDOUT:\n{proc.stdout}\n"
            f"STDERR:\n{proc.stderr}"
        )
    return proc


def run_json(cmd: list[str]) -> Any:
    proc = run(cmd)
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as e:
        raise RuntimeError(
            f"expected JSON from command but could not parse it: {' '.join(cmd)}\n"
            f"STDOUT:\n{proc.stdout}\n"
            f"STDERR:\n{proc.stderr}"
        ) from e


def emit(payload: dict[str, Any], as_json: bool) -> None:
    if as_json:
        print(json.dumps(payload, indent=2, sort_keys=True))
    else:
        print(payload)

def csv_set(raw: str | None) -> set[str]:
    if not raw:
        return set()
    return {item.strip() for item in raw.split(",") if item.strip()}


def openclaw_prefix(mode: str, container: str | None) -> list[str]:
    if mode == "host-cli":
        return ["openclaw"]
    if mode == "docker-exec":
        if not container:
            raise SystemExit("--container is required for docker-exec mode")
        return ["docker", "exec", container, "openclaw"]
    raise SystemExit(f"unsupported mode: {mode}")

def _extract_named_items(payload: Any) -> set[str]:
    """
    Heuristic extractor for IDs/names from arbitrary JSON payloads.
    You will probably tweak this once you see your exact local JSON shape.
    """
    found: set[str] = set()

    if isinstance(payload, dict):
        for key, value in payload.items():
            if key in {"id", "name", "pluginId", "jobId"} and isinstance(value, str) and value.strip():
                found.add(value.strip())
            found |= _extract_named_items(value)

    elif isinstance(payload, list):
        for item in payload:
            found |= _extract_named_items(item)

    return found


def normalize_plugin_ids(payload: Any) -> set[str]:
    """
    Try a few common shapes first, then fall back to recursive extraction.
    """
    if isinstance(payload, dict):
        if "plugins" in payload:
            return normalize_plugin_ids(payload["plugins"])
        if "items" in payload:
            return normalize_plugin_ids(payload["items"])

    if isinstance(payload, list):
        results: set[str] = set()
        for item in payload:
            if isinstance(item, dict):
                candidate = item.get("id") or item.get("name")
                if isinstance(candidate, str) and candidate.strip():
                    results.add(candidate.strip())
        if results:
            return results

    return _extract_named_items(payload)


def normalize_cron_ids(payload: Any) -> set[str]:
    """
    Try a few common shapes first, then fall back to recursive extraction.
    """
    if isinstance(payload, dict):
        if "jobs" in payload:
            return normalize_cron_ids(payload["jobs"])
        if "items" in payload:
            return normalize_cron_ids(payload["items"])

    if isinstance(payload, list):
        results: set[str] = set()
        for item in payload:
            if isinstance(item, dict):
                candidate = item.get("id") or item.get("name") or item.get("jobId")
                if isinstance(candidate, str) and candidate.strip():
                    results.add(candidate.strip())
        if results:
            return results

    return _extract_named_items(payload)


def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser()
    p.add_argument("--mode", choices=["host-cli", "docker-exec"], required=True)
    p.add_argument("--container")
    p.add_argument("--expected-plugins")
    p.add_argument("--expected-crons")
    p.add_argument("--json", action="store_true")
    return p.parse_args()


def main() -> int:
    args = parse_args()

    expected_plugins = csv_set(args.expected_plugins)
    expected_crons = csv_set(args.expected_crons)

    result: dict[str, Any] = {
        "ok": False,
        "mode": args.mode,
    }

    try:
        oc = openclaw_prefix(args.mode, args.container)

        version = run(oc + ["--version"]).stdout.strip()
        status_payload = run_json(oc + ["status", "--json"])
        gateway_status_payload = run_json(oc + ["gateway", "status", "--json"])
        plugins_payload = run_json(oc + ["plugins", "list", "--json"])
        cron_status_payload = run_json(oc + ["cron", "status", "--json"])
        cron_list_payload = run_json(oc + ["cron", "list", "--json"])

        actual_plugins = normalize_plugin_ids(plugins_payload)
        actual_crons = normalize_cron_ids(cron_list_payload)

        missing_plugins = sorted(expected_plugins - actual_plugins)
        missing_crons = sorted(expected_crons - actual_crons)

        plugins_ok = not missing_plugins
        crons_ok = not missing_crons

        result.update(
            {
                "version": version,
                "checks": {
                    "status": True,
                    "gatewayStatus": True,
                    "pluginsList": True,
                    "cronStatus": True,
                    "cronList": True,
                },
                "plugins": {
                    "ok": plugins_ok,
                    "expected": sorted(expected_plugins),
                    "actual": sorted(actual_plugins),
                    "missing": missing_plugins,
                },
                "cron": {
                    "ok": crons_ok,
                    "expected": sorted(expected_crons),
                    "actual": sorted(actual_crons),
                    "missing": missing_crons,
                },
                "raw": {
                    "status": status_payload,
                    "gatewayStatus": gateway_status_payload,
                    "plugins": plugins_payload,
                    "cronStatus": cron_status_payload,
                    "cronList": cron_list_payload,
                },
            }
        )

        result["ok"] = plugins_ok and crons_ok
        emit(result, args.json)
        return 0 if result["ok"] else EXIT_VERIFY

    except SystemExit:
        raise
    except Exception as e:
        result["error"] = repr(e)
        emit(result, args.json)
        return EXIT_INTERNAL

if __name__ == "__main__":
    raise SystemExit(main())

Two quick notes:

  • the normalization helpers are intentionally heuristic, since your local --json shapes may differ a bit

  • once you run:

    • openclaw plugins list --json
    • openclaw cron list --json

    you’ll probably tighten normalize_plugin_ids() and normalize_cron_ids() to match the real schema exactly
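As a minimal sketch of what "tightening" could look like, assume purely for illustration that `openclaw plugins list --json` returns `{"plugins": [{"id": "...", "enabled": true}, ...]}`. That shape and the `enabled` field are assumptions, not the documented schema, and `strict_plugin_ids` is a hypothetical name; swap in whatever your local output actually shows. The idea is that a strict normalizer fails loudly on an unexpected shape instead of guessing:

```python
from __future__ import annotations

from typing import Any


def strict_plugin_ids(payload: Any) -> set[str]:
    """Extract plugin IDs from one assumed shape; raise on anything else.

    ASSUMPTION: payload looks like {"plugins": [{"id": str, "enabled": bool}, ...]}.
    This is a hypothetical schema; adjust it to your real --json output.
    """
    if not isinstance(payload, dict) or not isinstance(payload.get("plugins"), list):
        raise ValueError("unexpected plugins payload: expected {'plugins': [...]}")

    ids: set[str] = set()
    for entry in payload["plugins"]:
        if not isinstance(entry, dict) or not isinstance(entry.get("id"), str):
            raise ValueError(f"unexpected plugin entry: {entry!r}")
        # Count a plugin unless it is explicitly marked as disabled.
        if entry.get("enabled", True):
            ids.add(entry["id"].strip())
    return ids
```

Compared with the heuristic helpers, this trades flexibility for a clear error when an upgrade changes the output format, which is usually what you want in unattended runs.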

If you want, I can also send a cleaned-up final bundle with:

  • SKILL.md
  • upgrade-driver.py
  • verify.py

all aligned to the same flags so you can paste the whole thing in one shot.

winged briar
#

yes

sonic deltaBOT
#

Yep, here’s a paste-ready bundle.

Layout

~/.openclaw/skills/openclaw-upgrade/
├── SKILL.md
└── scripts/
    ├── upgrade-driver.py
    └── verify.py

SKILL.md

---
name: openclaw-upgrade
description: Upgrade an OpenClaw deployment with Docker Compose, pin the source clone, and verify post-upgrade plugin and cron health using deterministic scripts.
metadata: {"openclaw":{"requires":{"bins":["python3","docker","git"]}}}
user-invocable: true
---

# OpenClaw Upgrade

Use this skill for repeatable OpenClaw upgrade operations where deterministic scripts are preferred over ad hoc shell steps.

Primary entry point:

```bash
python3 {baseDir}/scripts/upgrade-driver.py \
  --mode auto \
  --deploy-kind auto \
  --compose-dir <compose-dir> \
  --compose-file docker-compose.yml \
  --service <compose-service> \
  --repo-dir <repo-dir> \
  --target-ref <git-ref-or-tag-or-commit> \
  [--container <container-name>] \
  [--expected-plugins plugin-a,plugin-b] \
  [--expected-crons job-a,job-b] \
  --json
```


## Execution model

There is no dedicated skill runtime for scripts. Run the script via normal command execution.

Prefer the script entry point over re-deriving the upgrade manually unless the user explicitly asks for a manual run.

## Mode selection

The script supports:

- `host-cli`
  - use host `openclaw` CLI directly
- `docker-exec`
  - use `docker exec <container> openclaw ...`
- `auto`
  - prefer `host-cli` if the host `openclaw` CLI works for the target deployment
  - otherwise fall back to `docker-exec` if `--container` is provided

Do not assume the script is already running inside the OpenClaw app container.

## Deploy kind selection
The script supports:

- `image`
  - pull image, then recreate the service
- `build`
  - pin repo first, then rebuild/recreate the service
- `auto`
  - inspect compose config and choose conservatively

## Inputs to gather before running

Confirm or infer:
- compose project directory
- compose file
- compose service name
- source repository directory
- target git ref, tag, or commit
- deployment container name if `docker-exec` mode is needed
- expected plugin IDs
- expected cron job IDs

Ask once if any critical target details are missing.

## Required behavior

1. Record pre-upgrade state.
2. Pin the repo to the target ref.
3. Run the Docker Compose upgrade flow.
4. Call `verify.py`.
5. Return a concise final report.

## Verification requirements

Do not declare success unless all of the following pass:
- OpenClaw CLI responds successfully in the selected mode
- `openclaw --version` succeeds
- `openclaw status --json` succeeds
- `openclaw gateway status --json` succeeds
- `openclaw plugins list --json` succeeds
- `openclaw cron status --json` succeeds
- `openclaw cron list --json` succeeds
- expected plugins are present
- expected cron jobs are present
- source clone is pinned to the requested ref or resolved commit
## Failure handling

If verification fails:
- stop at the first failing validation group
- report whether failure occurred during upgrade or verification
- include the failing command and concise output
- do not claim partial success as success
- recommend the smallest safe next action

## Final report format

Return:
- mode used
- deploy kind used
- compose dir / file / service
- repo dir
- target ref
- resolved commit
- resulting OpenClaw version
- plugin check result
- cron check result
- final status: success or failed
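
One gap worth noting: the last item in the verification list, the pinned source clone, is recorded by the driver as `resolvedCommit` but is not independently checked by `verify.py`. If you want that check to be explicit, something like the sketch below could be added. `is_pinned` and its `git` parameter are hypothetical names, and the `git` callable is injectable only so the comparison can be tested without a real repository:

```python
from __future__ import annotations

import subprocess
from typing import Callable


def _git(repo_dir: str, *args: str) -> str:
    """Run a git command in repo_dir and return its trimmed stdout."""
    proc = subprocess.run(
        ["git", *args], cwd=repo_dir, text=True, capture_output=True, check=True
    )
    return proc.stdout.strip()


def is_pinned(
    repo_dir: str,
    target_ref: str,
    git: Callable[..., str] = _git,
) -> bool:
    """True if HEAD in repo_dir is the commit that target_ref resolves to."""
    head = git(repo_dir, "rev-parse", "HEAD")
    # The ^{commit} suffix peels annotated tags down to the commit they point at.
    wanted = git(repo_dir, "rev-parse", f"{target_ref}^{{commit}}")
    return head == wanted
```

Comparing resolved commits rather than ref names means a tag, branch, or raw SHA all pass the same check.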

scripts/upgrade-driver.py

#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from pathlib import Path
from typing import Any

EXIT_BAD_INPUT = 10
EXIT_PREFLIGHT = 20
EXIT_UPGRADE = 30
EXIT_VERIFY = 40
EXIT_INTERNAL = 50

BASE_DIR = Path(__file__).resolve().parent
VERIFY_PY = BASE_DIR / "verify.py"

def run(cmd: list[str], *, cwd: str | None = None, check: bool = True) -> subprocess.CompletedProcess[str]:
    proc = subprocess.run(cmd, cwd=cwd, text=True, capture_output=True)
    if check and proc.returncode != 0:
        raise RuntimeError(
            f"command failed ({proc.returncode}): {' '.join(cmd)}\n"
            f"STDOUT:\n{proc.stdout}\n"
            f"STDERR:\n{proc.stderr}"
        )
    return proc


def run_json(cmd: list[str], *, cwd: str | None = None) -> Any:
    proc = run(cmd, cwd=cwd)
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as e:
        raise RuntimeError(
            f"expected JSON from command but could not parse it: {' '.join(cmd)}\n"
            f"STDOUT:\n{proc.stdout}\n"
            f"STDERR:\n{proc.stderr}"
        ) from e


def emit(payload: dict[str, Any], as_json: bool) -> None:
    if as_json:
        print(json.dumps(payload, indent=2, sort_keys=True))
    else:
        print(payload)

def resolve_mode(args: argparse.Namespace) -> str:
    if args.mode != "auto":
        if args.mode == "docker-exec" and not args.container:
            raise SystemExit("--container is required for docker-exec mode")
        return args.mode

    try:
        run(["openclaw", "--version"])
        run(["openclaw", "status", "--json"])
        return "host-cli"
    except Exception:
        if args.container:
            return "docker-exec"
        raise SystemExit("auto mode could not use host-cli and no --container was provided")


def resolve_deploy_kind(args: argparse.Namespace) -> str:
    if args.deploy_kind != "auto":
        return args.deploy_kind

    compose_text = run(
        ["docker", "compose", "-f", args.compose_file, "config"],
        cwd=args.compose_dir,
    ).stdout

    service_hint = f"{args.service}:"
    if service_hint in compose_text and "build:" in compose_text:
        return "build"
    return "image"


def pin_repo(repo_dir: str, target_ref: str) -> str:
    run(["git", "fetch", "--all", "--tags"], cwd=repo_dir)
    run(["git", "checkout", target_ref], cwd=repo_dir)
    return run(["git", "rev-parse", "HEAD"], cwd=repo_dir).stdout.strip()


def capture_pre_state(mode: str, container: str | None) -> dict[str, Any]:
    pre: dict[str, Any] = {}

    pre["gitVersion"] = run(["git", "--version"]).stdout.strip()
    pre["dockerVersion"] = run(["docker", "--version"]).stdout.strip()

    if mode == "host-cli":
        oc = ["openclaw"]
    else:
        oc = ["docker", "exec", container, "openclaw"]  # type: ignore[list-item]

    pre["openclawVersion"] = run(oc + ["--version"]).stdout.strip()
    pre["status"] = run_json(oc + ["status", "--json"])
    pre["gatewayStatus"] = run_json(oc + ["gateway", "status", "--json"])
    return pre


def upgrade_with_compose(args: argparse.Namespace, deploy_kind: str) -> None:
    if deploy_kind == "image":
        run(["docker", "compose", "-f", args.compose_file, "pull", args.service], cwd=args.compose_dir)
        run(
            ["docker", "compose", "-f", args.compose_file, "up", "-d", "--force-recreate", args.service],
            cwd=args.compose_dir,
        )
        return

    if deploy_kind == "build":
        run(
            ["docker", "compose", "-f", args.compose_file, "up", "-d", "--build", args.service],
            cwd=args.compose_dir,
        )
        return

    raise RuntimeError(f"unknown deploy kind: {deploy_kind}")


class VerifyFailed(Exception):
    def __init__(self, payload: dict[str, Any]):
        super().__init__("verification failed")
        self.payload = payload


def call_verify(args: argparse.Namespace, mode: str) -> dict[str, Any]:
    cmd = [
        sys.executable,
        str(VERIFY_PY),
        "--mode",
        mode,
        "--expected-plugins",
        args.expected_plugins or "",
        "--expected-crons",
        args.expected_crons or "",
        "--json",
    ]
    if args.container:
        cmd += ["--container", args.container]

    proc = subprocess.run(cmd, text=True, capture_output=True)

    try:
        payload = json.loads(proc.stdout) if proc.stdout.strip() else {
            "ok": False,
            "error": "verify.py produced no JSON output",
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }
    except json.JSONDecodeError:
        payload = {
            "ok": False,
            "error": "verify.py returned non-JSON output",
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }

    if proc.returncode != 0:
        payload["_verify_exit_code"] = proc.returncode
        raise VerifyFailed(payload)

    return payload


def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser()
    p.add_argument("--mode", choices=["auto", "host-cli", "docker-exec"], default="auto")
    p.add_argument("--deploy-kind", choices=["auto", "image", "build"], default="auto")
    p.add_argument("--compose-dir", required=True)
    p.add_argument("--compose-file", default="docker-compose.yml")
    p.add_argument("--service", required=True)
    p.add_argument("--repo-dir", required=True)
    p.add_argument("--target-ref", required=True)
    p.add_argument("--container")
    p.add_argument("--expected-plugins")
    p.add_argument("--expected-crons")
    p.add_argument("--json", action="store_true")
    return p.parse_args()

def main() -> int:
    args = parse_args()

    result: dict[str, Any] = {
        "ok": False,
        "phase": "starting",
        "targetRef": args.target_ref,
        "composeDir": args.compose_dir,
        "composeFile": args.compose_file,
        "service": args.service,
        "repoDir": args.repo_dir,
    }

    try:
        mode = resolve_mode(args)
        deploy_kind = resolve_deploy_kind(args)

        result["mode"] = mode
        result["deployKind"] = deploy_kind

        result["phase"] = "preflight"
        result["preflight"] = capture_pre_state(mode, args.container)

        result["phase"] = "pinning-repo"
        resolved_commit = pin_repo(args.repo_dir, args.target_ref)
        result["resolvedCommit"] = resolved_commit

        result["phase"] = "upgrading"
        upgrade_with_compose(args, deploy_kind)
        result["phase"] = "verifying"
        verification = call_verify(args, mode)
        result["verification"] = verification

        result["ok"] = bool(verification.get("ok"))
        result["phase"] = "complete"
        emit(result, args.json)
        return 0 if result["ok"] else EXIT_VERIFY

    except SystemExit as e:
        result["phase"] = "bad-input"
        result["error"] = str(e)
        emit(result, args.json)
        return EXIT_BAD_INPUT
    except VerifyFailed as e:
        result["phase"] = "verification"
        result["verification"] = e.payload
        emit(result, args.json)
        return EXIT_VERIFY
    except RuntimeError as e:
        result["error"] = str(e)
        emit(result, args.json)
        if result["phase"] in {"pinning-repo", "upgrading"}:
            return EXIT_UPGRADE
        return EXIT_PREFLIGHT
    except Exception as e:
        result["phase"] = "internal-error"
        result["error"] = repr(e)
        emit(result, args.json)
        return EXIT_INTERNAL


if __name__ == "__main__":
    raise SystemExit(main())
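
One caveat on `resolve_deploy_kind()`: the substring check picks `build` whenever any service in the rendered config has a `build:` key, not just the requested one. A stricter variant could parse the output of `docker compose config --format json` and look only at the target service. Here's a sketch under that assumption; `deploy_kind_for` is a hypothetical name and it takes the already-parsed config as input:

```python
from __future__ import annotations

from typing import Any


def deploy_kind_for(config: dict[str, Any], service: str) -> str:
    """Pick "build" or "image" from a parsed Compose config for one service."""
    services = config.get("services") or {}
    if service not in services:
        raise RuntimeError(f"service not found in compose config: {service}")
    # Only the requested service's own build section counts.
    return "build" if services[service].get("build") else "image"
```

In the driver you could feed it from `run_json(["docker", "compose", "-f", args.compose_file, "config", "--format", "json"], cwd=args.compose_dir)`, assuming your Compose version supports that flag. Raising `RuntimeError` keeps it on the existing preflight exit path.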

scripts/verify.py

#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import subprocess
from typing import Any

EXIT_VERIFY = 40
EXIT_INTERNAL = 50


def run(cmd: list[str], *, check: bool = True) -> subprocess.CompletedProcess[str]:
    proc = subprocess.run(cmd, text=True, capture_output=True)
    if check and proc.returncode != 0:
        raise RuntimeError(
            f"command failed ({proc.returncode}): {' '.join(cmd)}\n"
            f"STDOUT:\n{proc.stdout}\n"
            f"STDERR:\n{proc.stderr}"
        )
    return proc

def run_json(cmd: list[str]) -> Any:
    proc = run(cmd)
    try:
        return json.loads(proc.stdout)
    except json.JSONDecodeError as e:
        raise RuntimeError(
            f"expected JSON from command but could not parse it: {' '.join(cmd)}\n"
            f"STDOUT:\n{proc.stdout}\n"
            f"STDERR:\n{proc.stderr}"
        ) from e


def emit(payload: dict[str, Any], as_json: bool) -> None:
    if as_json:
        print(json.dumps(payload, indent=2, sort_keys=True))
    else:
        print(payload)


def csv_set(raw: str | None) -> set[str]:
    if not raw:
        return set()
    return {item.strip() for item in raw.split(",") if item.strip()}


def openclaw_prefix(mode: str, container: str | None) -> list[str]:
    if mode == "host-cli":
        return ["openclaw"]
    if mode == "docker-exec":
        if not container:
            raise SystemExit("--container is required for docker-exec mode")
        return ["docker", "exec", container, "openclaw"]
    raise SystemExit(f"unsupported mode: {mode}")


def _extract_named_items(payload: Any) -> set[str]:
    found: set[str] = set()

    if isinstance(payload, dict):
        for key, value in payload.items():
            if key in {"id", "name", "pluginId", "jobId"} and isinstance(value, str) and value.strip():
                found.add(value.strip())
            found |= _extract_named_items(value)
    elif isinstance(payload, list):
        for item in payload:
            found |= _extract_named_items(item)

    return found


def normalize_plugin_ids(payload: Any) -> set[str]:
    if isinstance(payload, dict):
        if "plugins" in payload:
            return normalize_plugin_ids(payload["plugins"])
        if "items" in payload:
            return normalize_plugin_ids(payload["items"])

    if isinstance(payload, list):
        results: set[str] = set()
        for item in payload:
            if isinstance(item, dict):
                candidate = item.get("id") or item.get("name")
                if isinstance(candidate, str) and candidate.strip():
                    results.add(candidate.strip())
        if results:
            return results

    return _extract_named_items(payload)


def normalize_cron_ids(payload: Any) -> set[str]:
    if isinstance(payload, dict):
        if "jobs" in payload:
            return normalize_cron_ids(payload["jobs"])
        if "items" in payload:
            return normalize_cron_ids(payload["items"])

    if isinstance(payload, list):
        results: set[str] = set()
        for item in payload:
            if isinstance(item, dict):
                candidate = item.get("id") or item.get("name") or item.get("jobId")
                if isinstance(candidate, str) and candidate.strip():
                    results.add(candidate.strip())
        if results:
            return results

    return _extract_named_items(payload)


def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser()
    p.add_argument("--mode", choices=["host-cli", "docker-exec"], required=True)
    p.add_argument("--container")
    p.add_argument("--expected-plugins")
    p.add_argument("--expected-crons")
    p.add_argument("--json", action="store_true")
    return p.parse_args()


def main() -> int:
    args = parse_args()
    expected_plugins = csv_set(args.expected_plugins)
    expected_crons = csv_set(args.expected_crons)

    result: dict[str, Any] = {
        "ok": False,
        "mode": args.mode,
    }

    try:
        oc = openclaw_prefix(args.mode, args.container)

        version = run(oc + ["--version"]).stdout.strip()
        status_payload = run_json(oc + ["status", "--json"])
        gateway_status_payload = run_json(oc + ["gateway", "status", "--json"])
        plugins_payload = run_json(oc + ["plugins", "list", "--json"])
        cron_status_payload = run_json(oc + ["cron", "status", "--json"])
        cron_list_payload = run_json(oc + ["cron", "list", "--json"])

        actual_plugins = normalize_plugin_ids(plugins_payload)
        actual_crons = normalize_cron_ids(cron_list_payload)

        missing_plugins = sorted(expected_plugins - actual_plugins)
        missing_crons = sorted(expected_crons - actual_crons)

        plugins_ok = not missing_plugins
        crons_ok = not missing_crons

        result.update(
            {
                "version": version,
                # Each check is True by construction: run()/run_json() raise before this point on failure.
                "checks": {
                    "status": True,
                    "gatewayStatus": True,
                    "pluginsList": True,
                    "cronStatus": True,
                    "cronList": True,
                },
                "plugins": {
                    "ok": plugins_ok,
                    "expected": sorted(expected_plugins),
                    "actual": sorted(actual_plugins),
                    "missing": missing_plugins,
                },
                "cron": {
                    "ok": crons_ok,
                    "expected": sorted(expected_crons),
                    "actual": sorted(actual_crons),
                    "missing": missing_crons,
                },
                "raw": {
                    "status": status_payload,
                    "gatewayStatus": gateway_status_payload,
                    "plugins": plugins_payload,
                    "cronStatus": cron_status_payload,
                    "cronList": cron_list_payload,
                },
            }
        )

        result["ok"] = plugins_ok and crons_ok
        emit(result, args.json)
        return 0 if result["ok"] else EXIT_VERIFY

    except SystemExit:
        raise
    except Exception as e:
        result["error"] = repr(e)
        emit(result, args.json)
        return EXIT_INTERNAL


if __name__ == "__main__":
    raise SystemExit(main())

First thing I’d test


Before wiring the full skill, run verify.py directly once and inspect the JSON shapes:

python3 scripts/verify.py --mode host-cli --json

or:

python3 scripts/verify.py --mode docker-exec --container <your-container> --json

That’ll tell you whether you need to tighten the normalize_*() helpers.
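
To make that inspection quicker, a small helper can summarize the structure of a payload without dumping every value. This is a hypothetical debugging aid, not part of the bundle; `describe_shape` is a name introduced here:

```python
from __future__ import annotations

from typing import Any


def describe_shape(payload: Any, depth: int = 4) -> Any:
    """Return a compact summary of a JSON value: keys and type names, not data."""
    if depth == 0:
        return "..."
    if isinstance(payload, dict):
        return {key: describe_shape(value, depth - 1) for key, value in payload.items()}
    if isinstance(payload, list):
        # Summarize a list by its length and the shape of its first element.
        first = describe_shape(payload[0], depth - 1) if payload else None
        return [f"{len(payload)} item(s)", first]
    return type(payload).__name__
```

Point it at the `raw.plugins` and `raw.cronList` blocks of the verify.py output and you can see at a glance which keys actually carry the IDs.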

If you want, next I can give you a tiny README-style test checklist for validating the skill end to end before you hand it to Ditto.

winged briar
#

sure

sonic deltaBOT
#

Nice. Here’s a tiny end-to-end test checklist I’d use before handing it to Ditto.

1. Basic file sanity

python3 -m py_compile scripts/upgrade-driver.py scripts/verify.py

2. Verify the skill is discoverable

Put it in ~/.openclaw/skills/openclaw-upgrade/, start a fresh session or restart the gateway, then run:

openclaw skills list --eligible --verbose

You want to see:

  • openclaw-upgrade listed
  • no missing-bin surprises
  • no metadata/frontmatter issues

3. Test verify.py by itself first

Host CLI mode:

python3 scripts/verify.py --mode host-cli --json

Container mode:

python3 scripts/verify.py --mode docker-exec --container <container-name> --json

Check:

  • JSON parses
  • version is present
  • plugins.actual looks sane
  • cron.actual looks sane

If actual is empty or weird, tighten normalize_plugin_ids() / normalize_cron_ids().

4. Test expected-set matching

python3 scripts/verify.py \
  --mode host-cli \
  --expected-plugins browser,discord \
  --expected-crons nightly-summary \
  --json

Check both:

  • happy path with real IDs
  • failing path with one fake plugin/job

You want missing items to show up clearly.

5. Test upgrade-driver.py on a safe target

Use a staging or no-risk deployment first.

python3 scripts/upgrade-driver.py \
  --mode auto \
  --deploy-kind auto \
  --compose-dir <compose-dir> \
  --compose-file docker-compose.yml \
  --service <service> \
  --repo-dir <repo-dir> \
  --target-ref <safe-ref> \
  --container <container-name> \
  --json

Check:

  • resolvedCommit is correct
  • mode resolves the way you expect
  • verification.ok propagates correctly
  • non-zero exit on failure
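
For the exit-code check, the constants defined in upgrade-driver.py (10, 20, 30, 40, 50) can be mapped back to labels so a wrapper or an agent can report which stage failed. A minimal sketch; `describe_exit` and `EXIT_NAMES` are hypothetical helpers, not part of the bundle:

```python
# Exit codes copied from upgrade-driver.py in the bundle above.
EXIT_NAMES = {
    0: "success",
    2: "usage-error",  # argparse exits with 2; parse_args() runs outside the try block
    10: "bad-input",
    20: "preflight",
    30: "upgrade",
    40: "verify",
    50: "internal",
}


def describe_exit(code: int) -> str:
    """Translate a driver exit code into a short label for reports."""
    return EXIT_NAMES.get(code, f"unknown ({code})")
```

Keeping the mapping in one place means the final report can say "failed during verify" rather than just "exit 40".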

6. Test skill invocation from OpenClaw chat

In a fresh session, ask something like:

use the openclaw-upgrade skill to verify the deployment state only

You want to confirm:

  • the agent notices the skill
  • it runs scripts/verify.py for this verification-only request (upgrade-driver.py always pins the repo and upgrades)
  • it does not improvise a totally different workflow

7. Test failure behavior on purpose

Do one controlled failure:

  • wrong container name
  • fake cron ID
  • fake plugin ID

You want:

  • clean non-zero exit
  • concise failure output
  • no false “success”
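
Those three expectations can be checked mechanically. A minimal sketch, assuming you've captured the exit code and stdout of a run made with `--json`; `check_failure_contract` is a hypothetical helper:

```python
from __future__ import annotations

import json


def check_failure_contract(returncode: int, stdout: str) -> list[str]:
    """Return a list of contract violations for a run that was expected to fail."""
    problems: list[str] = []
    if returncode == 0:
        problems.append("exit code was 0 on a run that should have failed")
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        problems.append("stdout was not valid JSON")
        return problems
    if not isinstance(payload, dict):
        problems.append("JSON output was not an object")
    elif payload.get("ok") is not False:
        problems.append("payload did not report ok=false")
    return problems
```

An empty list means the failure was reported the way the skill promises. Note that a missing `--container` in docker-exec mode exits through `SystemExit` without JSON, so that case would show up here as a violation.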

8. Test the Ditto/HostOps path

Make sure the agent config explicitly exposes the skill if needed:

agents: {
  list: [
    {
      id: "hostops",
      skills: ["openclaw-upgrade"]
    }
  ]
}

Then start a fresh HostOps session and verify it can see/use the skill.


My recommendation

Do the tests in this order:

  1. py_compile
  2. verify.py
  3. upgrade-driver.py
  4. chat invocation
  5. Ditto

That catches the boring bugs early, which is exactly what you want for ops automation.
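
If you'd rather enforce that ordering than remember it, a tiny runner can stop at the first failing stage. A minimal sketch; `run_in_order` is a hypothetical helper, and the commands you pass in are the ones from the checklist above:

```python
from __future__ import annotations

import subprocess


def run_in_order(steps: list[tuple[str, list[str]]]) -> str | None:
    """Run each (name, command) in order; return the first failing name, or None."""
    for name, cmd in steps:
        proc = subprocess.run(cmd, text=True, capture_output=True)
        if proc.returncode != 0:
            # Stop here so later, riskier stages never run after a failure.
            return name
    return None
```

For example, pass `("py_compile", ["python3", "-m", "py_compile", "scripts/upgrade-driver.py", "scripts/verify.py"])` as the first step and the verify.py command as the second.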

If you want, I can do one last pass and turn this into a staging-ready checklist message you can save next to the skill as references/test-plan.md.

winged briar
#

sure

sonic deltaBOT
#

Nice. Here’s a references/test-plan.md you can drop straight into the skill.

# OpenClaw Upgrade Skill Test Plan

Use this checklist before allowing the skill to run unattended via HostOps / Ditto.

## Goal

Verify that the `openclaw-upgrade` skill:

- is discovered by OpenClaw
- invokes the expected script entrypoint
- upgrades only the intended deployment
- verifies plugin and cron state correctly
- fails safely and clearly
- works for both interactive and autonomous consumers

---

## Preconditions

Before testing, confirm:

- the skill exists at `~/.openclaw/skills/openclaw-upgrade/`
- `SKILL.md` is present
- `scripts/upgrade-driver.py` is present
- `scripts/verify.py` is present
- Python 3 is available
- Docker is available
- the target environment is staging or otherwise safe to modify

---

## Phase 1: Script sanity

### 1. Python syntax check

Run:

```bash
python3 -m py_compile scripts/upgrade-driver.py scripts/verify.py
```

Pass criteria:

- both scripts compile successfully
- no syntax errors

### 2. Standalone verification, host CLI mode

Run:

```bash
python3 scripts/verify.py --mode host-cli --json
```

Pass criteria:

- command exits successfully
- JSON is emitted
- `version` is present
- `plugins.actual` is populated sensibly
- `cron.actual` is populated sensibly

### 3. Standalone verification, container mode

Run:

```bash
python3 scripts/verify.py --mode docker-exec --container <container-name> --json
```

Pass criteria:

- command exits successfully
- JSON is emitted
- output is materially similar to host CLI mode, if both paths are expected to work

### 4. Expected-set validation

Run:

```bash
python3 scripts/verify.py \
  --mode host-cli \
  --expected-plugins <known-good-plugin-list> \
  --expected-crons <known-good-cron-list> \
  --json
```

Pass criteria:

- expected items are marked present
- no false missing entries

### 5. Controlled failure validation

Run verification with one fake plugin ID and one fake cron ID.

Pass criteria:

- script exits non-zero
- missing items are listed clearly
- result does not report success
---

## Phase 2: Upgrade-driver behavior

### 6. Preflight and JSON contract

Run:

```bash
python3 scripts/upgrade-driver.py \
  --mode auto \
  --deploy-kind auto \
  --compose-dir <compose-dir> \
  --compose-file docker-compose.yml \
  --service <service> \
  --repo-dir <repo-dir> \
  --target-ref <safe-ref> \
  --container <container-name> \
  --json
```

Pass criteria:

- JSON result is emitted
- `mode` is resolved as expected
- `deployKind` is resolved as expected
- `resolvedCommit` is present
- `verification` block is present
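
The contract above is easy to assert in a few lines. This sketch assumes the driver prints a single JSON object with the four keys named in the pass criteria; treating a leftover `"auto"` as unresolved is my reading of "resolved as expected", and the sample values are placeholders.

```python
import json

# Keys named in the pass criteria; their value types are assumptions.
REQUIRED_KEYS = ("mode", "deployKind", "resolvedCommit", "verification")


def contract_problems(raw: str) -> list[str]:
    """List contract violations in the driver's JSON output (assumed to be an object)."""
    data = json.loads(raw)
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if data.get(key) is None]
    for key in ("mode", "deployKind"):
        # With --mode auto / --deploy-kind auto, a resolved value should replace "auto".
        if data.get(key) == "auto":
            problems.append(f"{key} is still 'auto', so it was not resolved")
    return problems


sample = json.dumps({
    "mode": "docker-exec",
    "deployKind": "compose",      # hypothetical value
    "resolvedCommit": "abc1234",  # placeholder commit
    "verification": {"ok": True},
})
print(contract_problems(sample))  # prints []
```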

### 7. Safe upgrade on staging

Run against a staging deployment or low-risk target.

Pass criteria:

- only the intended service is upgraded
- repo is pinned to the requested ref
- post-upgrade verification runs automatically
- final status is correct

### 8. Controlled upgrade failure

Induce one safe failure, for example:
- invalid container name
- invalid compose service
- invalid repo path
- a fake cron ID in the expected set

Pass criteria:

- script exits non-zero
- failure phase is clear
- no false success
- output includes enough detail for debugging
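
To make "failure phase is clear" and "no false success" concrete, here is a sketch of a phase runner that stops at the first error and records where it happened. It is not how `upgrade-driver.py` is actually written; the phase names and the `failedPhase`, `error`, and `completed` keys are hypothetical.

```python
import json
from typing import Callable


def run_phases(phases: list[tuple[str, Callable[[], None]]]) -> dict:
    """Run phases in order and stop at the first one that raises."""
    completed = []
    for name, action in phases:
        try:
            action()
        except Exception as exc:
            # Record the failing phase so the result can never read as a success.
            return {"ok": False, "failedPhase": name, "error": str(exc), "completed": completed}
        completed.append(name)
    return {"ok": True, "failedPhase": None, "error": None, "completed": completed}


def fail_rebuild() -> None:
    """Simulate the 'invalid compose service' case from the list above."""
    raise RuntimeError("no such service: bogus")


result = run_phases([
    ("pull", lambda: None),
    ("rebuild", fail_rebuild),
    ("verify", lambda: None),
])
exit_status = 0 if result["ok"] else 1
print(json.dumps(result), exit_status)
```

Here the run stops at `rebuild`, `verify` never executes, and the exit status is 1.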

---

## Phase 3: Skill loading and agent behavior

### 9. Skill discovery

Start a fresh session or restart the gateway, then run:

```bash
openclaw skills list --eligible --verbose
```

Pass criteria:

- `openclaw-upgrade` appears
- no missing-bin surprises
- no frontmatter or metadata parsing issues

### 10. Interactive invocation test

In a fresh OpenClaw chat session, ask the agent to use the skill for verification-only work first.

Example prompt:

> use the openclaw-upgrade skill to verify the current deployment state only

Pass criteria:

- the agent notices and uses the skill
- it invokes `upgrade-driver.py`
- it does not improvise a different workflow unnecessarily

### 11. HostOps / Ditto visibility test

If using agent allowlists, confirm the HostOps agent exposes the skill.

Example config shape:

```json5
agents: {
  list: [
    {
      id: "hostops",
      skills: ["openclaw-upgrade"]
    }
  ]
}
```

Pass criteria:

- HostOps can see the skill
- HostOps can invoke it
- the skill behaves the same as in the interactive path

---

## Release criteria

Do not allow autonomous use until all of the following are true:

- scripts compile
- `verify.py` passes in the intended mode
- expected plugin/cron matching works
- controlled failure path behaves safely
- staging upgrade succeeds
- skill is discoverable by OpenClaw
- interactive invocation works
- HostOps / Ditto can access the skill
- final output is concise and operationally useful

---

## Recommended test order

1. syntax check
2. standalone `verify.py`
3. expected-set matching
4. controlled verification failure
5. standalone `upgrade-driver.py`
6. staging upgrade
7. skill discovery
8. interactive invocation
9. HostOps / Ditto invocation

---

## Notes

- Prefer testing `verify.py` before any real upgrade action.
- Prefer staging before production.
- If plugin or cron normalization is wrong, fix `normalize_plugin_ids()` and `normalize_cron_ids()` before testing autonomous flows.
- If mode detection is ambiguous, prefer explicit `--mode` and `--deploy-kind` during early testing.
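
On the normalization point: since the real JSON shape of the list commands has not been confirmed in this thread, a tolerant normalizer is the safer starting point. The sketch below is a guess at what `normalize_plugin_ids()` and `normalize_cron_ids()` could delegate to; the shared `normalize_ids` helper and the `id` / `name` keys are assumptions.

```python
def normalize_ids(payload) -> list[str]:
    """Reduce a parsed --json payload to a sorted, de-duplicated list of ID strings.

    Accepts a list of strings or a list of objects carrying an "id" key
    (falling back to "name"). Both shapes are assumptions.
    """
    if payload is None:
        return []
    if not isinstance(payload, list):
        # Fail loudly instead of silently iterating over dict keys.
        raise TypeError(f"expected a list, got {type(payload).__name__}")
    ids = set()
    for entry in payload:
        if isinstance(entry, str):
            value = entry
        elif isinstance(entry, dict):
            value = entry.get("id") or entry.get("name")
        else:
            value = None
        if isinstance(value, str) and value.strip():
            ids.add(value.strip())
    return sorted(ids)


# Mixed, messy input with placeholder IDs.
print(normalize_ids(["b", {"id": "a"}, {"name": "c"}, " b ", 5, {}]))
# prints ['a', 'b', 'c']
```

Raising on an unexpected top-level type is deliberate: a wrapped payload such as an object with a nested list should surface as an error during testing, not as an empty or wrong ID set.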

If you want, I can also do one final cleanup pass and produce a leaner v1 bundle with fewer moving parts, which might be smarter for the first real deployment.

winged briar
#

ok

sonic deltaBOT
#

Sounds good 👍

If you hit anything weird while wiring it up, especially the JSON shape from `openclaw plugins list --json` or `openclaw cron list --json`, paste a redacted sample and I can help tighten the parser.

pseudo shaleBOT