Openclaw reads YouTube transcript | Friends of the Crustacean 🦞🤝 | Page 1

copper flame BOT Mar 10, 2026, 7:54 AM

To help others find answers, you can mark your question as solved: right-click the solution message -> Apps -> ✅ Mark Solution

tepid leaf BOT Mar 10, 2026, 7:57 AM

OpenClaw doesn’t have a built-in “YouTube transcript” tool, and web_fetch usually won’t work on youtube.com/watch… because YouTube is JS-heavy (and often anti-bot). The reliable ways are:

Option A (most reliable): use the browser tool and “Show transcript”

1. Enable/allow the browser tool for the agent.
2. Either:
   - use OpenClaw’s managed browser, or
   - use the Chrome extension (browser relay) so it can drive your logged-in YouTube tab (best if YouTube nags about consent/login).
3. Ask your agent: “Open this video, click Show transcript, copy the transcript text, then summarize it.”

Docs:

Browser tool: https://docs.openclaw.ai/tools/browser
Chrome extension: https://docs.openclaw.ai/tools/chrome-extension

Option B (fast + automation-friendly): `yt-dlp` via exec (captions/subtitles)

If the video has captions/auto-captions, install yt-dlp (+ often ffmpeg) on the machine where exec runs, then have the agent run something like:

download subtitles (vtt) → convert to plain text → summarize.
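
As a rough sketch of that pipeline in Python (not an official OpenClaw skill): `subtitle_command` builds a yt-dlp invocation from its documented subtitle flags, and `vtt_to_text` flattens the downloaded WebVTT into plain text ready for summarizing. The function names, the `transcript.%(ext)s` output template, and the duplicate-line heuristic for rolling auto-captions are assumptions made for illustration.

```python
import re

def subtitle_command(url: str, lang: str = "en") -> list[str]:
    """Build a yt-dlp command that fetches captions only (no video)."""
    return [
        "yt-dlp",
        "--skip-download",       # do not download the video itself
        "--write-subs",          # uploader-provided captions, if any
        "--write-auto-subs",     # auto-generated captions as a fallback
        "--sub-langs", lang,
        "--sub-format", "vtt",
        "-o", "transcript.%(ext)s",  # assumed output template
        url,
    ]

# Cue timing lines look like "00:00:01.000 --> 00:00:04.000 align:start ..."
TIMING = re.compile(r"^\d{2,}:\d{2}(:\d{2})?\.\d{3}\s+-->\s+")
# Inline cue tags such as <c> or <00:00:01.500>
TAG = re.compile(r"<[^>]+>")
HEADER_PREFIXES = ("WEBVTT", "NOTE", "Kind:", "Language:")

def vtt_to_text(vtt: str) -> str:
    """Strip headers, timings, and tags; drop consecutive repeated lines."""
    lines: list[str] = []
    for raw in vtt.splitlines():
        line = TAG.sub("", raw).strip()
        if not line or line.startswith(HEADER_PREFIXES):
            continue
        if TIMING.match(line):
            continue
        if lines and lines[-1] == line:
            continue  # auto-captions often repeat the previous line
        lines.append(line)
    return " ".join(lines)
```

The agent would run the command via exec (for example with `subprocess.run(subtitle_command(url), check=True)`), read the resulting `.vtt` file, and pass the output of `vtt_to_text` to the summarize step. Multi-line `NOTE` blocks are not handled here; only their first line is skipped.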

Docs:

Exec tool: https://docs.openclaw.ai/tools/exec
Creating skills (so your agent “knows” this workflow): https://docs.openclaw.ai/tools/creating-skills

Option C: no captions available → you need speech-to-text

If the video has no transcript/captions, you’ll need to download audio and run STT (Whisper/etc). That’s a bigger pipeline.

If you tell me which setup you’re on (Gateway local vs VPS, and whether you’re okay installing yt-dlp), I’ll recommend the cleanest path and a copy-paste “skill” instruction for it.

deep kelp Mar 12, 2026, 3:39 PM

I've been getting blocked lately on yt-dlp via the summarize skill while running OpenClaw in a VPS environment. It looks like YouTube has increased its anti-bot protections, so the exec version of this will be difficult but not impossible.

tepid leaf BOT Mar 12, 2026, 3:41 PM

Yeah, that matches what a bunch of folks are seeing: datacenter/VPS IP ranges get hit with much stricter YouTube bot checks, so yt-dlp that was “fine last month” suddenly starts failing.

The clean OpenClaw-ish way around this (without turning it into an anti-bot arms race) is:

1) Keep the Gateway on the VPS, but run the “transcript fetch” on a node at home

Run a node host on a machine with a normal residential IP (or wherever YouTube works reliably), then have your summarize workflow use:

- exec host=node for yt-dlp, and/or
- browser profile=chrome (extension relay) to click “Show transcript” and copy it.

Docs:

Node host (“system.run” / host=node): https://docs.openclaw.ai/nodes
Exec tool host=node + binding a node: https://docs.openclaw.ai/tools/exec
Chrome extension relay: https://docs.openclaw.ai/tools/chrome-extension

This tends to solve most of the “VPS got blocked” pain immediately, because you’re no longer scraping from a flagged IP range.

2) Add a fallback path in the skill: Browser transcript > yt-dlp > (give up)

In practice:

1. Try browser transcript extraction first (most “human-like”, also handles consent/login prompts).
2. If that fails and captions exist, try yt-dlp.
3. If both fail, return: “No transcript available / blocked” (and optionally ask the user to attach a Chrome tab).
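
The fallback order above could be sketched in Python roughly like this. It is only an illustration of the control flow: the `Fetcher` type and `fetch_transcript` are hypothetical names, and the actual browser and yt-dlp steps would be whatever callables your skill wires in.

```python
from typing import Callable, Optional

# A fetcher takes a video URL and returns transcript text, or None if it
# could not get one. Real fetchers (browser, yt-dlp) are assumed, not shown.
Fetcher = Callable[[str], Optional[str]]

FAILURE_MESSAGE = "No transcript available / blocked"

def fetch_transcript(url: str, fetchers: list[tuple[str, Fetcher]]) -> tuple[str, str]:
    """Try each (name, fetcher) in order; return (source_name, text)."""
    for name, fetch in fetchers:
        try:
            text = fetch(url)
        except Exception:
            continue  # treat any error as "this path failed", move on
        if text and text.strip():
            return name, text
    return "none", FAILURE_MESSAGE
```

Pass the fetchers in priority order, for example `[("browser", browser_fetch), ("yt-dlp", ytdlp_fetch)]`, where both callables are placeholders for your own implementations.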

3) If you must stay VPS-only

You’re basically stuck in “cat and mouse” territory. You can sometimes improve reliability with authenticated/session-based access, but I’d strongly recommend the node-host approach above instead—it’s simpler and more robust.

If you want, paste the specific yt-dlp error output you’re seeing (403? sign-in required? “bot detection”?), and I can suggest what the least gross fallback should be for that failure mode inside the skill.
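
To make the failure mode explicit inside a skill, one option is to bucket the captured yt-dlp stderr before choosing a fallback. The sketch below is an assumption-heavy illustration: the function name, the bucket labels, and the matched substrings are mine, and yt-dlp's exact error wording can change between versions.

```python
def classify_ytdlp_error(stderr: str) -> str:
    """Map yt-dlp error text to a coarse failure bucket (heuristic only)."""
    text = stderr.lower()
    # Check the bot-check wording first: it can also contain "sign in".
    if "not a bot" in text or "bot detection" in text:
        return "bot_check"
    if "sign in" in text or "login required" in text:
        return "sign_in_required"
    if "403" in text:
        return "http_403"
    return "unknown"
```

A skill could then route `bot_check` and `http_403` to the node-host or browser path, and surface `sign_in_required` to the user instead of retrying.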

inner obsidian Apr 28, 2026, 9:28 AM

tepid leaf BOT Apr 28, 2026, 9:29 AM

If you mean “same thing, but for Reddit”: yep — usually easier than YouTube.

Best fallback order:

1. web_fetch first for public Reddit URLs.
2. If Reddit serves a JS/login/blocked page, use browser instead.
3. For private/quarantined/login-only subreddits, the user needs to open it in their own logged-in browser session; don’t try to bypass access.

Reusable skill shape:

When asked to summarize a Reddit URL:
- Try web_fetch first.
- If comments/post body are missing, use browser.
- Summarize the OP, top comment themes, disagreements, and notable links.
- If access is blocked/login-only, say so clearly.
