#How do I get my agent to join google meets
Short answer: no, you do not need Twilio for normal Google Meet joining. Use Chrome/chrome-node for a real Meet participant. Twilio is only for phone dial-in when the Meet has a dial-in number + PIN/DTMF. (docs.openclaw.ai)
Step-by-step for the normal Chrome path:
- On the machine that will run Chrome/audio — usually your Mac, or a Mac VM/node:
brew install blackhole-2ch sox
sudo reboot
After reboot:
system_profiler SPAudioDataType | grep -i BlackHole
command -v rec play
- On the Gateway host, make sure you have a realtime voice provider key:
export OPENAI_API_KEY=sk-...
# or
export GEMINI_API_KEY=...
- Enable the Google Meet plugin in ~/.openclaw/openclaw.json:
{
  plugins: {
    entries: {
      "google-meet": {
        enabled: true,
        config: {
          defaultTransport: "chrome",
          defaultMode: "realtime"
        }
      }
    }
  }
}
Then restart/reload the gateway.
- Preflight it:
openclaw googlemeet setup --transport chrome
Treat any ok: false as a blocker.
- Join/test a real meeting:
openclaw googlemeet test-speech https://meet.google.com/abc-defg-hij \
--transport chrome \
--message "Say exactly: Google Meet speech test complete."
If that works, your agent can join with:
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport chrome \
--mode realtime
Or ask your agent: “Join this Google Meet with realtime voice: …”
If your Gateway is on Linux/VPS and Chrome is on another Mac/VM, use chrome-node instead: install BlackHole/SoX + Chrome on the node, run openclaw plugins enable google-meet and openclaw plugins enable browser there, connect/approve the node, then set defaultTransport: "chrome-node" and chromeNode.node: "<node-name>".
OAuth is not required just to join. It’s mainly for official Meet API actions like creating/resolving Meet spaces. Chrome join just needs a usable Chrome profile/browser path.
Docs: https://docs.openclaw.ai/plugins/google-meet
If it fails, paste:
openclaw --version
openclaw googlemeet setup --transport chrome --json
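The "treat any ok: false as a blocker" rule can be checked mechanically against the --json output. Here is a minimal Python sketch, assuming the output is JSON in which each check carries an "ok" field. Only the "ok" name comes from the advice above; the rest of the sample layout ("checks", "id", "message") and the helper name find_blockers are hypothetical, so adapt them to what your version actually prints.

```python
import json


def find_blockers(node, path="$"):
    """Return the path of every JSON object whose "ok" field is exactly false."""
    blockers = []
    if isinstance(node, dict):
        if node.get("ok") is False:
            blockers.append(path)
        for key, value in node.items():
            blockers.extend(find_blockers(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            blockers.extend(find_blockers(value, f"{path}[{index}]"))
    return blockers


# Hypothetical sample; the real --json layout may differ.
sample = json.loads("""
{"ok": false,
 "checks": [
   {"id": "chrome", "ok": true},
   {"id": "audio-bridge", "ok": false, "message": "BlackHole not found"}
 ]}
""")

print(find_blockers(sample))  # ['$', '$.checks[1]']
```

Because the walk is recursive, it does not depend on where the failing check sits in the output. An empty list means nothing reported ok: false.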
So if my gateway is headless will it work?
Yes — a headless Gateway can work, but the Meet Chrome/audio part cannot be purely headless in the normal Chrome path.
Think of it like this:
- Gateway/headless VPS: runs OpenClaw, agent, model keys, realtime provider.
- Chrome host/node: runs Google Chrome + audio bridge. This should be a Mac/desktop/VM with Chrome, SoX, and BlackHole.
So on a headless Gateway, use chrome-node, not local chrome.
Minimal shape:
{
  gateway: {
    nodes: {
      allowCommands: ["googlemeet.chrome", "browser.proxy"]
    }
  },
  plugins: {
    entries: {
      "google-meet": {
        enabled: true,
        config: {
          defaultTransport: "chrome-node",
          defaultMode: "realtime",
          chromeNode: {
            node: "your-node-name"
          }
        }
      }
    }
  }
}
On the Chrome node machine:
brew install blackhole-2ch sox
openclaw plugins enable google-meet
openclaw plugins enable browser
openclaw node run --host <gateway-host> --port 18789 --display-name your-node-name
Then from the Gateway host:
openclaw devices list
openclaw devices approve <requestId>
openclaw nodes status
openclaw googlemeet setup --transport chrome-node
openclaw googlemeet test-speech https://meet.google.com/abc-defg-hij --transport chrome-node
Twilio is the alternative for fully headless operation, but only if the Meet exposes a phone dial-in number/PIN. That joins as a phone call, not a browser Meet participant.
Docs: https://docs.openclaw.ai/plugins/google-meet
If you want, paste openclaw --version + where your Gateway is running, and I can map the exact setup.
So I'd have to have my gateway connected to another device that can open a Chrome browser?
Yep — for the normal Google Meet browser participant path, yes.
Your setup would be:
Headless Gateway / VPS
  runs: OpenClaw gateway, agent, model keys, realtime voice
Desktop/Mac/VM node
  runs: Chrome, signed-in Google profile, BlackHole, SoX
  connects back to Gateway as an OpenClaw node
Then the agent tells the node: “open this Meet URL in Chrome, join, route audio through the realtime bridge.”
So you have three practical options:
- Best/normal: headless Gateway + separate Chrome-capable node (chrome-node)
- Simpler if available: run Gateway on the same desktop/Mac that can open Chrome (chrome)
- Fully headless fallback: Twilio phone dial-in, but only if that Meet has a dial-in number/PIN
So yes: if your Gateway is truly headless and you want the agent to join as a Google Meet web participant, you need another connected device/VM that can run Chrome.
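The three options reduce to a small decision rule. The Python sketch below is only an illustration of that reasoning: the function name pick_transport and its three boolean inputs are hypothetical, and only the transport names "chrome", "chrome-node", and "twilio" come from the thread.

```python
from typing import Optional


def pick_transport(gateway_can_run_chrome: bool,
                   has_chrome_node: bool,
                   meet_has_dial_in: bool) -> Optional[str]:
    """Choose a Google Meet transport from what the setup can offer."""
    if gateway_can_run_chrome:
        # Gateway runs on a desktop/Mac that can open Chrome itself.
        return "chrome"
    if has_chrome_node:
        # Headless Gateway with a separate Chrome-capable node attached.
        return "chrome-node"
    if meet_has_dial_in:
        # Fully headless fallback: join as a phone participant.
        return "twilio"
    # Headless, no node, no dial-in details: no way into the Meet.
    return None


print(pick_transport(False, True, False))   # chrome-node
print(pick_transport(False, False, True))   # twilio
print(pick_transport(False, False, False))  # None
```

The order of the checks reflects a preference for a real browser participant over a phone leg whenever one is available.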
so is it best to use ubuntu desktop
I wouldn’t call Ubuntu Desktop the best path right now.
The best-supported/easiest path in the docs is currently:
Headless Gateway/VPS
+ macOS Chrome node/VM
  - Google Chrome
  - signed-in Google profile
  - BlackHole 2ch
  - SoX
Reason: the Google Meet plugin’s realtime Chrome audio path is built around BlackHole 2ch, which is a macOS virtual audio device. So a Mac, Mac mini, or Parallels macOS VM is the smoother route.
Ubuntu Desktop might be possible in theory if you build a working virtual audio setup with PipeWire/PulseAudio and custom audio commands, but that’s more experimental/DIY and not the path I’d recommend unless you specifically want to debug audio plumbing.
So my recommendation:
- If you have a Mac available: use that as the chrome-node.
- If your Gateway is on a VPS: keep it there, connect the Mac as a node.
- If you only have Linux/headless: Twilio dial-in is probably simpler, but only works for Meets with phone dial-in.
I don't own a Mac. I have a Windows device and a spare device that runs Ubuntu Desktop.
At the moment my OpenClaw runs on a VPS, though.
With only Windows + Ubuntu Desktop, I’d avoid the Chrome-node route for now.
Current Google Meet browser joining is effectively macOS Chrome-node focused because the realtime audio bridge expects BlackHole 2ch, which is macOS-only. So your Ubuntu Desktop device is not the smooth supported replacement for a Mac here.
Your realistic options:
- Use Twilio
  Best fit for your hardware. This joins via the Meet’s phone dial-in number/PIN, not Chrome. Works only if that Meet exposes dial-in details.
- Get access to a Mac/Mac mini/macOS node
  Then use the normal chrome-node setup.
- Experimental DIY on Ubuntu
  Maybe possible someday with custom PipeWire/PulseAudio audio commands, but I would not recommend it unless you’re ready to debug audio/browser plumbing.
So in your case: Twilio is probably the practical path, unless you can add a Mac as the Chrome node.
Got it — with OpenClaw on a VPS, your best practical setup is:
VPS OpenClaw Gateway
→ Twilio dial-in transport
→ Google Meet phone participant
Because the VPS can’t realistically run the supported Chrome/audio path by itself.
So you do not need another device if you use Twilio, but the meeting must provide dial-in details.
Rough setup:
{
  plugins: {
    allow: ["google-meet", "voice-call"],
    entries: {
      "google-meet": {
        enabled: true,
        config: {
          defaultTransport: "twilio"
        }
      },
      "voice-call": {
        enabled: true,
        config: {
          provider: "twilio"
        }
      }
    }
  }
}
Set Twilio creds on the VPS:
export TWILIO_ACCOUNT_SID=AC...
export TWILIO_AUTH_TOKEN=...
export TWILIO_FROM_NUMBER=+15550001234
Restart the gateway, then check:
openclaw googlemeet setup --transport twilio
Join with:
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport twilio \
--dial-in-number +15551234567 \
--pin 123456
If the Meet invite doesn’t include a dial-in number/PIN, Twilio won’t help. Then you’d need a Mac/macOS VM as a chrome-node.
Also: can you run/paste this first?
openclaw --version
Just to make sure you’re on a version with the current Google Meet plugin behavior.
how much does twilio cost
Assuming the Meet dial-in number is US/Canada, it’s pretty cheap:
- Outbound call: about $0.014/min
- Twilio local number: about $1.15/month
- So a 1-hour Meet dial-in is roughly $0.84 in Twilio call minutes, plus the monthly number cost.
Twilio lists US/Canada “make calls” at $0.0140/min, and local numbers at $1.15/mo. Rates vary if the Meet dial-in number is in another country. (twilio.com)
Important: that’s only the Twilio phone leg. Your realtime AI/model usage is separate.
im in the uk
For the UK, if the Google Meet gives you a UK dial-in number, Twilio’s current UK voice pricing is roughly:
- Call UK local/landline dial-in: $0.0158/min
- Call UK mobile: $0.0305/min
- Twilio UK local number: $3.50/month
- Twilio UK mobile number: $2.50/month (twilio.com)
So a 1-hour Meet via UK local dial-in would be about:
60 × $0.0158 = $0.948
So basically ~$0.95 per hour, plus the small monthly Twilio number rental. If the Meet dial-in is mobile/special/premium, it can be higher, so check the number type before relying on it. Twilio says outbound price varies by destination. (twilio.com)
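The per-hour figures are just minutes times the per-minute rate, with the number rental added once per month. Here is a small Python sketch of that arithmetic using the rates quoted in this thread. The helper names and the "4 hours a month" usage figure are made up for illustration, and Twilio's actual prices may have changed.

```python
def call_cost(minutes: float, rate_per_min: float) -> float:
    """Cost in USD of the Twilio phone leg for one call."""
    return round(minutes * rate_per_min, 4)


def monthly_cost(hours_per_month: float, rate_per_min: float,
                 number_rental: float) -> float:
    """Call minutes for the month plus the fixed monthly number rental."""
    return round(call_cost(hours_per_month * 60, rate_per_min) + number_rental, 4)


# Rates as quoted in this thread (USD).
US_RATE, US_NUMBER = 0.0140, 1.15
UK_LOCAL_RATE, UK_LOCAL_NUMBER = 0.0158, 3.50

print(call_cost(60, US_RATE))                            # 0.84
print(call_cost(60, UK_LOCAL_RATE))                      # 0.948
# Hypothetical usage: four hours of UK local dial-in per month.
print(monthly_cost(4, UK_LOCAL_RATE, UK_LOCAL_NUMBER))   # 7.292
```

This covers only the Twilio phone leg; realtime model usage is billed separately by the model provider.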
What about if the agent is just transcribing the google meet?
“Just transcribing” has two different meanings here:
1. Live transcribing while the meeting is happening
mode: "transcribe" still needs a participant path into the Meet.
So on your VPS-only setup, it doesn’t magically avoid Chrome/Twilio. It just means:
join/control Meet browser
but do NOT start realtime voice talk-back
So it still needs Chrome/chrome-node for browser joining — currently best-supported on macOS because of the BlackHole audio path — or Twilio if joining by phone.
Example:
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport chrome-node \
--mode transcribe
But that still needs a working Chrome node.
2. Getting the transcript after the meeting
This is more VPS-friendly.
If Google Meet itself generated a transcript/artifacts, OpenClaw can fetch/export those via Google Meet/Drive APIs after the call. That does not require Chrome/Twilio participation.
Commands look like:
openclaw googlemeet artifacts --meeting https://meet.google.com/abc-defg-hij
openclaw googlemeet export --meeting https://meet.google.com/abc-defg-hij --output ./meet-export
But the catch: Google must actually have a transcript/artifact available, and your Google OAuth account must have access to it.
So for your VPS:
- Live listening/transcribing: likely Twilio is your practical route.
- After-meeting transcript export: possible with Google OAuth, no Chrome node needed.
- Browser participant transcription: needs a Chrome-capable node, ideally macOS right now.
yeah so if im using Twilio
Yep — if you use Twilio, your VPS setup is the right shape.
Twilio path = OpenClaw dials the Google Meet’s phone dial-in number, enters the PIN/DTMF, and the agent participates over the phone leg.
Key requirements:
- Google Meet invite must include phone dial-in details
  - UK/US dial-in number etc.
  - PIN / meeting code
  - OpenClaw does not scrape the number from the Meet page; you pass it in.
- Voice Call plugin on the VPS
  openclaw plugins install @openclaw/voice-call
- Public webhook URL
  Twilio must be able to reach your VPS over HTTPS for call events/audio. So configure publicUrl, ngrok, Tailscale funnel, or your own reverse proxy.
- Pick one audio mode
  - If you want the agent to talk/listen live: use realtime.enabled: true
  - If you only want transcription: use streaming.enabled: true
  - Don’t enable both at once.
Basic config shape:
{
  plugins: {
    allow: ["google-meet", "voice-call"],
    entries: {
      "google-meet": {
        enabled: true,
        config: {
          defaultTransport: "twilio"
        }
      },
      "voice-call": {
        enabled: true,
        config: {
          provider: "twilio",
          fromNumber: "+44...",
          publicUrl: "https://your-domain.example/voice/webhook",
          twilio: {
            accountSid: "${TWILIO_ACCOUNT_SID}",
            authToken: "${TWILIO_AUTH_TOKEN}"
          },
          // choose realtime OR streaming
          realtime: {
            enabled: true,
            provider: "openai"
          }
        }
      }
    }
  }
}
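The "realtime OR streaming, not both" rule is easy to get wrong when editing the config by hand. As a rough illustration, this Python sketch checks a voice-call config that has already been loaded into a dict. The function name audio_mode is hypothetical, and the sketch assumes the realtime.enabled and streaming.enabled keys sit directly under the voice-call config as described above; it is not part of OpenClaw.

```python
def audio_mode(voice_call_config: dict) -> str:
    """Return "realtime", "streaming", or "none"; reject configs enabling both."""
    realtime = bool(voice_call_config.get("realtime", {}).get("enabled", False))
    streaming = bool(voice_call_config.get("streaming", {}).get("enabled", False))
    if realtime and streaming:
        raise ValueError("enable realtime OR streaming, not both")
    if realtime:
        return "realtime"   # agent talks and listens live
    if streaming:
        return "streaming"  # transcription only, no voice
    return "none"


config = {
    "provider": "twilio",
    "realtime": {"enabled": True, "provider": "openai"},
}
print(audio_mode(config))  # realtime
```

Running a check like this before restarting the gateway catches the conflicting case early instead of during a live call.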
Then:
openclaw voicecall setup
openclaw googlemeet setup --transport twilio
Join a Meet:
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport twilio \
--dial-in-number +4420... \
--pin 123456
If the PIN flow is weird, use DTMF directly:
openclaw googlemeet join https://meet.google.com/abc-defg-hij \
--transport twilio \
--dial-in-number +4420... \
--dtmf-sequence ww123456#
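If you launch joins from a script, the two command shapes above can be assembled programmatically. This Python sketch only builds the argument list and uses only the flags shown in this thread. The helper names are hypothetical. Requiring exactly one of --pin or --dtmf-sequence is an assumption of the sketch, not documented CLI behaviour. The leading "w" characters presumably act as short pauses before the digits, as they do in Twilio's digit-sending syntax.

```python
import shlex
from typing import List, Optional


def dtmf_for_pin(pin: str, leading_waits: int = 2) -> str:
    """Build a sequence like ww123456# (pauses, the PIN digits, then '#')."""
    if not pin.isdigit():
        raise ValueError("PIN must contain digits only")
    return "w" * leading_waits + pin + "#"


def build_join_command(meet_url: str, dial_in_number: str,
                       pin: Optional[str] = None,
                       dtmf_sequence: Optional[str] = None) -> List[str]:
    """Assemble argv for a Twilio join; exactly one of pin/dtmf_sequence."""
    if (pin is None) == (dtmf_sequence is None):
        raise ValueError("pass exactly one of pin or dtmf_sequence")
    argv = ["openclaw", "googlemeet", "join", meet_url,
            "--transport", "twilio",
            "--dial-in-number", dial_in_number]
    if pin is not None:
        argv += ["--pin", pin]
    else:
        argv += ["--dtmf-sequence", dtmf_sequence]
    return argv


argv = build_join_command(
    "https://meet.google.com/abc-defg-hij",
    "+15551234567",  # placeholder number from the earlier example
    dtmf_sequence=dtmf_for_pin("123456"),
)
print(shlex.join(argv))  # shell-quoted, safe to paste into a terminal
```

Passing the list to subprocess.run(argv) avoids shell quoting problems with the trailing "#", which an unquoted shell command could otherwise treat as the start of a comment in some contexts.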
Docs: https://docs.openclaw.ai/plugins/google-meet and https://docs.openclaw.ai/plugins/voice-call
what voice does it use
Twilio itself doesn’t decide the AI voice — OpenClaw’s voice-call config does.
For your Twilio Meet setup:
- If using realtime voice: voice comes from the realtime provider.
  - OpenAI realtime default voice: alloy
  - Google Gemini Live default voice: Kore
Example OpenAI voice override:
{
  plugins: {
    entries: {
      "voice-call": {
        config: {
          realtime: {
            enabled: true,
            provider: "openai",
            providers: {
              openai: {
                voice: "marin"
              }
            }
          }
        }
      }
    }
  }
}
Example Gemini:
{
  realtime: {
    enabled: true,
    provider: "google",
    providers: {
      google: {
        voice: "Kore"
      }
    }
  }
}
- If using streaming transcription only: there’s no voice; it just listens/transcribes.
- If using normal TTS instead of realtime: it uses messages.tts / voice-call.config.tts, e.g. OpenAI, ElevenLabs, etc.
so it cant use edge tts
For Twilio voice-call / Google Meet dial-in, I would treat Edge/Microsoft TTS as not supported.
The voice-call docs specifically say Microsoft speech is ignored for voice calls because telephony needs PCM output, and the current Microsoft transport doesn’t expose that telephony PCM path.
Use one of these instead:
- Realtime voice: OpenAI realtime or Google Gemini Live
- TTS playback: OpenAI TTS or ElevenLabs
- Transcription-only: no voice needed
So for your setup, I’d use OpenAI realtime first — simplest config/debug path.
Can I use the OpenAI voice for when my agent sends me voice notes too? Currently it uses Edge TTS.
Yes — for agent voice notes / outbound TTS, set messages.tts.provider: "openai".
Example ~/.openclaw/openclaw.json:
{
  messages: {
    tts: {
      auto: "always",
      provider: "openai",
      providers: {
        openai: {
          apiKey: "${OPENAI_API_KEY}",
          model: "gpt-4o-mini-tts",
          voice: "coral"
        }
      }
    }
  }
}
Then on your VPS:
export OPENAI_API_KEY=sk-...
openclaw config validate
openclaw gateway restart
Check it with: