#What are the best free models to use for openclaw


winter edge
#

For a Discord bot with mid usage. Am I able to use fallbacks for rate limits, and can someone help with the config?

#

@ripe portal

winter edge
#

@ripe portal

harsh juniper
#

For a Discord bot with mid usage, here are the best free/cheap options:

  • Gemini 3 Flash — Free tier is generous, fast responses, good for general chat
  • DeepSeek V3.2 — Very cheap, strong at coding and reasoning
  • Qwen 3.5 — Good all-rounder, competitive pricing

For fallback config in openclaw.json, set up your primary model and a fallback:
"agents": {
  "defaults": {
    "model": {
      "primary": "provider/gemini-3-flash",
      "fallbacks": ["provider/deepseek-v3.2"]
    }
  }
}

If you want to get fancy, there are also auto-routing setups that pick the cheapest model that can handle each message — sends simple questions to cheap models and harder ones to better models automatically. Happy to share that config if you're interested.

winter edge
harsh juniper
#

Sure! Here's a multi-model auto-routing setup. The idea is: simple messages go to the cheapest model, complex ones go to a stronger model automatically.
In your openclaw.json, set up multiple providers and use the auto routing model:
{
  "models": {
    "providers": {
      "gemini": {
        "baseUrl": "https://generativelanguage.googleapis.com/v1beta",
        "apiKey": "${GEMINI_API_KEY}",
        "api": "google-generative-ai",
        "models": [
          { "id": "gemini-3-flash", "name": "Gemini 3 Flash" }
        ]
      },
      "deepseek": {
        "baseUrl": "https://api.deepseek.com/v1",
        "apiKey": "${DEEPSEEK_API_KEY}",
        "api": "openai-completions",
        "models": [
          { "id": "deepseek-chat", "name": "DeepSeek V3.2" }
        ]
      },
      "openrouter": {
        "baseUrl": "https://openrouter.ai/api/v1",
        "apiKey": "${OPENROUTER_API_KEY}",
        "api": "openai-completions",
        "models": [
          { "id": "anthropic/claude-sonnet-4.6", "name": "Claude Sonnet" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "gemini/gemini-3-flash",
        "fallbacks": ["deepseek/deepseek-chat"]
      }
    }
  }
}

#

How it works:

  • Gemini Flash handles most messages (free tier, fast)
  • DeepSeek as fallback if Gemini hits rate limits
  • Add OpenRouter with Claude for when you need heavy lifting (pay-per-use)

Cost-saving tips:

  • Gemini free tier gives you ~1500 requests/day — plenty for a mid-usage Discord bot
  • DeepSeek is ~$0.14/1M input tokens — basically free
  • Only use Claude/OpenRouter for complex tasks
#

Put your API keys in ~/.openclaw/.env:
GEMINI_API_KEY=your_key_here
DEEPSEEK_API_KEY=your_key_here
OPENROUTER_API_KEY=your_key_here
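
If you want to catch a missing key before the bot starts, here is a minimal Python sketch. It assumes the `${NAME}` placeholders in the config are filled from environment variables; the helper name `missing_env_vars` is made up for this example and is not part of OpenClaw, and the script does not read the .env file itself.

```python
import json
import os
import re

# Matches ${NAME} placeholders like the ones used for apiKey in the config.
PLACEHOLDER = re.compile(r"\$\{([A-Z0-9_]+)\}")


def missing_env_vars(config_text, env=None):
    """Return the sorted placeholder names in config_text that are unset or empty in env."""
    env = os.environ if env is None else env
    names = set(PLACEHOLDER.findall(config_text))
    return sorted(name for name in names if not env.get(name))


if __name__ == "__main__":
    # Hypothetical config fragment, serialized the way it would appear on disk.
    sample = json.dumps({
        "gemini": {"apiKey": "${GEMINI_API_KEY}"},
        "deepseek": {"apiKey": "${DEEPSEEK_API_KEY}"},
    })
    # Only the Gemini key is "set" here, so the DeepSeek key is reported.
    print(missing_env_vars(sample, env={"GEMINI_API_KEY": "abc"}))
```

Run as-is it prints `['DEEPSEEK_API_KEY']`. To check a real setup, pass the text of your own config file and leave `env` unset so it looks at the actual environment.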

If you want a fully automatic setup where you don't manage any API keys at all, I built openclawup.com which includes 9 models with auto-routing built in — it automatically picks the cheapest model that can handle each message. But the config above works great for self-hosting!

tribal raft
# harsh juniper How it works: ...

What is the best way to use only free models? And does that make sense, not only for a Discord bot but also for some coding (nothing really complex)?

#

or is that simply not possible?

slate rose
harsh juniper
#

Totally possible for Discord bot + light coding! Here's what works:

Best free models right now:

  • Gemini 3 Flash — Google's free tier is very generous (~1500 req/day). Good at both chat and coding
  • Gemini 3 Flash Lite — Even higher rate limits if you don't need the full model
  • Groq (Llama 4 Scout / Qwen 3.5) — Free tier, super fast responses, great for Discord where speed matters
brazen nimbus
#

Do any of the free-tier Groq models work with OpenClaw? When I tried, I ran into the context limit and it couldn't even process the first message, failing with "API rate limit hit" or some similar error.

harsh juniper
#

Realistic setup:

{
  "models": {
    "providers": {
      "gemini": {
        "baseUrl": "https://generativelanguage.googleapis.com/v1beta",
        "apiKey": "${GEMINI_API_KEY}",
        "api": "google-generative-ai",
        "models": [
          { "id": "gemini-3-flash", "name": "Gemini 3 Flash" }
        ]
      },
      "groq": {
        "baseUrl": "https://api.groq.com/openai/v1",
        "apiKey": "${GROQ_API_KEY}",
        "api": "openai-completions",
        "models": [
          { "id": "llama-4-scout-17b-16e-instruct", "name": "Llama 4 Scout" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "gemini/gemini-3-flash",
        "fallbacks": ["groq/llama-4-scout-17b-16e-instruct"]
      }
    }
  }
}

Gemini as primary (best quality for free), Groq as fallback when you hit rate limits.

#

Does it make sense? For a Discord bot + light coding — absolutely yes. Gemini 3 Flash is genuinely good at code. You'll only hit limits if your bot gets really active. The main trade-off vs paid models:

  • ❌ Rate limits (not speed/quality) — you'll get throttled during peak usage
  • ❌ No Claude/GPT-5 for hard problems
  • ✅ Perfectly fine for general chat, simple scripts, explaining code, quick fixes

Tips:

  • Get API keys free at ai.google.dev and console.groq.com
  • If you start hitting rate limits regularly, that's when a paid option like DeepSeek ($0.14/1M tokens ≈ basically free) or a managed service makes sense

Happy to help if you run into config issues!

tribal raft
tribal raft
#

Also, I don't think your stats on Gemini limits are correct.

ripe portalBOT
#

good catch — you’re right to call that out. Gemini limits move and are tier/project dependent, so hardcoded numbers go stale fast.

For current limits, use Google’s official page:

For OpenClaw specifically: yes, you can absolutely set fallback for rate limits.

OpenClaw failover order is:

  1. rotate auth profiles for the same provider
  2. then move through agents.defaults.model.fallbacks
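
The order above can be sketched in Python. This is only an illustration of the two steps as described, not OpenClaw's actual implementation: `RateLimited`, `build_attempts`, `send_with_failover`, the `profiles` mapping, and the `call` function are all hypothetical names, and real failover may react to more than rate-limit errors.

```python
class RateLimited(Exception):
    """Raised by the hypothetical `call` function when a provider throttles a request."""


def build_attempts(primary, fallbacks, profiles):
    """List (model, profile) attempts: every auth profile for a model, then the next model."""
    attempts = []
    for model in [primary, *fallbacks]:
        provider = model.split("/", 1)[0]
        # Providers without configured profiles get a single "default" profile.
        for profile in profiles.get(provider, ["default"]):
            attempts.append((model, profile))
    return attempts


def send_with_failover(prompt, primary, fallbacks, profiles, call):
    """Try each attempt in order, moving on only when the call is rate limited."""
    for model, profile in build_attempts(primary, fallbacks, profiles):
        try:
            return call(model, profile, prompt)
        except RateLimited:
            continue
    raise RateLimited("all models and auth profiles are rate limited")


if __name__ == "__main__":
    def fake_call(model, profile, prompt):
        # Pretend every Gemini profile is throttled so the fallback is used.
        if model.startswith("gemini/"):
            raise RateLimited(model)
        return f"{model} ({profile}) answered: {prompt}"

    print(send_with_failover(
        "hello",
        primary="gemini/gemini-3-flash",
        fallbacks=["deepseek/deepseek-chat"],
        profiles={"gemini": ["key-a", "key-b"]},
        call=fake_call,
    ))
```

In the demo both Gemini profiles are tried first, then the request lands on the DeepSeek fallback.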

Docs:

Quick setup (CLI):

#
openclaw models set <provider/model-primary>
openclaw models fallbacks add <provider/model-fallback-1>
openclaw models fallbacks add <provider/model-fallback-2>
openclaw models status

Or in config:

{
  agents: {
    defaults: {
      model: {
        primary: "<provider/model-primary>",
        fallbacks: ["<provider/model-fallback-1>", "<provider/model-fallback-2>"]
      }
    }
  }
}

If you want, paste your current agents.defaults.model block + openclaw --version and I’ll give you an exact cleaned-up config for your Discord setup.

winter edge
#

@ripe portal give me the best config for free models with fallbacks

brazen nimbus
winter edge
brazen nimbus
#

I don't think Groq works with OpenClaw as a main provider. I use it for speech-to-text, and that works nicely.

The others I know of that you can use as main providers are google/gemini, OpenRouter free providers, and NVIDIA.

calm veldt
#

Couldn't set up Groq on OpenClaw today. Also, I'm a noob.

#

I don't know how you manage. I've tried Gemini free, Mistral free, and OpenRouter free, and the APIs throttle SO FAST that it's almost unusable 😢 I must be doing something wrong.

eager hedge
#

I get good results with arcee-ai/trinity-large-preview:free on OpenRouter. I almost exclusively use my OC currently to engage in MB every 4 hours. Most of the "chat" part was about internal research and fixes if something did not run clean.

My statistics tell me that this uses about 2M tokens a day for just this cron task.

However, a simple mishap like asking "why did you believe I told you to do ..." resulted in a "deep memory research" and I decided to just await the answer. Unfortunately the tool call was flawed and repeated every 5 seconds. When I suspended the job 19h later, I had burned 250 Million tokens.

This would have been quite expensive on a model like "Claude Opus 5.6" ..
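
For a rough sense of scale, the numbers above can be run through a quick Python sketch. It assumes the tool call really fired once every 5 seconds for the full 19 hours, and it prices every token at the $0.14 per 1M figure quoted earlier in this thread for DeepSeek input tokens; `loop_stats` is just a name picked for this example.

```python
def loop_stats(total_tokens, hours, interval_seconds, usd_per_million):
    """Estimate call count, tokens per call, and cost for a runaway retry loop."""
    calls = int(hours * 3600 // interval_seconds)
    tokens_per_call = total_tokens / calls
    cost = total_tokens / 1_000_000 * usd_per_million
    return calls, tokens_per_call, cost


if __name__ == "__main__":
    # 250M tokens over 19 hours, one call every 5 seconds, at $0.14 per 1M tokens.
    calls, per_call, cost = loop_stats(250_000_000, 19, 5, 0.14)
    print(f"{calls} calls, ~{per_call:,.0f} tokens each, ~${cost:.2f} at $0.14/1M")
```

Under those assumptions that is 13,680 calls at roughly 18,000 tokens each, or about $35 even at a very cheap rate. Swap in the per-token price of any other model to see what the same loop would have cost there.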

eager hedge
errant holly
#

Actually, I think you can buy monthly unlimited access to good models at very low prices on some second-hand platforms in China. For example, glm5.1 costs about 50 RMB (less than 10 USD) for a month of unlimited use of their high-end model, and the experience is good. They have a WeChat group chat for sorting out problems, and I can handle payment with Alipay.

#

You can give it a try. They also have illegally obtained chatgpt, which is also perfect

calm veldt
#

indeed 😄

slate rose
#

Check out vava-nessa/free-coding-models on github.