#What are the best free models to use for openclaw
@ripe portal
For a Discord bot with mid usage, here are the best free/cheap options:
- Gemini 3 Flash — Free tier is generous, fast responses, good for general chat
- DeepSeek V3.2 — Very cheap, strong at coding and reasoning
- Qwen 3.5 — Good all-rounder, competitive pricing
For fallback config in openclaw.json, set a primary model and a list of fallbacks:
"agents": {
  "defaults": {
    "model": {
      "primary": "provider/gemini-3-flash",
      "fallbacks": ["provider/deepseek-v3.2"]
    }
  }
}
If you want to get fancy, there are also auto-routing setups that pick the cheapest model that can handle each message — sends simple questions to cheap models and harder ones to better models automatically. Happy to share that config if you're interested.
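To make the auto-routing idea concrete, here is a minimal sketch of a per-message router. It is not an OpenClaw feature or anyone's actual config: the length threshold, the keyword list, and the `pick_model` function are made up for illustration, and only the two model refs come from the message above.

```python
# Hypothetical routing heuristic: cheap model by default, stronger model
# when a message looks long or code-heavy. Thresholds are illustrative only.
CHEAP_MODEL = "provider/gemini-3-flash"
STRONG_MODEL = "provider/deepseek-v3.2"

# Assumed hints that a message needs more reasoning (not from any real router).
HARD_HINTS = ("traceback", "stack trace", "refactor", "debug", "```")


def pick_model(message: str, max_cheap_chars: int = 400) -> str:
    """Return the model ref to use for a single incoming message."""
    text = message.lower()
    looks_hard = len(text) > max_cheap_chars or any(hint in text for hint in HARD_HINTS)
    return STRONG_MODEL if looks_hard else CHEAP_MODEL


if __name__ == "__main__":
    print(pick_model("what time is it in Tokyo?"))
    print(pick_model("Can you debug this traceback for me?"))
```

A keyword heuristic like this is crude; real routers usually classify messages with a small model first, which costs tokens of its own.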
Yeah, sounds good. Can you share that config?
Sure! Here's a multi-model setup. Note that this config is a primary-plus-fallback chain rather than true per-message routing: the primary model handles every message, and the fallback only takes over when the primary fails.
In your openclaw.json, define the providers, then set a primary model and a fallback:
{
  "models": {
    "providers": {
      "gemini": {
        "baseUrl": "https://generativelanguage.googleapis.com/v1beta",
        "apiKey": "${GEMINI_API_KEY}",
        "api": "google-generative-ai",
        "models": [
          { "id": "gemini-3-flash", "name": "Gemini 3 Flash" }
        ]
      },
      "deepseek": {
        "baseUrl": "https://api.deepseek.com/v1",
        "apiKey": "${DEEPSEEK_API_KEY}",
        "api": "openai-completions",
        "models": [
          { "id": "deepseek-chat", "name": "DeepSeek V3.2" }
        ]
      },
      "openrouter": {
        "baseUrl": "https://openrouter.ai/api/v1",
        "apiKey": "${OPENROUTER_API_KEY}",
        "api": "openai-completions",
        "models": [
          { "id": "anthropic/claude-sonnet-4.6", "name": "Claude Sonnet" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "gemini/gemini-3-flash",
        "fallbacks": ["deepseek/deepseek-chat"]
      }
    }
  }
}
How it works:
- Gemini Flash handles most messages (free tier, fast)
- DeepSeek as fallback if Gemini hits rate limits
- Add OpenRouter with Claude for when you need heavy lifting (pay-per-use)
Cost-saving tips:
- Gemini free tier gives you ~1500 requests/day — plenty for a mid-usage Discord bot
- DeepSeek is ~$0.14/1M input tokens — basically free
- Only use Claude/OpenRouter for complex tasks
Put your API keys in ~/.openclaw/.env:
GEMINI_API_KEY=your_key_here
DEEPSEEK_API_KEY=your_key_here
OPENROUTER_API_KEY=your_key_here
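Before starting the bot, it can help to check that every `${...}` placeholder in the config has a matching environment variable. The sketch below is a standalone helper, not part of OpenClaw; `missing_env_vars` is a hypothetical name, and how OpenClaw itself resolves placeholders is not confirmed here.

```python
import os
import re

# Matches ${NAME} placeholders such as ${GEMINI_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def missing_env_vars(config_text: str, env=None) -> list[str]:
    """Return placeholder names in config_text that are unset or empty in env."""
    env = os.environ if env is None else env
    names = sorted(set(PLACEHOLDER.findall(config_text)))
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    sample = '"apiKey": "${GEMINI_API_KEY}", "other": "${DEEPSEEK_API_KEY}"'
    # Pass a dict to simulate an environment where only one key is set.
    print(missing_env_vars(sample, {"GEMINI_API_KEY": "your_key_here"}))
```

Run against the text of your own config file, an empty list means every referenced key is present.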
If you want a fully automatic setup where you don't manage any API keys at all, I built openclawup.com which includes 9 models with auto-routing built in — it automatically picks the cheapest model that can handle each message. But the config above works great for self-hosting!
What is the best way to use only free models? And does that make sense, not only for a Discord bot but also for some coding (nothing really complex)?
Or is that simply not possible?
Look at pinchbench.com and find best free model.
Totally possible for Discord bot + light coding! Here's what works:
Best free models right now:
- Gemini 3 Flash — Google's free tier is very generous (~1500 req/day). Good at both chat and coding
- Gemini 3 Flash Lite — Even higher rate limits if you don't need the full model
- Groq (Llama 4 Scout / Qwen 3.5) — Free tier, super fast responses, great for Discord where speed matters
Do any of the free-tier Groq models work with OpenClaw? When I tried, I ran into the context limit and it couldn't even process the first message; it reported an API rate limit hit or some similar error.
Realistic setup:
{
  "models": {
    "providers": {
      "gemini": {
        "baseUrl": "https://generativelanguage.googleapis.com/v1beta",
        "apiKey": "${GEMINI_API_KEY}",
        "api": "google-generative-ai",
        "models": [
          { "id": "gemini-3-flash", "name": "Gemini 3 Flash" }
        ]
      },
      "groq": {
        "baseUrl": "https://api.groq.com/openai/v1",
        "apiKey": "${GROQ_API_KEY}",
        "api": "openai-completions",
        "models": [
          { "id": "llama-4-scout-17b-16e-instruct", "name": "Llama 4 Scout" }
        ]
      }
    }
  },
  "agents": {
    "defaults": {
      "model": {
        "primary": "gemini/gemini-3-flash",
        "fallbacks": ["groq/llama-4-scout-17b-16e-instruct"]
      }
    }
  }
}
Gemini as primary (best quality for free), Groq as fallback when you hit rate limits.
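A common source of config errors is a primary or fallback ref that does not match any model listed under its provider. The sketch below checks for that, assuming the `provider/model-id` naming used in the configs in this thread; `undefined_model_refs` is a hypothetical helper, not an OpenClaw command.

```python
def undefined_model_refs(config: dict) -> list[str]:
    """Return model refs under agents.defaults.model with no matching provider entry."""
    providers = config.get("models", {}).get("providers", {})
    model_cfg = config.get("agents", {}).get("defaults", {}).get("model", {})

    refs = []
    if model_cfg.get("primary"):
        refs.append(model_cfg["primary"])
    # Accept both spellings seen in this thread: "fallback" (string) and "fallbacks" (list).
    if model_cfg.get("fallback"):
        refs.append(model_cfg["fallback"])
    refs.extend(model_cfg.get("fallbacks", []))

    missing = []
    for ref in refs:
        # Split on the first "/" only, because model ids may contain slashes themselves.
        provider, _, model_id = ref.partition("/")
        known_ids = {m["id"] for m in providers.get(provider, {}).get("models", [])}
        if model_id not in known_ids:
            missing.append(ref)
    return missing


if __name__ == "__main__":
    config = {
        "models": {"providers": {
            "gemini": {"models": [{"id": "gemini-3-flash"}]},
            "groq": {"models": [{"id": "llama-4-scout-17b-16e-instruct"}]},
        }},
        "agents": {"defaults": {"model": {
            "primary": "gemini/gemini-3-flash",
            "fallbacks": ["groq/llama-4-scout"],  # deliberately truncated id
        }}},
    }
    print(undefined_model_refs(config))  # ['groq/llama-4-scout']
```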
Does it make sense? For a Discord bot + light coding — absolutely yes. Gemini 3 Flash is genuinely good at code. You'll only hit limits if your bot gets really active. The main trade-off vs paid models:
- ❌ Rate limits (not speed/quality) — you'll get throttled during peak usage
- ❌ No Claude/GPT-5 for hard problems
- ✅ Perfectly fine for general chat, simple scripts, explaining code, quick fixes
Tips:
- Get API keys free at ai.google.dev and console.groq.com
- If you start hitting rate limits regularly, that's when a paid option like DeepSeek ($0.14/1M tokens ≈ basically free) or a managed service makes sense
Happy to help if you run into config issues!
I have no feeling for how much DeepSeek costs. Can you give me a reference point so I can understand it better?
Also, I don't think Groq offers Qwen 3.5.
Also, I don't think your stats on Gemini limits are correct.
good catch — you’re right to call that out. Gemini limits move and are tier/project dependent, so hardcoded numbers go stale fast.
For current limits, use Google’s official page:
For OpenClaw specifically: yes, you can absolutely set fallback for rate limits.
OpenClaw failover order is:
- rotate auth profiles for the same provider
- then move through agents.defaults.model.fallbacks
Docs:
Quick setup (CLI):
openclaw models set <provider/model-primary>
openclaw models fallbacks add <provider/model-fallback-1>
openclaw models fallbacks add <provider/model-fallback-2>
openclaw models status
Or in config:
{
  agents: {
    defaults: {
      model: {
        primary: "<provider/model-primary>",
        fallbacks: ["<provider/model-fallback-1>", "<provider/model-fallback-2>"]
      }
    }
  }
}
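The failover order described above can be pictured as a simple attempt sequence. This is a sketch of one reading of that description (try each auth profile for a model's provider, then move on to the next fallback), not OpenClaw's actual implementation; the profile names and the `attempt_order` and `first_success` helpers are hypothetical.

```python
from typing import Callable, Iterator


def attempt_order(
    primary: str,
    fallbacks: list[str],
    profiles_by_provider: dict[str, list[str]],
) -> Iterator[tuple[str, str]]:
    """Yield (model_ref, auth_profile) pairs in the order they would be tried."""
    for ref in [primary, *fallbacks]:
        provider = ref.split("/", 1)[0]
        # Rotate through every auth profile of this provider before moving on.
        for profile in profiles_by_provider.get(provider, ["default"]):
            yield ref, profile


def first_success(attempts, call: Callable[[str, str], str]) -> str:
    """Return the first successful reply; treat RuntimeError as a rate limit or outage."""
    for ref, profile in attempts:
        try:
            return call(ref, profile)
        except RuntimeError:
            continue
    raise RuntimeError("all models and profiles failed")


if __name__ == "__main__":
    order = attempt_order(
        "gemini/gemini-3-flash",
        ["groq/llama-4-scout-17b-16e-instruct"],
        {"gemini": ["personal", "work"]},  # hypothetical profile names
    )
    for ref, profile in order:
        print(ref, profile)
```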
If you want, paste your current agents.defaults.model block + openclaw --version and I’ll give you an exact cleaned-up config for your Discord setup.
@ripe portal give me the best config for free models with fallbacks
Krill ran out of free credits 😅

Same
I don't think Groq works for OpenClaw as a main provider. I use it for speech-to-text, and that works nicely.
The others I know of that you can use as main providers are google/gemini, OpenRouter free providers, and NVIDIA.
Couldn't set up Groq on OpenClaw today; also, I'm a noob.
I don't know how you manage. I've tried Gemini free, Mistral free, and OpenRouter free, and the APIs throttle SO FAST that it's almost unusable 😢 I must be doing something wrong.
I get good results with arcee-ai/trinity-large-preview:free on OpenRouter. I almost exclusively use my OC currently to engage in MB every 4 hours. Most of the "chat" part was about internal research and fixes if something did not run clean.
My statistics tell me that this uses about 2M tokens a day for just this cron task.
However, a simple mishap like asking "why did you believe I told you to do ..." resulted in a "deep memory research" and I decided to just await the answer. Unfortunately the tool call was flawed and repeated every 5 seconds. When I suspended the job 19h later, I had burned 250 Million tokens.
This would have been quite expensive on a model like "Claude Opus 5.6" ..
I'm interested in where you get "DeepSeek ($0.14/1M tokens ≈ basically free)". I have not seen any offer below $0.30/MT mixed (in/out).
And certainly it depends on how many tokens per day your application will burn. A few bad prompts can drive the counter up quite quickly.
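For a rough sense of scale, the token counts and per-million prices quoted in this thread can be multiplied out. The sketch below uses only those figures (2M tokens/day, 250M tokens, $0.14 and $0.30 per 1M tokens); the prices are the ones claimed above and are not verified against any provider's current price list.

```python
def cost_usd(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a token count at a flat per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


# Token volumes and prices as quoted in the thread.
WORKLOADS = {"daily cron task": 2_000_000, "19h runaway loop": 250_000_000}
PRICES = (0.14, 0.30)

if __name__ == "__main__":
    for label, tokens in WORKLOADS.items():
        for price in PRICES:
            print(f"{label}: {tokens:,} tokens at ${price:.2f}/1M = ${cost_usd(tokens, price):,.2f}")
```

At those prices the daily cron task comes to about $0.28 to $0.60 per day, while the 250M-token runaway loop comes to about $35 to $75. A flat price is a simplification, since providers typically bill input and output tokens at different rates.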
Actually, I think you can buy a month of unlimited access to good models very cheaply on some second-hand platforms in China. For example, glm5.1 costs around 50 RMB (under 10 US dollars) for a month of unlimited use of their high-end model, and the experience is good. They run WeChat group chats to resolve problems, and I can handle payment with Alipay.
You can give it a try. They also have illegally obtained chatgpt, which is also perfect
indeed 😄
Check out vava-nessa/free-coding-models on github.