I'm running OpenClaw locally with a local Ollama model, qwen2.5-coder:3b. When I do ollama run qwen2.5-coder:3b I can ask questions and it responds immediately, but when I add it to OpenClaw and ask something it keeps giving this error: 500 {"error":"model requires more system memory (19.9 GiB) than is available (13.9 GiB)"} and in the terminal it gives this: (check photo)
#ollama
nvm, can't add a photo
15:34:00+02:00 [agent/embedded] Profile ollama:default timed out. Trying next account...
15:34:00+02:00 [agent/embedded] embedded run failover decision: runId=fc38064f-3d50-4b0c-9d35-755292f9bf63 stage=assistant decision=surface_error reason=timeout provider=ollama/llama3.1:8b profile=sha256:9c018ec112cf
and the weird thing is I don't even have llama3.1:8b installed...
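To make the mismatch above easier to spot, here is a minimal sketch that pulls the key=value fields out of the failover log line and compares the requested model against what is installed. The log line is copied from the message above; the INSTALLED set and the function names are hypothetical, and the parsing assumes the fields are space-separated key=value pairs as they appear here, not a documented OpenClaw log format.

```python
import re

# Log line copied from the thread above.
LOG_LINE = (
    "15:34:00+02:00 [agent/embedded] embedded run failover decision: "
    "runId=fc38064f-3d50-4b0c-9d35-755292f9bf63 stage=assistant "
    "decision=surface_error reason=timeout provider=ollama/llama3.1:8b "
    "profile=sha256:9c018ec112cf"
)

# Assumption: the only model pulled locally is the one named in the question.
INSTALLED = {"qwen2.5-coder:3b"}


def parse_fields(line):
    """Collect every key=value token in the line into a dict."""
    return dict(re.findall(r"(\w+)=(\S+)", line))


def requested_model(line):
    """Return the model part of provider=<backend>/<model>, or None."""
    provider = parse_fields(line).get("provider", "")
    _backend, _sep, model = provider.partition("/")
    return model or None


model = requested_model(LOG_LINE)
if model not in INSTALLED:
    print(f"agent asked for {model}, which is not installed locally")
```

If the requested model differs from the one you pulled, the agent's configured model is probably worth checking before anything else.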
@left geode
on your machine running ollama type ollama ps
ollama ps
NAME ID SIZE PROCESSOR CONTEXT UNTIL
how much vram do you have? 32gb?
idk
i just got a computer not a good pc or something
that isn't very helpful lol... go to your task manager (ctrl+alt+del, or right click on your task bar) then performance tab, then pick your gpu
4gb
you need a model that is smaller than 4gb then, it's bleeding into your system memory which will be a bad experience
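A rough way to check which installed models could fit in 4 GB of VRAM is sketched below. It reads the model list from the local Ollama HTTP API (GET /api/tags on the default port 11434) and filters by on-disk size. The function names and the 0.5 GiB headroom are assumptions for illustration, and on-disk size is only a lower bound: actual memory use grows with the context length the client requests.

```python
import json
import urllib.request

GIB = 1024 ** 3


def list_local_models(base_url="http://localhost:11434"):
    """Fetch installed models from a running local Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp).get("models", [])


def models_that_fit(models, vram_gib, headroom_gib=0.5):
    """Keep models whose on-disk size fits the VRAM budget, smallest first.

    headroom_gib is an assumed safety margin, not an Ollama setting.
    """
    budget = (vram_gib - headroom_gib) * GIB
    fits = [m for m in models if m["size"] <= budget]
    return sorted(fits, key=lambda m: m["size"])


def main():
    # Requires the Ollama server to be running on this machine.
    for m in models_that_fit(list_local_models(), vram_gib=4):
        print(f'{m["name"]}: {m["size"] / GIB:.1f} GiB')
```

Calling main() with Ollama running prints the candidates. A model that passes this check can still spill into system memory if the client asks for a large context window.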
but why would it work fine when doing ollama run?
that's aside from the not having enough, but even if we figure that out, using system memory is going to brick your experience anyway
but not openclaw
no idea tbh, my assumption is you have something running that's taking up too much of your system memory
Make an ollama account. Pull a cloud model, reload openclaw using the cloud model.
No longer dependent on your system for inference.
Also, my earlier points solved your problem
I keep getting rate limited
that's why I went local
you can find a model that'll fit in your 4gb, it's just not going to be a great experience
Nemotron-3-super:cloud has no limits right now
Completely free while its in preview
Openrouter is a good option for free models and decent rates..
But for local runs on low specs you have to choose between slow or stupid... I do stupid with offload to cloud models
i did openrouter as main and gemini as fallback but still getting rate limited fast
Try out nemotron-3-super through ollama
alr, will it get rate limited eventually?
like when it's not in preview anymore
yes
100%
use it while you can though, cuz why not
and then just search for a new one?
Yup