Hermes with local AI on a modest computer | Nous Research

#Hermes with local AI on a modest computer


trail jay Apr 2, 2026, 7:10 PM

Hi everyone,

I'm running Hermes Agent locally with LM Studio, using the model qwen/qwen3.5-35b-a3b on an RTX 4060 Ti 8GB. The model works well but is quite slow (~10 tokens/s), which is totally fine for me.

The problem is that Hermes keeps throwing ReadTimeout errors after only 2-3 minutes, even though I've set api.timeout: 1800 (and higher) in the config. I want Hermes to simply wait as long as necessary for the local model to respond, without any timeout. Speed is not important; I just want it to be patient and never give up while the model is still generating.

Has anyone managed to completely disable or significantly increase the timeouts when using a slow local provider like LM Studio?

Thanks in advance!
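
For context, a read timeout of this kind is generally raised by the HTTP client when the server stays silent for longer than the client is willing to wait. The sketch below is not Hermes code and does not use any Hermes setting. It uses only the Python standard library, with a hypothetical `SlowHandler` standing in for a slow local model server, to show how the same request fails with a short client timeout and succeeds when the timeout is disabled. The names `SlowHandler`, `start_slow_server`, and `fetch` are made up for illustration.

```python
import http.server
import socket
import threading
import time
import urllib.error
import urllib.request


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in for a slow local model: waits, then replies."""

    delay_seconds = 0.5  # pretend "generation time"

    def do_GET(self):
        time.sleep(self.delay_seconds)
        body = b"done"
        try:
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        except OSError:
            # The client may already have given up and closed the socket.
            pass

    def log_message(self, *args):
        # Keep the demo quiet.
        pass


def start_slow_server():
    """Start the slow server on a free local port in a background thread."""
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url, timeout):
    """Return the response body, or 'timeout' if the client gave up first.

    timeout=None means the socket blocks until the server answers.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode()
    except (socket.timeout, TimeoutError):
        return "timeout"
    except urllib.error.URLError as err:
        if isinstance(err.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        raise


if __name__ == "__main__":
    server = start_slow_server()
    url = f"http://127.0.0.1:{server.server_address[1]}/"
    print(fetch(url, timeout=0.1))   # client waits less than the server needs
    print(fetch(url, timeout=None))  # client waits as long as necessary
    server.shutdown()
```

With a socket-level timeout like this one, the clock covers each wait for data rather than the whole request, so a non-streaming request to a slow model can sit silent for the entire generation. Which timeout Hermes actually applies, and where it is configured, is not shown here.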

quiet dew Apr 2, 2026, 7:41 PM

let me know if this helps. i'm just asking code in an ide on this one.

trail jay Apr 2, 2026, 8:22 PM

I'll test this and come back here to give my results, thank you for your help!

quiet dew Apr 2, 2026, 8:33 PM

thanks! please do let me know!

trail jay Apr 2, 2026, 10:13 PM

I confirm that I no longer have timeouts on my tests, thank you very much, it's great, it seems to work perfectly!

quiet dew Apr 2, 2026, 10:47 PM

awesome! glad to hear!