#Which LLM to choose now?

1 messages · Page 1 of 1 (latest)

rustic kelp
#

So yesterday i tried setting up my assist with Google Gemini API. STT, TTS and Conversation worked amazingly! But recently, Google has crippled the usage limits for free plan. Which model could i use instead? What do you recommend? I also would like a good TTS and STT. Whisper is fine i think, but Piper TTS (German voices) really doesnt fit me.

What can y'all recommend i try? Gemini was amazing, but is paid. Maybe it's still worth it?

red knot
#

If you have at least 8gb vram, I'm running qwen3-4b-instruct and it's very fast

plucky idol
#

You get 250€ of credits with Gemini to see how much it would cost to run it full-time. Probably worth experimenting if you liked it

fading sun
#

Try Ollama with free acc and GPT-OSS.

elder hazel
#

I use oss 120b for my daily driver and it’s great and free

rustic kelp
elder hazel
rustic kelp
#

For those who see this later:
Set up GPT-OSS-120B through Ollama Cloud and STT/TTS through Microsoft Azure.

Both have good free tiers. Ollama has a large weekly usage limit and STT/TTS has 5 hrs/500K characters per month with no billing. Isn't local, but works very fast.

signal harbor
#

If you don’t care about running it local, you can add $10 to any of the major AI providers and likely not run though that unless you have a truly massive number of entities…

red knot
#

You legit don't need anything higher than 4b for the assistant

fading sun
red knot
fading sun
red knot
#

I use the mcp server someone here made

#

Sec

elder hazel
#

I’m using openclaw and it’s freaking amazing. I let it do what would have taken me 30 minutes and it did it in 1 minute. I am using GPT so internet required but now I have it setup I can use a local AI and I can chat with it in WhatsApp tell it what I want and it’s done.

fading sun
fading sun
elder hazel
#

I haven’t had hallucinations in awhile

fading sun
fading sun
red knot
#

try the mcp package

fading sun
#

Pretty fast with answer to "hello" though, that's helpful 🙂

fading sun
red knot
elder hazel
# fading sun In 4B model?

No, I do have the 4b and a 30b qwen model but until I lose internet the GPT OSS 120b is my daily driver and it’s free and lightning fast. Faster than Siri or Alexa.

fading sun
hybrid totem
#

I am using llama-server with gpt-oss-20b.gguf locally with the mike-nott/mcp-assist; 3090 handles everything very well.

Llama-server uses about 16GB VRAM with a 65K context pipe for llama-server. Its been really reliable and seldom hallucinates. For Home Assistant comnands intents are executed immediately, AI multiple HA commands take ~ 1-2 seconds. For an MCP search like how did the market close today it usually takes 4-5 seconds to get a response.

signal harbor
# fading sun In 4B model?

Newer models are trained with longer contexts and that reduces the hallucinations. I got Qwen3.5 4B and 9B running. You have to configure the settings for tool calling.

strong grove
#

Oh I'm currently using qwen2.5-7b-instruct with Think before Responding off.
Didnt realize 4b is enough
When you say "You have to configure the settings for tool calling.", is that the Assist checkbox?

quasi epoch
# elder hazel I’m using openclaw and it’s freaking amazing. I let it do what would have taken ...

Hey ChrisC just started using OpenClaw to run my HA with Claude as the AI, and my junior minded self used the most powerful model I could because I’m an immature married 37 year old with two daughters and could not help myself. After only two days of use I realized 1) This thing is amazingly powerful as it created all of my automations for me in a matter of minutes 2) I cannot afford this.

My question is, you referenced ChatGPT. Was that your solution? Using the free model of ChatGPT? I saw you reference it somewhere in this conversation. Which model specifically? And how has it been for you with really complex automations? Is it something you’d let run 24/7, because as of now I only run OpenClaw to build automations for me and fix errors.

Thanks for any info.

elder hazel
quasi epoch
elder hazel
#

I don’t use Ollama with OpenClaw, so far I’m only using GPT 5.4 with Opency. I use Ollama with GPT OSS 120b Cloud for conversation because it’s free and excellent.

signal harbor