hi guys any suggestions why my bot | Friends of the Crustacean 🦞🤝 | Page 1

flint oasis Apr 26, 2026, 10:07 PM

#

Can you tell me more?

midnight harness Apr 26, 2026, 10:07 PM

#

so basically i was just trying to tell claw to fix telegram

#

because i was not able to write in telegram

flint oasis Apr 26, 2026, 10:08 PM

#

and now he isnt responding at all?

#

if you can give me any Terminal logs i can help you track it down

midnight harness Apr 26, 2026, 10:08 PM

#

well now it responded. i will write here if i see any problems again. i am just curious why it can be laggy if I am self hosting on rtx 5090 with 9950x3d

flint oasis Apr 26, 2026, 10:09 PM

#

damn thats a hell of a setup, you should be good, you running local models?

midnight harness Apr 26, 2026, 10:09 PM

#

i have a business that i want to integrate ai model

#

i do not understand what i am doing

#

at all

#

i installed qwen3.5:9b

#

i just fixed telegram (hopefully)

#

this thing is cooking my gpu haha

flint oasis Apr 26, 2026, 10:11 PM

#

weird, qwen3.5 9b should run smoothly, i run that as my local model myself, on an RX 580 Sapphire, 12 year old GPU lol

#

openclaw is harsh for beginners

midnight harness Apr 26, 2026, 10:11 PM

#

well it just spikes at the moment of prompt

#

but runs well with microsoft flight simulator 2024 in the background

midnight harness Apr 26, 2026, 10:12 PM

#

flint oasis openclaw is harsh for beginners

i just need time to figure out how this thing works

flint oasis Apr 26, 2026, 10:13 PM

#

if you want to integrate a model for your business, i do not recommend local models, they are weak as hell, and on your setup you cant run capable local models, your best shot is a MiniMax subscription for a workhorse, and maybe GLM for more complex workflows + a VPS

midnight harness Apr 26, 2026, 10:13 PM

#

so what is claw great for?

flint oasis Apr 26, 2026, 10:13 PM

#

MiniMax subscription is $10 with 1500 requests/5hrs, which is hugeee, GLM is $18 per month, with maybe like 300 requests per 5hrs

#

If you got a shit ton of work to dispatch, MiniMax is the way to go

midnight harness Apr 26, 2026, 10:14 PM

#

but what do you mean local models are weak?

flint oasis Apr 26, 2026, 10:14 PM

#

Orrrr, if you got $$$$ you can go with claude

midnight harness Apr 26, 2026, 10:14 PM

#

what do local models do

#

well money is not a problem

flint oasis Apr 26, 2026, 10:15 PM

#

midnight harness but what do you mean local models are weak?

Let's take your Qwen 3.5 9B for example, that 9B means 9 billion parameters, for comparison, the new DeepSeek model is 1.6 trillion

#

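Parameters are the learned weights inside the model, and the parameter count is the usual yardstick for how big a model is (not how much data it was trained on)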
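To make "parameters" a bit more concrete, here is a minimal sketch that counts the learnable numbers in a tiny fully connected network. The layer sizes and the function names `dense_layer_params` and `mlp_params` are made up for illustration; real language models use different layer types, but the idea of "one parameter = one learned number" is the same.

```python
# Count the learnable parameters in a tiny fully connected network.
# The layer sizes used below are hypothetical, chosen only for illustration.

def dense_layer_params(inputs: int, outputs: int) -> int:
    """One weight per input-output pair, plus one bias per output."""
    return inputs * outputs + outputs


def mlp_params(layer_sizes: list[int]) -> int:
    """Total parameters for a stack of dense layers of the given widths."""
    return sum(
        dense_layer_params(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


toy = mlp_params([784, 128, 10])
print(f"toy network: {toy:,} parameters")
print(f"a 9B model has roughly {9_000_000_000 / toy:,.0f}x as many")
```

A "9B" model is the same kind of count, just with about nine billion of those numbers instead of a hundred thousand or so.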

midnight harness Apr 26, 2026, 10:15 PM

#

ok wow that is actually impressive

#

but what if i want to train my agent myself

#

teach him

flint oasis Apr 26, 2026, 10:16 PM

#

That being said, Qwen 3.5 9B is "dumb" on professional work, but works well on simple automations

midnight harness Apr 26, 2026, 10:16 PM

#

i guess it is not possible to download that deepseek model

flint oasis Apr 26, 2026, 10:16 PM

#

midnight harness but what if i want to train my agent myself

You take your Qwen3.5 9B and fine-tune it with Unsloth (it's a fine-tuning library), idk how to guide you further tho, i didnt train any model yet

flint oasis Apr 26, 2026, 10:16 PM

#

midnight harness i guess it is not possible to download that deepseek model

If you got like 300GB RAM and like 2 Nvidia H200 maybe yes

midnight harness Apr 26, 2026, 10:17 PM

#

well i can

#

how to download it

flint oasis Apr 26, 2026, 10:17 PM

#

So you got the cash, i envy you haha, i also set up OpenClaw for my business but im kinda broke atm, trynna make some cash for upgrades, wait a sec while i search for the deepseek link

#

https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash

#

https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro

#

V4 Flash is 284B parameters, V4 Pro is the one with 1.6T

midnight harness Apr 26, 2026, 10:21 PM

#

is 4 5090 enough

flint oasis Apr 26, 2026, 10:21 PM

#

The model is 800gb

midnight harness Apr 26, 2026, 10:22 PM

#

i assume no

flint oasis Apr 26, 2026, 10:22 PM

#

And as my Sonnet said

DeepSeek V4 Pro — Hardware Requirements
GPUs:

Minimum viable: 8× H100 80GB (640GB VRAM total)
Comfortable: 8× H200 141GB (1.1TB VRAM total)
Consumer hardware (even 2× RTX 5090) is not enough

System RAM:

~1TB fast RAM for hybrid CPU/GPU offload (and even then, expect slow inference)

Why so much?
V4 Pro is 1.6T total parameters. At Q4 quantization you're still looking at ~800GB just for weights, before KV cache. It's a server cluster story, not a workstation story.
Realistic alternatives:

V4 Flash — 284B params, fits on a single H200 node (~158GB), delivers ~85-95% of Pro quality
DeepSeek API — $1.74 in / $3.48 out per 1M tokens, OpenAI-compatible, just swap base URL
Ollama cloud — ollama run deepseek-v4-flash:cloud for quick testing

Bottom line: Unless you're processing 200M+ tokens/day, the API will always be cheaper than self-hosting V4 Pro. The hardware alone runs $200,000–$330,000+.
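The "~800GB at Q4" figure above can be sanity-checked with simple arithmetic: parameters times bits per weight, divided by 8 to get bytes. The sketch below is a rough estimate only. It counts weights and nothing else (no KV cache, no runtime overhead), the parameter counts are the ones quoted in this thread rather than independently verified, and `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    bytes_total = params * bits_per_weight / 8
    return bytes_total / 1e9


# Parameter counts as quoted in this thread (assumption, not verified).
V4_PRO_PARAMS = 1.6e12
V4_FLASH_PARAMS = 284e9

for name, params in [("V4 Pro", V4_PRO_PARAMS), ("V4 Flash", V4_FLASH_PARAMS)]:
    for bits in (16, 8, 4):
        size = weight_memory_gb(params, bits)
        print(f"{name} at {bits}-bit: ~{size:,.0f} GB of weights")
```

Comparing the result with total VRAM is a quick first filter. Four RTX 5090 cards give 4 × 32 GB = 128 GB, for instance, so anything estimated above that would need part of the model offloaded to system RAM, which is usually much slower.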

#

Or for the V4 Flash

From what we already found:
DeepSeek V4 Flash — Hardware Requirements
Minimum (tight, prototyping only):

2× RTX 4090 (48GB VRAM total) — Q4 quantized, short contexts only, slow

Viable for internal/dev use:

4× RTX 4090 (96GB VRAM) — Q8 possible, reasonable batch sizes, 4-8k context

Comfortable production:

1× H200 141GB — fits the full ~158GB FP4+FP8 checkpoint on a single node

System RAM:

128-256GB DDR5 for smooth CPU↔GPU data movement

Why it's so much easier than Pro:
V4 Flash is 284B total params / 13B active per token. At FP4+FP8 mixed precision it lands at ~158GB — that's single-node territory vs the cluster you need for Pro.
Rough hardware cost:

4× RTX 4090 setup: ~$8,000-10,000
Single H200: ~$35,000-40,000

Cheapest way to test it right now:
ollama run deepseek-v4-flash:cloud
Or via API at $0.14 in / $0.28 out per 1M tokens — still the most cost-effective option unless you're at serious token volume.
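To get a feel for what the per-token prices mean in practice, here is a small cost sketch. The prices are the ones quoted in this thread; the workload (2,000 emails a day at roughly 1,500 input and 500 output tokens each) is purely hypothetical, as are the names `api_cost_usd` and `PRICES_PER_M`.

```python
def api_cost_usd(
    input_tokens: int,
    output_tokens: int,
    in_price_per_m: float,
    out_price_per_m: float,
) -> float:
    """Cost in USD given token counts and prices per 1M tokens."""
    return (
        input_tokens / 1_000_000 * in_price_per_m
        + output_tokens / 1_000_000 * out_price_per_m
    )


# (input price, output price) per 1M tokens, as quoted in the thread.
PRICES_PER_M = {
    "V4 Flash": (0.14, 0.28),
    "V4 Pro": (1.74, 3.48),
}

# Hypothetical workload, for illustration only.
emails_per_day = 2_000
input_tokens = emails_per_day * 1_500
output_tokens = emails_per_day * 500

for model, (in_price, out_price) in PRICES_PER_M.items():
    daily = api_cost_usd(input_tokens, output_tokens, in_price, out_price)
    print(f"{model}: ~${daily:.2f}/day, ~${daily * 30:.2f}/month")
```

Plugging in your own volume and comparing the monthly figure against the hardware prices quoted above gives a rough break-even picture.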

midnight harness Apr 26, 2026, 10:23 PM

#

now i understand why gpu price is so big

flint oasis Apr 26, 2026, 10:23 PM

#

Now that i look at these stats, this is kinda insane lol

#

fking 8 H100

#

Buuuut, you can run the Flash model

midnight harness Apr 26, 2026, 10:23 PM

#

oh i can?

flint oasis Apr 26, 2026, 10:24 PM

#

DeepSeek V4 Flash — Hardware Requirements
Minimum (tight, prototyping only):

2× RTX 4090 (48GB VRAM total) — Q4 quantized, short contexts only, slow

Viable for internal/dev use:

4× RTX 4090 (96GB VRAM) — Q8 possible, reasonable batch sizes, 4-8k context

#

I mean yeah, on 4 RTX 5090 the Q8 should work

midnight harness Apr 26, 2026, 10:26 PM

#

what type of business do you expect me to have

flint oasis Apr 26, 2026, 10:26 PM

#

Q stands for quantization, and here is some more knowledge you should know for running local models

Think of it like audio compression.
A WAV file is uncompressed — every sound sample stored at full precision, massive file size. An MP3 takes that same audio and throws away data your ears can't easily detect, shrinking the file by 10× with barely noticeable quality loss.
Quantization does the same thing to AI model weights. Instead of storing every parameter as a 16-bit or 32-bit float (full precision), you round them down to lower precision — 8-bit, 4-bit, even 2-bit integers. The model gets smaller and faster to run, at the cost of some accuracy.
The common formats you'll see:

FP16 / BF16 — half precision, standard baseline
Q8 — 8-bit, barely any quality loss, ~2× smaller than FP16
Q4_K_M — 4-bit, the sweet spot most people use locally, ~4× smaller, small but noticeable quality drop
Q2 — aggressive compression, fits on weak hardware, meaningful quality degradation
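The rounding idea described above can be sketched in a few lines of plain Python. This is a deliberately simplified scheme (one shared scale for all weights, symmetric signed integers), not how any particular format is implemented; real formats such as Q4_K_M are more elaborate, for example keeping separate scales for small blocks of weights. The sample weights and function names are made up for illustration.

```python
def quantize(weights: list[float], bits: int) -> tuple[list[int], float]:
    """Map floats to signed integers in [-qmax, qmax] using one shared scale."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q: list[int], scale: float) -> list[float]:
    """Turn the stored integers back into approximate floats."""
    return [v * scale for v in q]


def max_error(weights: list[float], bits: int) -> float:
    """Largest difference between an original weight and its round trip."""
    q, scale = quantize(weights, bits)
    restored = dequantize(q, scale)
    return max(abs(a - b) for a, b in zip(weights, restored))


# Hypothetical weights, for illustration only.
weights = [0.82, -0.41, 0.07, -0.99, 0.33, 0.5]
for bits in (8, 4, 2):
    print(f"{bits}-bit: max round-trip error = {max_error(weights, bits):.4f}")
```

Fewer bits means fewer distinct values each weight can be rounded to, so the round-trip error grows as the bit width shrinks. That is the accuracy cost mentioned above.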

midnight harness Apr 26, 2026, 10:26 PM

#

i need simple email services

flint oasis Apr 26, 2026, 10:27 PM

#

midnight harness what type of business do you expect me to have

idk tbh

midnight harness Apr 26, 2026, 10:27 PM

#

to scrape potential clients

#

is the model i have not enough?

flint oasis Apr 26, 2026, 10:27 PM

#

I'm in marketing for example

#

For email services as in the AI sending decent written emails to your potential clients?

#

For that you would need a good model for copywriting, GLM can work and it's cheaper than local models

#

and for scraping, maybe try Qwen 3.6 35B

#

That model is insanely good for OpenClaw, or even Gemma 4 31B

midnight harness Apr 26, 2026, 10:29 PM

#

i feel smart after talking to you

flint oasis Apr 26, 2026, 10:29 PM

#

Qwen 3.6 and Gemma 4 are both local models

flint oasis Apr 26, 2026, 10:29 PM

#

midnight harness i feel smart after talking to you

hahaha

midnight harness Apr 26, 2026, 10:29 PM

#

ok so can i just install qwen 3.6 and call it a day

flint oasis Apr 26, 2026, 10:29 PM

#

I'm passionate about AI, a nerd people would say lol

midnight harness Apr 26, 2026, 10:29 PM

#

it took me 4 hours to set up claw
i think 3.6 will be more than enough

flint oasis Apr 26, 2026, 10:29 PM

#

midnight harness ok so can i just install qwen 3.6 and call it a day

basically yeah, let me search for a good trained model for ya

midnight harness Apr 26, 2026, 10:30 PM

#

oh so

flint oasis Apr 26, 2026, 10:30 PM

#

should be good for your use case

midnight harness Apr 26, 2026, 10:30 PM

#

wait

#

hahaha

#

there is just a qwen 3.6

#

and some people train it

#

to the direction they need?

flint oasis Apr 26, 2026, 10:30 PM

#

https://huggingface.co/

#

Basically yeah, you can do research on hugging face

#

on that site people are basically uploading their trained models

midnight harness Apr 26, 2026, 10:31 PM

#

that is enough information for me

flint oasis Apr 26, 2026, 10:31 PM

#

some are very good, some are very bad
