Help on Choosing a Model for a Prompt-Processing. | Unsloth AI | Page 1

orchid plover May 9, 2025, 4:03 PM

Help on Choosing a Model for a Prompt-Processing.

vale lion May 9, 2025, 4:08 PM

that's over 1 request per second?

even with a highly optimized setup, that is not possible on a single machine; you'll need to scale horizontally (set up multiple instances running in parallel)
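
As a rough illustration of the horizontal-scaling idea, here is a minimal sketch that spreads prompts across several instances round-robin and runs them in parallel. The endpoint URLs and the `fake_send` function are hypothetical stand-ins (nothing in this thread specifies the serving stack); a real setup would replace `fake_send` with an HTTP call to each inference server.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

# Hypothetical endpoints, one per model server instance (assumption).
INSTANCES = ["http://host-a:8000", "http://host-b:8000", "http://host-c:8000"]


def fake_send(instance: str, prompt: str) -> str:
    """Stand-in for a real request to one inference server (assumption)."""
    return f"{instance} handled {prompt!r}"


def dispatch(prompts, instances=INSTANCES, send=fake_send):
    """Assign prompts to instances round-robin and process them in parallel."""
    # cycle() repeats the instance list; zip() stops when prompts run out.
    assignments = list(zip(cycle(instances), prompts))
    with ThreadPoolExecutor(max_workers=len(instances)) as pool:
        # pool.map keeps results in the same order as the input prompts.
        return list(pool.map(lambda pair: send(*pair), assignments))
```

Total throughput then grows roughly with the number of instances, as long as the dispatcher itself is not the bottleneck.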

For the validator I'd try something like Qwen3 4B with reasoning enabled first

For the analyzer you'll need to use structured generation regardless of what else you try, but I highly suggest trying to simplify the data model first
Which model you use for it probably doesn't matter as much, though
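
To make "simplify the data model" concrete, here is a minimal sketch of a flat two-field schema plus a strict check on the model's JSON output. The field names and category labels are hypothetical, since the actual data model is not shown in this thread. This covers only the schema and validation side; constrained decoding itself would come from whatever structured-generation tool you choose.

```python
import json
from dataclasses import dataclass

# Hypothetical label set; the real categories are not given in the thread.
ALLOWED_CATEGORIES = {"question", "command", "other"}


@dataclass
class Analysis:
    category: str
    summary: str


def parse_analysis(raw: str) -> Analysis:
    """Parse model output and reject anything outside the flat schema."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(data, dict) or set(data) != {"category", "summary"}:
        raise ValueError("expected exactly the keys 'category' and 'summary'")
    if not isinstance(data["category"], str) or data["category"] not in ALLOWED_CATEGORIES:
        raise ValueError("category is not one of the allowed labels")
    if not isinstance(data["summary"], str):
        raise ValueError("summary must be a string")
    return Analysis(category=data["category"], summary=data["summary"])
```

A flat schema with a small closed label set like this is easier for a small model to fill in reliably than a deeply nested one.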

For both cases, try to get it working with prompt engineering alone first; then, once you've identified the model's limitations and failure cases, fine-tune to cover them
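
One way to identify those failure cases is a small evaluation loop that records every prompt the prompted model gets wrong. This is a minimal sketch: `toy_validator` is a hypothetical stand-in for a real model call, and the test cases are made up for illustration.

```python
def collect_failures(model, cases):
    """Run each (prompt, expected) pair and return the ones the model gets wrong."""
    failures = []
    for prompt, expected in cases:
        got = model(prompt)
        if got != expected:
            failures.append({"prompt": prompt, "expected": expected, "got": got})
    return failures


def toy_validator(prompt: str) -> str:
    """Stand-in for a prompted model call (assumption): naive keyword rule."""
    return "valid" if "order" in prompt.lower() else "invalid"


# Made-up examples; replace with real labeled prompts.
CASES = [
    ("Where is my order?", "valid"),
    ("Tell me a joke", "invalid"),
    ("Cancel my subscription", "valid"),
]
```

The collected failures can later double as seed data for a fine-tuning set if prompt changes alone do not fix them.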

orchid plover May 9, 2025, 4:11 PM

vale lion that's over 1 request per second? even with a highly optimized setup, that is n...

it doesn't need to be 1 per second.
I need a model that can cover both the validator and the analyzer; I will run some tests to see whether it needs fine-tuning or not.

vale lion May 9, 2025, 4:21 PM

yes

orchid plover May 9, 2025, 4:32 PM

vale lion yes

can I have a Hugging Face link for this model?

vale lion May 9, 2025, 4:39 PM

the main one is https://huggingface.co/Qwen/Qwen3-4B but there are also a lot of quantizations (one official, a lot of unofficial ones including unsloth's) - your mileage may vary depending on which quant you pick though, some of them are poorly supported

vale lion May 9, 2025, 5:11 PM

you can try some Guard model

keep in mind that moderating for generic safety and moderating for being on topic are two completely different things though
