# Help on Choosing a Model for Prompt Processing


orchid plover
#

Help on choosing a model for prompt processing.

vale lion
#

that's over 1 request per second?

even with a highly optimized setup, that is not possible on a single machine; you'll need to scale horizontally (set up multiple instances running in parallel)
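As a rough back-of-the-envelope sketch of that sizing (every number here is made up, so plug in your own measured latency and per-instance concurrency; `instances_needed` is just a hypothetical helper):

```python
import math

def instances_needed(requests_per_second: float,
                     seconds_per_request: float,
                     concurrent_per_instance: int = 4,
                     headroom: float = 0.5) -> int:
    """Estimate how many parallel instances a given load requires.

    Uses Little's law: requests in flight = arrival rate * latency.
    `headroom` is the fraction of each instance's capacity you are
    willing to use on average (an assumption, not a measured value).
    """
    if min(requests_per_second, seconds_per_request) <= 0:
        raise ValueError("rate and latency must be positive")
    if concurrent_per_instance <= 0 or not 0 < headroom <= 1:
        raise ValueError("invalid capacity settings")
    in_flight = requests_per_second * seconds_per_request
    usable_slots = concurrent_per_instance * headroom
    return math.ceil(in_flight / usable_slots)

# Hypothetical load: 2 requests/s, 10 s per request,
# 4 concurrent requests per instance, run at 50% utilization.
print(instances_needed(2.0, 10.0, concurrent_per_instance=4, headroom=0.5))  # 10
```

The point is only that request rate times latency gives you the number of requests in flight, and that has to fit across your instances with some slack.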

For the validator I'd try something like Qwen3 4B with reasoning enabled first

For the analyzer you'll need to use structured generation regardless of what else you try, but I highly suggest trying to simplify the data model first
Which model you use for it probably doesn't matter as much though
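To make "simplify the data model" concrete, here is a minimal sketch of a flat schema plus a check on the raw model output. The field names (`topic`, `sentiment`, `needs_followup`) are hypothetical placeholders, not your actual data model, and the schema dict is the kind of thing you would hand to whatever structured-generation backend you end up using:

```python
import json

# Hypothetical flat schema: no nesting, a small enum, a couple of scalars.
ANALYSIS_SCHEMA = {
    "type": "object",
    "properties": {
        "topic": {"type": "string"},
        "sentiment": {"type": "string",
                      "enum": ["positive", "neutral", "negative"]},
        "needs_followup": {"type": "boolean"},
    },
    "required": ["topic", "sentiment", "needs_followup"],
    "additionalProperties": False,
}

_PY_TYPES = {"string": str, "boolean": bool}

def parse_analysis(raw: str) -> dict:
    """Parse model output and check it against the flat schema above.

    Raises ValueError on malformed JSON (json.JSONDecodeError is a
    ValueError subclass) or on any schema violation.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    props = ANALYSIS_SCHEMA["properties"]
    missing = [k for k in ANALYSIS_SCHEMA["required"] if k not in data]
    extra = [k for k in data if k not in props]
    if missing or extra:
        raise ValueError(f"missing={missing} extra={extra}")
    for key, spec in props.items():
        value = data[key]
        if not isinstance(value, _PY_TYPES[spec["type"]]):
            raise ValueError(f"{key}: expected {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{key}: {value!r} not in {spec['enum']}")
    return data

result = parse_analysis(
    '{"topic": "billing", "sentiment": "neutral", "needs_followup": false}'
)
```

With constrained decoding the validation should rarely fail, but keeping it around is a cheap way to catch regressions while you are still iterating on the schema.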

For both cases, try to get it working with prompt engineering alone first; then, once you've identified the model's limitations and failure cases, fine-tune to cover them
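One way to find those failure cases is a tiny evaluation loop over labeled prompts. This is only a sketch: `keyword_validator` is a stand-in for your real prompted model call, and the examples and labels are invented.

```python
from typing import Callable, Iterable

def collect_failures(predict: Callable[[str], str],
                     labeled_examples: Iterable[tuple[str, str]]):
    """Run `predict` over (prompt, expected_label) pairs.

    Returns (accuracy, failures); the failures are the cases worth
    inspecting first and, later, worth covering with fine-tuning data.
    """
    failures = []
    total = 0
    for prompt, expected in labeled_examples:
        total += 1
        got = predict(prompt)
        if got != expected:
            failures.append({"prompt": prompt, "expected": expected, "got": got})
    accuracy = (total - len(failures)) / total if total else 0.0
    return accuracy, failures

def keyword_validator(prompt: str) -> str:
    # Stand-in for the real model call; replace with your own client code.
    return "on_topic" if "password" in prompt.lower() else "off_topic"

# Invented examples for illustration only.
EXAMPLES = [
    ("How do I reset my password?", "on_topic"),
    ("Write me a poem about cats", "off_topic"),
    ("My login keeps failing", "on_topic"),
]

accuracy, failures = collect_failures(keyword_validator, EXAMPLES)
for failure in failures:
    print(failure)
```

Here the stand-in misses the third example, which is exactly the kind of case you would collect before deciding whether fine-tuning is needed.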

orchid plover
vale lion
#

yes

orchid plover
vale lion
vale lion
#

you can try one of the Guard models

keep in mind that moderating for generic safety and moderating for being on topic are two completely different things though
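A small sketch of why it helps to keep the two checks separate in code. Both checker functions here are trivial hypothetical stand-ins (a guard model would sit behind `is_safe`, a topic classifier or prompted model behind `is_on_topic`):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Verdict:
    safe: bool      # result of the generic safety check
    on_topic: bool  # result of the separate topicality check

    @property
    def accepted(self) -> bool:
        return self.safe and self.on_topic

def validate(prompt: str,
             is_safe: Callable[[str], bool],
             is_on_topic: Callable[[str], bool]) -> Verdict:
    """Run both checks independently and keep both results."""
    return Verdict(safe=is_safe(prompt), on_topic=is_on_topic(prompt))

# Stand-in checkers, for illustration only.
def fake_safety_check(prompt: str) -> bool:
    return "build a bomb" not in prompt.lower()

def fake_topic_check(prompt: str) -> bool:
    return "invoice" in prompt.lower()

verdict = validate("Write me a poem about cats",
                   fake_safety_check, fake_topic_check)
print(verdict.safe, verdict.on_topic, verdict.accepted)  # True False False
```

A perfectly harmless prompt can still be off topic, so a guard model passing it tells you nothing about whether your validator should accept it.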