#The total token limit at 131
Hi, you need to put the max output tokens setting in the right place in the input:
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "You are an AI assistant."
      },
      {
        "role": "user",
        "content": "Explain llm models"
      }
    ],
    "sampling_params": {
      "max_tokens": 3000,
      "temperature": 0.7,
      "top_p": 0.95,
      "n": 1,
      "stream": false,
      "stop": [],
      "presence_penalty": 0,
      "frequency_penalty": 0,
      "logit_bias": {},
      "best_of": 1
    }
  }
}
Something like this, feel free to modify it.
The key point is that max_tokens goes inside sampling_params.
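For anyone scripting this, here is a minimal Python sketch that builds the same request body with max_tokens nested under sampling_params. The helper name build_payload and its defaults are made up for illustration. Only the JSON shape comes from the example above, and sending it to your endpoint is left out.

```python
import json


def build_payload(prompt, max_tokens=3000, system="You are an AI assistant."):
    """Build a request body shaped like the example above (hypothetical helper)."""
    return {
        "input": {
            "messages": [
                {"role": "system", "content": system},
                {"role": "user", "content": prompt},
            ],
            # The output limit lives here, not at the top level of "input".
            "sampling_params": {
                "max_tokens": max_tokens,
                "temperature": 0.7,
                "top_p": 0.95,
            },
        }
    }


payload = build_payload("Explain llm models")
print(json.dumps(payload, indent=2))
```

Printing the payload before sending it is a cheap way to confirm where max_tokens ended up.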
Does RunPod do JSON schema validation?
Why did that invalid JSON not cause an error?
Which?
Oh, for vLLM's unknown input, right?
Hmm, yeah, interesting. Does RunPod check them?
Is this 4xx an HTTP status code?
I see, yeah, it should be that way.
4xx: essentially it's your fault
5xx: uh oh, I messed up
3xx: go somewhere else
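The same split can be sketched in a few lines of Python. The function name status_class is made up for illustration, and the labels follow the standard HTTP status classes rather than anything RunPod-specific.

```python
def status_class(code):
    """Map an HTTP status code to its standard class by its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",   # go somewhere else
        4: "client error",  # the request was wrong
        5: "server error",  # the server failed
    }
    return classes.get(code // 100, "unknown")


for code in (302, 400, 500):
    print(code, status_class(code))
```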
Actually, I checked, and I don't see any validate() calls in the vLLM worker.
That's unfortunate.
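Until something like that exists on the worker side, a client-side check can catch misplaced keys before the request goes out. This is only a sketch: the allowed key sets are copied from the example payload in this thread, not from the worker's or vLLM's actual definitions, and find_unknown_keys is a hypothetical helper.

```python
# Assumption: these sets mirror the example payload above, nothing more.
ALLOWED_INPUT_KEYS = {"messages", "sampling_params"}
ALLOWED_SAMPLING_KEYS = {
    "max_tokens", "temperature", "top_p", "n", "stream", "stop",
    "presence_penalty", "frequency_penalty", "logit_bias", "best_of",
}


def find_unknown_keys(job_input):
    """Return dotted paths of keys that are not in the allowed sets."""
    problems = []
    for key in job_input:
        if key not in ALLOWED_INPUT_KEYS:
            problems.append(f"input.{key}")
    for key in job_input.get("sampling_params", {}):
        if key not in ALLOWED_SAMPLING_KEYS:
            problems.append(f"input.sampling_params.{key}")
    return problems


# max_tokens placed at the wrong level gets flagged instead of silently ignored.
misplaced = {"messages": [], "max_tokens": 3000}
print(find_unknown_keys(misplaced))
```

If the list comes back non-empty, fix the payload before sending it instead of waiting for output that ignores the setting.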