#Maximum queue size

void mango
#

Hi, is there a limit for maximum pending jobs in the queue with serverless endpoints or are there any other queue size limitations?

buoyant meadow
#

I don't think there's any

red hound
#

There is a limit actually

#

100 * max workers

#

#🎤|general message

buoyant meadow
#

Oohh

void mango
#

thx!

pliant veldt
# red hound 100 * max workers

Ok, max workers = 100. Can I have 1000 API connections in the IN_QUEUE status? It's an exaggerated example, but I would like to know how many IN_QUEUE endpoint calls can actually be waiting. Say I have a serverless endpoint that receives 1000 calls, but active workers is set to 0, max workers is set to 1, and the process takes ~5 minutes. Can it process all 1000 calls, one at a time, while the remaining calls remain in the QUEUE?

red hound
#

Read the link to flash-singh's message above, TTL also comes into play.
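To see why TTL matters for the scenario asked about above (1000 calls, 1 max worker, about 5 minutes per job), here is a rough back-of-the-envelope sketch. The function name `drain_estimate` is made up for illustration. It ignores cold starts and any queue size limit, and it assumes every job takes the same time.

```python
import math
from typing import Tuple


def drain_estimate(jobs: int, workers: int, minutes_per_job: float) -> Tuple[float, float]:
    """Return (minutes until the queue is empty, minutes the last job waits before it starts).

    Simplified model: identical job durations, no cold start, no queue limit.
    """
    if jobs <= 0 or workers <= 0:
        raise ValueError("jobs and workers must be positive")
    # Jobs are processed in "rounds" of at most `workers` jobs at a time.
    rounds = math.ceil(jobs / workers)
    total_minutes = rounds * minutes_per_job
    last_job_wait = (rounds - 1) * minutes_per_job
    return total_minutes, last_job_wait


total, last_wait = drain_estimate(jobs=1000, workers=1, minutes_per_job=5)
print(f"queue empties after {total} min; the last job waits {last_wait} min before starting")
```

With one worker, the last request sits in the queue for several days' worth of minutes, so whatever TTL applies to the request needs to cover that wait as well as the run itself.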

pliant veldt
#

Ah so it all comes down to 'Idle Timeout' setting? What is the largest value the 'Idle Timeout' can use? 86,400 seconds would be 24 hours.

red hound
#

No it doesn't

#

If you have only a single max worker, you can only have 100 requests in the queue

#

simple mathematics

pliant veldt
#

OK so with 1 active worker I can have a maximum of 99 IN_QUEUE connections waiting?

red hound
#

You should never have only 1 max worker under any circumstances anyway

pliant veldt
#

with 1 max worker I don't spend any $ until my endpoint is being used.

red hound
#

I am not sure how the formula works with active workers, I assume the formula actually counts all workers not just max workers

#

@crimson canyon would need to confirm though because I am just guessing, and @golden trench should probably also document this info somewhere.

pliant veldt
#

Ok thanks for the info you provided. Hopefully one of the people you have tagged will chime in.

red hound
#

Probably need them both to chime in because this question about how many max requests can be in queue has come up more than once so will be great if it can be documented.

pliant veldt
#

I would hope that items can remain in the QUEUE for some time. They are all tiny amounts of information (JSON). Not sure why they would want to time them out.

buoyant meadow
#

Wdym

red hound
#

It's already been made clear that they don't, and it depends on your workers

#

Why would you want thousands of requests in the queue, that makes no sense

#

Usually when you have a large number of requests in the queue, you need to add more max workers to process the requests, or else you have some issue with your handler

buoyant meadow
#

Yeap, and if your request takes a long time to process you can use pods btw

red hound
#

Even if it takes long, you can still use serverless, just make sure you don't try and use 1 max worker

#

I want to rip my eyes out when people have 1 max worker and complain that things are not working as expected

buoyant meadow
#

Chill

red hound
#

I honestly don't get it, it's not like you pay for them like active workers, so why set it to 1

#

And if you set it higher than 1, RunPod also gives you FREE "additional workers" to help with throttling

#

So there is absolutely no reason whatsoever to ever set it to 1. I don't even set it to 1 for debugging.

pliant veldt
# red hound It's already been made clear that they don't, and it depends on your workers

I agree completely and I would do exactly that once I start getting users but before that I would like to save $ and not have active workers running when I have 0 users. But to understand what is possible in that initial phase I would need to know how many I can have in QUEUE. I would NEVER let there be 1000 in QUEUE but I could imagine a time where there is on average 10 or so in QUEUE.

red hound
#

I use network storage and sometimes there are networking issues and weird other incidents in those data centers.

pliant veldt
#

With 0 active workers and 1 max worker, nothing runs until the endpoint is hit. Once that happens, a new serverless worker is spun up and responds. After that it goes back to idle. If I had 1 active worker, I would be charged for that worker sitting there waiting for requests, right?

red hound
#

Please don't ever set max workers to 1.

#

I have been trying to make this clear in all my messages above, just DON'T do it.

pliant veldt
#

Other than wanting me to pay more what is the reason?

buoyant meadow
#

Yeah bro just use more max workers, if you want you can set the scale type to the other one ( not queue delay )

red hound
#

Pay what more?

buoyant meadow
#

It doesn't make you pay more

red hound
#

Max workers are free

#

RunPod sets the default to 3 for a reason

buoyant meadow
#

Serverless only charges for your running time

red hound
#

They actually shouldn't allow you to set it less than 3 in my opinion.

pliant veldt
#

Max workers, yes! I see. Could have 0 active workers and max of 3. That makes more sense for prod. That way I can handle the connections and again only being charged when used.

buoyant meadow
#

Yeah..

red hound
#

Exactly, like I said, I have 0 active workers and 30 max workers in production.

#

I don't pay a cent for active workers, I only pay when my max workers kick in and handle requests.

pliant veldt
#

Yes, I misunderstood what you were saying. Right now it is just me setting up/testing so I never need more than 1 but 30 sounds good for max workers in production.

red hound
#

3 workers is also very few, if your app goes viral or something, you will have issues.

pliant veldt
#

That really sheds light on the subject. With max workers set higher it really removes the concern, for me at least, about how many items can remain in QUEUE.

buoyant meadow
#

Btw if you wanna make sure, just add some extra storage that keeps the job information

dusky burrow
#

It sounds like the confusion is over terms:
"Running" vs "Idle" -> A worker only costs while it is Running.
"Active" vs "Max" -> An active worker is "Always on shift" and so effectively always Running, but costs 40% less. Max workers are how many total might possibly be brought in to work. Max minus Active = Temp Workers, and they also are not costing anything unless they are Running.
When there is nothing to process - no queue at all - there is no worker Running, so no cost for the worker(s).
When the queue has ANYTHING in it, a worker will run - and cost money - to process the next thing in queue, up to the max number of workers.
If you intend to have a non-empty queue at all times, you should have enough "Active" workers to handle the normal load of the queue and cost the least. Then bigger loads will pull in "Temp workers" up to the Max count to handle the queue faster until it goes down.
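The billing model described above can be sketched roughly like this. The per-second rate is a made-up placeholder, the function name `estimate_cost` is hypothetical, and the 40% discount is the figure quoted in this message rather than a confirmed price.

```python
def estimate_cost(
    active_workers: int,
    window_seconds: float,
    temp_running_seconds: float,
    rate_per_second: float,
    active_discount: float = 0.40,  # "costs 40% less", as stated in the thread
) -> float:
    """Estimate the cost of an endpoint over a time window.

    Active workers are treated as Running for the whole window at a discounted rate.
    Temp workers (Max minus Active) only cost money for the seconds they are Running.
    """
    active_cost = active_workers * window_seconds * rate_per_second * (1 - active_discount)
    temp_cost = temp_running_seconds * rate_per_second
    return active_cost + temp_cost


RATE = 0.001  # hypothetical price per worker-second, for illustration only

# 0 active workers, empty queue for an hour: nothing runs, nothing is billed.
print(estimate_cost(0, 3600, 0, RATE))
# 0 active workers, temp workers ran for 10 minutes in total.
print(estimate_cost(0, 3600, 600, RATE))
# 1 active worker for the same hour, no temp workers needed.
print(estimate_cost(1, 3600, 0, RATE))
```

In this model, raising max workers on its own never changes the result; only running seconds and the active worker count do.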

fast sluice
#

Hi, sorry to bring up this thread again but... I see in the documentation that the max queue size is defined as:
Queue size exceeds 50 jobs AND
Queue size exceeds endpoint.WorkersMax * 500.

How can it be both 50 and 500x?
Is 50 just a safety measure in case WorkersMax is set to 0?

https://docs.runpod.io/serverless/references/operations#queue-limits

buoyant meadow
#

I guess, what problem do you experience with this?

jovial reef
# fast sluice Is 50 just a safety measure in case WorkersMax is set to 0?

Correct
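Put together, the two conditions quoted from the docs mean the effective limit is whichever of the two numbers is larger. A minimal sketch, assuming new requests are refused only once both conditions hold (the function names are made up for illustration, not part of any RunPod SDK):

```python
def max_queue_size(workers_max: int, per_worker: int = 500, floor: int = 50) -> int:
    """Largest queue size that does not trip the limit.

    The limit applies only when the queue exceeds `floor` AND exceeds
    `workers_max * per_worker`, so the effective cap is the larger of the two.
    """
    return max(floor, workers_max * per_worker)


def exceeds_queue_limit(queue_size: int, workers_max: int) -> bool:
    """True when both documented conditions are met."""
    return queue_size > 50 and queue_size > workers_max * 500


for workers_max in (0, 1, 3, 30):
    print(workers_max, max_queue_size(workers_max))
```

With WorkersMax set to 0 the second condition is always true, so the 50-job floor is the only thing that applies, which matches the answer above. The `per_worker` parameter is left adjustable because an earlier message in this thread quoted 100 per worker rather than 500.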