#Is there a limit on the number of threads?

short stream
#

I have pods with different numbers of vCPUs, and I am running vLLM. If I start too many vLLM instances in parallel, I get errors like "can't create thread". Is there a parameter that limits the number of threads per pod?

short stream
#
OpenBLAS blas_thread_init: pthread_create failed for thread 57 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC -1 current, -1 max
OpenBLAS blas_thread_init: pthread_create failed for thread 58 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC -1 current, -1 max
OpenBLAS blas_thread_init: pthread_create failed for thread 59 of 64: Resource temporarily unavailable
OpenBLAS blas_thread_init: RLIMIT_NPROC -1 current, -1 max
OpenBLAS blas_thread_init: pthread_create failed for thread 60 of 64: Resource temporarily unavailable

This is the error I get
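
The log above reports `RLIMIT_NPROC -1 current, -1 max`, i.e. the per-user process limit is unlimited, so the cap is probably coming from somewhere else. A minimal sketch for inspecting the other usual suspects from inside the pod, assuming a Linux container; which cgroup file exists depends on whether the host uses cgroup v2 or v1, and `thread_limits` and `read_first` are hypothetical helper names:

```python
import resource
from pathlib import Path

# Candidate locations; the first is the cgroup v2 layout, the second cgroup v1.
# Which one (if any) is visible inside a given pod is an assumption.
PIDS_MAX = [Path("/sys/fs/cgroup/pids.max"), Path("/sys/fs/cgroup/pids/pids.max")]
PIDS_CURRENT = [Path("/sys/fs/cgroup/pids.current"), Path("/sys/fs/cgroup/pids/pids.current")]
KERNEL_THREADS_MAX = [Path("/proc/sys/kernel/threads-max")]
KERNEL_PID_MAX = [Path("/proc/sys/kernel/pid_max")]


def read_first(paths):
    """Return the stripped contents of the first readable file, or None."""
    for path in paths:
        try:
            return path.read_text().strip()
        except OSError:
            continue
    return None


def thread_limits():
    """Collect the limits that can make pthread_create fail with EAGAIN."""
    return {
        # (soft, hard); -1 means unlimited, matching the OpenBLAS log line.
        "rlimit_nproc": resource.getrlimit(resource.RLIMIT_NPROC),
        # Threads count against the cgroup pids controller, not just processes.
        "cgroup_pids_max": read_first(PIDS_MAX),
        "cgroup_pids_current": read_first(PIDS_CURRENT),
        "kernel_threads_max": read_first(KERNEL_THREADS_MAX),
        "kernel_pid_max": read_first(KERNEL_PID_MAX),
    }


if __name__ == "__main__":
    for name, value in thread_limits().items():
        print(f"{name}: {value}")
```

If `cgroup_pids_current` is close to `cgroup_pids_max` when the error appears, the pod's PID limit is the likely cause; a value of `max` means that controller is not limiting anything.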

rare tree
#

Yes, there is a limit. I faced the same issue.

#

I think I hit this problem at 1024 concurrent processes (meaning threads). You can always test it by thread swarming on the pod.

short stream
#

Thanks. Yeah, the problem is that I get it with just 20 vLLM instances in parallel.

What do you mean by thread swarming? Should I just spin off a number of threads to see what the limit is?
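
One way to read "thread swarming" is exactly that: start idle threads until the OS refuses. A minimal sketch, assuming CPython on Linux, where a failed thread start surfaces as `RuntimeError`; `count_startable_threads` is a hypothetical helper and `max_threads` is a safety cap so the probe cannot run away:

```python
import threading


def count_startable_threads(max_threads=2000):
    """Start idle threads until creation fails or max_threads is reached.

    Returns how many threads were started successfully.
    """
    stop = threading.Event()
    started = []
    try:
        for _ in range(max_threads):
            # Each thread just blocks on the event, so it costs no CPU.
            thread = threading.Thread(target=stop.wait, daemon=True)
            try:
                thread.start()
            except RuntimeError:
                # CPython raises RuntimeError ("can't start new thread")
                # when the underlying pthread_create call fails.
                break
            started.append(thread)
    finally:
        # Release and reap every thread so the probe leaves nothing behind.
        stop.set()
        for thread in started:
            thread.join()
    return len(started)


if __name__ == "__main__":
    print("threads started before failure or cap:", count_startable_threads())
```

If the result equals `max_threads`, the cap was reached before any limit; raise it gradually. Running this while the vLLM instances are up shows how much headroom is left rather than the absolute limit.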

dreamy vector
#

20 vLLM in parallel? Is it 20 jobs, or what?

dreamy vector
#

Oh, how did you run it?

short stream
#

With vllm serve

short stream
#

Yes, on some pods I can run about 19 in parallel and on others about 28. It is related to the number of vCPUs, but I don't understand how.

#

I would be happy to run, say, 64 in parallel. At some point I hit RAM and VRAM limits, but that is OK.

I don't understand why I am hitting thread limits when there is still RAM and VRAM available.
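
A possible link to the vCPU count: the log shows OpenBLAS trying to start 64 threads ("thread 57 of 64"), and OpenBLAS typically sizes its pool from the number of CPUs it detects. If every instance does the same, the thread count grows as instances times cores, regardless of free RAM or VRAM. A minimal sketch of that arithmetic and of capping the pool through environment variables, assuming each `vllm serve` instance loads one such pool; `capped_env` and `blas_threads_total` are hypothetical helpers, and the commented launch line uses a placeholder model name:

```python
import os

# Environment variables read by OpenBLAS, OpenMP runtimes, and MKL.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def blas_threads_total(instances, threads_per_instance):
    """Threads consumed by the BLAS pools alone, ignoring all other threads."""
    return instances * threads_per_instance


def capped_env(threads_per_instance, base_env=None):
    """Return a copy of the environment with the BLAS/OpenMP pools capped."""
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_ENV_VARS:
        env[name] = str(threads_per_instance)
    return env


if __name__ == "__main__":
    # 20 instances with a 64-thread pool each, as in the log: 1280 threads.
    print("uncapped:", blas_threads_total(20, 64))
    # The same 20 instances with the pool capped at 4 threads: 80 threads.
    print("capped:  ", blas_threads_total(20, 4))
    # Launching an instance with the cap applied (placeholder model name):
    # import subprocess
    # subprocess.Popen(["vllm", "serve", "<model>"], env=capped_env(4))
```

This only bounds the BLAS and OpenMP pools; vLLM and its dependencies start other threads as well, so the real total is higher than this estimate.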

short stream
#

Yeah definitely.

But it should still multithread via time sharing. E.g., even with 1 vCPU I should be able to get 10 threads. But here it seems that I can get at most 2 threads per vCPU.

dreamy vector
#

Yeah, I'm not sure about this. Maybe you should check with staff.