Workers stay in Initialize | Runpod

fresh panther Jan 4, 2026, 6:15 AM

#

It's been 24 hours and I tried different GPU configurations, all the 24 - 48GB, these specifically: RTX 4090, RTX 5090, L40, L40S. Workers are stuck in initializing in every data center I tried; it is the same with all of them.

fresh panther Jan 4, 2026, 6:16 AM

#

Ticket #29446

upper wraith Jan 4, 2026, 3:23 PM

#

fresh panther It's been 24 hours and I tried different GPU configurations, all the 24 - 48GB, ...

if you use another container image, will it work?

#

maybe it's trouble pulling your image

fresh panther Jan 4, 2026, 4:06 PM

#

There are no logs at all, it doesn't even get a chance to pull an image

#

I have created another endpoint and the same thing happens. Tried multiple regions/data centers as well.

upper wraith Jan 5, 2026, 4:12 AM

#

can't be the runpod servers i guess, could be your browser

#

if it were runpod side, many people would storm this server saying the same thing

fresh panther Jan 5, 2026, 5:01 AM

#

It has to be the Runpod server, it cannot be my browser. If it is my browser, I have serious questions about why using Firefox instead of Chrome would prevent a worker from starting up correctly.

#

I'll give the support ticket another 24 hours since I opened it during the weekend, but if I cannot use Runpod soon, I'll just have to move to AWS again and pay them 2x. At this point I have lost faith; I cannot use Runpod reliably to build a product for our company. I guess I'd rather pay AWS 2x and know it works consistently. Well, the Ops team will be happy that I'm helping us reach our AWS contract quota haha..

calm jungle Jan 5, 2026, 5:36 AM

#

you are not the only one, it's happening to me too. only for runpod worker templates like vLLM, not my own. there is another poster as well from a few days ago

fresh panther Jan 5, 2026, 11:31 AM

#

Closing the loop here. Support got back to me, there is a problem with the internal registry: when you create a new Serverless endpoint, it sets the vLLM image to registry.runpod.net/runpod-workers-worker-vllm-main-dockerfile:3851d53f9, which pulls from an internal Runpod registry. There seem to be some issues with it that they are aware of, so you have to use the Docker Hub image, runpod/worker-v1-vllm:v2.11.1, until they have fixed the issue.
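
For anyone checking their own endpoints, here is a minimal sketch of that workaround in Python. It only inspects the image string: if the reference points at the internal registry host and looks like the vLLM worker, it suggests the Docker Hub image named above. The helper names (`registry_host`, `suggest_image`) are made up for illustration and are not part of any Runpod tool; the host-detection rule follows the usual Docker convention and is simplified.

```python
# Hypothetical helpers for illustration; not part of any Runpod SDK or CLI.
INTERNAL_REGISTRY_HOST = "registry.runpod.net"
# Docker Hub image named in the support reply above.
DOCKERHUB_VLLM_IMAGE = "runpod/worker-v1-vllm:v2.11.1"


def registry_host(image_ref: str) -> str:
    """Return the registry host of an image reference, or 'docker.io' if none is given."""
    first, sep, _rest = image_ref.partition("/")
    # Docker-style rule (simplified): the first path component is a registry
    # host only if it contains a dot or a colon, or is exactly "localhost".
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def suggest_image(image_ref: str) -> str:
    """Suggest the Docker Hub vLLM image when the reference uses the internal registry."""
    uses_internal = registry_host(image_ref) == INTERNAL_REGISTRY_HOST
    # Assumption: only the vLLM worker image is swapped; anything else is left alone.
    if uses_internal and "worker-vllm" in image_ref:
        return DOCKERHUB_VLLM_IMAGE
    return image_ref


if __name__ == "__main__":
    current = "registry.runpod.net/runpod-workers-worker-vllm-main-dockerfile:3851d53f9"
    print(suggest_image(current))  # prints: runpod/worker-v1-vllm:v2.11.1
```

The suggested tag is the one support gave at the time of this thread, so treat it as a snapshot and not a permanent recommendation.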

modest ploverBOT Jan 5, 2026, 11:32 AM

#

fresh panther Closing the loop here. Support got back to me, there is a problem with the inter...

Thank you for marking this question as solved!

winter ridge Jan 8, 2026, 9:15 AM

#

Problem stays the same for me. At least now I get log output.

#

```
2026-01-08T09:05:59Z loading container image from cache
2026-01-08T09:08:43Z Loaded image: runpod/worker-v1-vllm:v2.11.1
2026-01-08T09:08:44Z v2.11.1 Pulling from runpod/worker-v1-vllm
2026-01-08T09:08:44Z Digest: sha256:aa576ece2d76cac59578a8ef4595719423648b7e52eaebb36df0fd2a3e5dfbda
2026-01-08T09:08:44Z Status: Image is up to date for runpod/worker-v1-vllm:v2.11.1
2026-01-08T09:12:28Z create container runpod/worker-v1-vllm:v2.11.1
2026-01-08T09:12:28Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:12:44Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:00Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:16Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:32Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:48Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:14:04Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:14:20Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:14:35Z remove container
```
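
The repeating "start container ... begin" lines are easier to read as numbers. Below is a small Python sketch that counts the start attempts in the tail of the log above and measures the gap between them. The log format (an ISO timestamp ending in `Z`, a space, then the message) is taken from the paste itself; the function names are made up for this example.

```python
from datetime import datetime

# Tail of the log pasted above, copied verbatim.
LOG = """\
2026-01-08T09:12:28Z create container runpod/worker-v1-vllm:v2.11.1
2026-01-08T09:12:28Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:12:44Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:00Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:16Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:32Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:13:48Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:14:04Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:14:20Z start container for runpod/worker-v1-vllm:v2.11.1: begin
2026-01-08T09:14:35Z remove container
"""


def parse_line(line: str) -> tuple[datetime, str]:
    """Split one log line into its timestamp and message."""
    stamp, _, message = line.partition(" ")
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ"), message


def start_attempts(log_text: str) -> list[datetime]:
    """Return the timestamps of every 'start container ... begin' line."""
    parsed = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    return [
        ts for ts, msg in parsed
        if msg.startswith("start container") and msg.endswith(": begin")
    ]


attempts = start_attempts(LOG)
# Seconds between consecutive start attempts.
gaps = [int((b - a).total_seconds()) for a, b in zip(attempts, attempts[1:])]
print(len(attempts), gaps)  # prints: 8 [16, 16, 16, 16, 16, 16, 16]
```

So the container start was attempted 8 times at a steady 16-second interval, never got past "begin", and was then removed. The log alone does not say why the start never completed.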

#

Also the worker now just stays in `unhealthy` state. 🤔 If we know, it is unhealthy, would it not make sense to remove it? 🤓

upper wraith Jan 8, 2026, 10:51 AM

#

winter ridge Also the worker now just stays in `unhealthy` state. 🤔 If we know, it is unhea...

it does.. but unhealthy here usually means it's unhealthy because of the worker's code, though it could be from the host too

usually i think they'll kind of disable that worker for your endpoint?

upper wraith Jan 8, 2026, 11:01 AM

#

winter ridge ``` 2026-01-08T09:05:59Z loading container image from cache 2026-01-08T09:08:43Z...

hmm, what are your env variables? did you check the container logs?

winter ridge Jan 8, 2026, 12:17 PM

#

did not change env variables, took everything from the vllm template. just changed the container path because runpod hub seems broken.

#

logfiles don't tell too much. 🤓

upper wraith Jan 8, 2026, 12:49 PM

#

hmm ic

#

open a support ticket then

#

https://contact.runpod.io

upper wraith Jan 8, 2026, 1:06 PM

#

winter ridge

did you use model cache?

winter ridge Jan 8, 2026, 1:07 PM

#

i just specified a model from hugging face. as i understand that should be cached. but it never started even once ... both jobs in the queue for hours.
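
One way to confirm the "jobs queued, no worker ever serving" situation from outside the console is to poll the endpoint's health. The sketch below is hedged: it assumes a health route at `https://api.runpod.ai/v2/<endpoint_id>/health` with a Bearer API key, and a JSON payload with `jobs` and `workers` counters (keys such as `inQueue`, `ready`, `running`, `idle`). None of that comes from this thread, so check it against the current Runpod docs. The function names are made up for illustration.

```python
import json
import urllib.request


def fetch_health(endpoint_id: str, api_key: str) -> dict:
    """Fetch endpoint health. Route and auth header are assumptions, not confirmed here."""
    req = urllib.request.Request(
        f"https://api.runpod.ai/v2/{endpoint_id}/health",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def looks_stuck(health: dict) -> bool:
    """True when jobs are queued but no worker is ready, running, or idle."""
    workers = health.get("workers", {})
    jobs = health.get("jobs", {})
    # Assumed counter names; missing keys are treated as zero.
    serving = sum(workers.get(key, 0) for key in ("ready", "running", "idle"))
    return jobs.get("inQueue", 0) > 0 and serving == 0


if __name__ == "__main__":
    # Hypothetical payload shaped like the situation described above:
    # two jobs waiting, one worker initializing, one unhealthy.
    sample = {
        "jobs": {"inQueue": 2, "inProgress": 0},
        "workers": {"initializing": 1, "unhealthy": 1, "ready": 0},
    }
    print(looks_stuck(sample))  # prints: True
```

`fetch_health` is kept separate so the check itself can be tried on a saved payload without touching the network.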
