#Stuck on "Not Ready" - Skill issue?

charred ice
#

Hi all, I'm new to the forum but love RunPod - great product. Currently trying to deploy a production service and require a bit of guidance.

I have a custom worker running FastAPI on port 80 (health port also 80). I got this by leaving both at their defaults.

I can see from the worker logs that the FastAPI app does boot. I can even see /ping being received with a 200 response code.

However, I can't ping it myself or access any custom endpoints. The services seem to be stuck on "Not Ready", and the workers reboot after a few minutes (roughly 10).

I suspect skill issue!

Details:

If you have any ideas on what it could be, I'd greatly appreciate any assistance.

Have a great day.

GitHub

A Runpod worker template for load balancing Serverless endpoints. - runpod-workers/worker-load-balancing

feral crypt
#

What exactly is showing as "Not Ready"?

charred ice
#

My apologies. The HTTP service.

#

It never turns green. The same image deployed on a Pod works fine.

feral crypt
#

Not sure if that's connected to how load balancing works, but as for how the "HTTP services" check works, I think it pings the / path.

#

How are you using the endpoint? (lb endpoint)

charred ice
#

Thank you for your help! Unfortunately, even deploying the example linked in that article led to the same behaviour for me earlier today. I suppose I should probably open a ticket. Any further thoughts are welcome. 🙏

feral crypt
#

What are you getting when you hit /generate?

charred ice
#

It just hangs. I see the request in the metrics tab on the requests history as a failed request.

feral crypt
#

Waait a second

astral aspen
#

The default port is 80; use the PORT env variable to specify a custom port.
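
A minimal sketch of the port behaviour described here: read `PORT` from the environment, fall back to 80, and answer `/ping` with a 200. It uses only the Python standard library rather than FastAPI so that it stays self-contained. `resolve_port`, `PingHandler` and `serve` are hypothetical names, and the response bodies are made up for illustration; only the `/ping` path, the `PORT` variable and the default of 80 come from this thread.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping

DEFAULT_PORT = 80  # default stated in this thread


def resolve_port(env: Mapping[str, str] = os.environ) -> int:
    """Return the port from PORT, falling back to 80 when it is unset or empty."""
    raw = env.get("PORT", "").strip()
    return int(raw) if raw else DEFAULT_PORT


class PingHandler(BaseHTTPRequestHandler):
    """Answers GET /ping with 200 and every other path with 404."""

    def do_GET(self) -> None:
        ok = self.path == "/ping"
        self.send_response(200 if ok else 404)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        # Illustrative bodies only; the thread does not say what /ping should return.
        self.wfile.write(b'{"status": "ok"}' if ok else b'{"error": "not found"}')


def serve() -> None:
    """Bind on all interfaces so requests from outside the container can reach it."""
    HTTPServer(("0.0.0.0", resolve_port()), PingHandler).serve_forever()
```

Calling `serve()` with `PORT=5000` in the environment would listen on 5000; with no `PORT` set it would try to bind port 80.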

feral crypt
#

Wait, what? How come it worked for me with env port=5000 just now?
Oh, it's supposed to be 5000, right?

#

But it takes a long time

astral aspen
#

What's the PORT env variable set to?

charred ice
# astral aspen whats the PORT env variable?

Good afternoon 🙏.

Regardless of which PORT value we set, we unfortunately see no improvement. Is it possible it HAS to be port 8888? We haven't tried that specific one yet. The /ping requests are, however, being logged by FastAPI.

astral aspen
#

Send me your endpoint ID and I'll check the logs.

astral aspen
#

Just an update: the port is 80 by default. Ignore the 8888 part.

astral aspen
#

that should work then

feral crypt
#

menuw8uvzgfi2z

#

Yeah, is it normal for it to be this slow?

#

It takes 20+ seconds to get a response.

astral aspen
#

How much slower is it compared to a queue-based endpoint?

feral crypt
#

can you see my previous requests?

#

it was just a test endpoint and test image

astral aspen
#

You're mainly talking about cold starts being slower, right?

feral crypt
#

Maybe it's cold start, but there was no model loading in my LB endpoint.

astral aspen
#

What does it do? Just run the HTTP server?

feral crypt
#

Yes, it's the example repo.

astral aspen
#

Interesting, I'll raise this. That's not ideal at all.

feral crypt
#

Hmm, 1 min and the request is still running.

#

But the worker isn't running.

astral aspen
#

@sudden egret we need to raise this and also test whether our examples are this slow.

sudden egret
#

Looking into it, mostly just curious why Jason's took so long to wake up.

sudden egret
#

These RT endpoints don't have the same logging for me, but I can imagine that part of the delay many new Load Balancer users will see comes from downloading a new Docker image onto every server. I'll test the template we publish.

astral aspen
#

An update to the original issue:
Ignore the worker's "port 80" services blob saying it's not ready. We plan to remove it, since it doesn't apply and creates extra confusion.
To get the proper URL for LB endpoints, click Quick Start in the Overview tab; it will give you the details.
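
For reference, a small sketch of what a request to a load-balancing endpoint looks like in Python. The URL shape `https://<endpoint_id>.api.runpod.ai/<path>` and the `Authorization: Bearer ...` header are inferred from the curl commands pasted in this thread, not from official docs, so treat the Quick Start tab as the authoritative source. `build_lb_request` is a hypothetical helper, and `abc123` and `KEY` in the usage note are placeholders.

```python
import json
from urllib.request import Request


def build_lb_request(endpoint_id: str, path: str, payload: dict, api_key: str) -> Request:
    """Build (but do not send) a JSON POST to a load-balancing endpoint.

    Assumption: the URL pattern matches the curl examples in this thread.
    """
    url = f"https://{endpoint_id}.api.runpod.ai/{path.lstrip('/')}"
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Something like `urllib.request.urlopen(build_lb_request("abc123", "/generate", {"prompt": "Hello, world!"}, "KEY"), timeout=150)` would then send it; the generous timeout is a guess, chosen because the proxy may hold the request while it looks for a worker.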

sudden egret
#

Every once in a while it takes an awkward amount of time to do... nothing.

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #23)"} 0.409 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #24)"} 5.376 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #25)"} 3.662 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #26)"} 0.310 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #27)"} 0.390 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #28)"} 0.295 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #29)"} 0.307 total

❯ time curl -X POST "https://zvshz29pnyzitk.api.runpod.ai/generate" -H "Authorization: Bearer $RUNPOD_API_KEY" -H "Content-Type: application/json" -d '{"prompt": "Hello, world!"}'
{"generated_text":"Response to: Hello, world! (request #30)"} 0.308 total
#

But about 307ms is the typical response time.
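
A quick sanity check on the eight timings pasted above, as a sketch. The occasional 3-5 second responses pull the arithmetic mean up to roughly 1.38 s, while the median stays at 0.35 s, so the ~0.3 s figure describes the typical (warm) request rather than the mean. `summarize` and the 1-second "slow" threshold are illustrative choices, not anything from RunPod.

```python
from statistics import mean, median

# "total" seconds reported by the eight curl runs above (requests #23 to #30)
timings = [0.409, 5.376, 3.662, 0.310, 0.390, 0.295, 0.307, 0.308]


def summarize(samples: list[float], slow_threshold: float = 1.0) -> dict:
    """Return mean, median, and how many samples exceed the (arbitrary) slow threshold."""
    return {
        "mean": round(mean(samples), 3),
        "median": round(median(samples), 3),
        "slow_count": sum(1 for s in samples if s > slow_threshold),
    }


summary = summarize(timings)
print(summary)
```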

#

If I spam it, I can get pretty consistently low request times, but if I pause to edit that output or write a message here, the same single worker consistently goes back to 3-5 second response times.

#

For example, writing that then running immediately again - 9 seconds

#

Logs and the incrementing request number show it's all the same one worker.

#

And our example is broken: there is nothing running on port 5001.

astral aspen
#

That 3-5s delay: does it occur because the worker is no longer running and has gone idle?

sudden egret
#

Kind of hard to tell as an end user but it seems like it

feral crypt
#

so the long delay to respond is because of my network connection

feral crypt
#
root@ubuntu-s-1vcpu-2gb-sgp1-01:~# time curl --location 'https://menuw8uvzgfi2z.api.runpod.ai/generate' --header 'Content-Type: application/json' --header 'Authorization: Bearer well_my_api_key' --data '{"prompt": "Hello, world!"}'
{"error":"no workers available"}
real    0m39.378s
user    0m0.060s
sys     0m0.015s
#

I'm using a DigitalOcean VPS.

astral aspen
#

Listen for that type of status code and retry when it occurs. It means the proxy waited 2 mins to find a worker, but none has scaled up yet.
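
A rough sketch of the retry idea, under some assumptions. The thread doesn't say which HTTP status code accompanies this error, so the sketch matches on the `{"error":"no workers available"}` body shown above instead. `call_with_retry` and `is_no_workers` are hypothetical helpers, `send` is any callable you supply that performs the request and returns the response body as a string, and the exponential backoff values are arbitrary.

```python
import json
import time
from typing import Callable

NO_WORKERS_ERROR = "no workers available"  # error text from the response shown above


def is_no_workers(body: str) -> bool:
    """True if the body is the JSON error returned when no worker has scaled up."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("error") == NO_WORKERS_ERROR


def call_with_retry(
    send: Callable[[], str],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call send() until it returns something other than the no-workers error."""
    for attempt in range(max_attempts):
        body = send()
        if not is_no_workers(body):
            return body
        if attempt < max_attempts - 1:
            # Back off 1x, 2x, 4x ... base_delay between attempts.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"still no workers available after {max_attempts} attempts")
```

Since each failed attempt may already have waited a long time at the proxy, a small `max_attempts` is probably enough in practice.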