# Serverless SGLang spent credits on phantom requests

steel hamlet
I deployed a serverless endpoint (id ua6ui6kfksdocn) and sent a sample request from the web dashboard. Twenty hours later, that request still seems to be in the queue.

However, looking at the logs, there are lots of requests like this:

```
2025-04-07  06:36:03.788 | info | b6nvs9f9o5kthk | [2025-04-07 04:36:03] INFO:     127.0.0.1:34746 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:58.778 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:58] INFO:     127.0.0.1:40110 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:53.769 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:53] INFO:     127.0.0.1:40094 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:48.758 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:48] INFO:     127.0.0.1:37338 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:43.748 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:43] INFO:     127.0.0.1:37322 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:38.741 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:38] INFO:     127.0.0.1:36914
```

I assume that's what kept the workers alive, spending the credits in vain.

I also assume the addresses in the log are the source addresses of the requests. Would that be some RunPod process trying to get the list of models?
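
For what it's worth, here is a minimal Python sketch for checking those two assumptions against the log lines above. `127.0.0.1` is the loopback address, so a request logged with that source came from inside the worker itself rather than from an outside client. The script parses the access-log part of each line and reports whether every source is loopback, which paths and status codes appear, and how far apart the requests are. The names `SAMPLE`, `LINE_RE`, `parse`, and `summarize` are made up for this illustration, and the regex only assumes the line layout visible in the pasted log, not any documented RunPod or SGLang format.

```python
import ipaddress
import re
from datetime import datetime

# Lines copied from the worker log above; the last one is truncated in the paste.
SAMPLE = r"""
2025-04-07  06:36:03.788 | info | b6nvs9f9o5kthk | [2025-04-07 04:36:03] INFO:     127.0.0.1:34746 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:58.778 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:58] INFO:     127.0.0.1:40110 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:53.769 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:53] INFO:     127.0.0.1:40094 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:48.758 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:48] INFO:     127.0.0.1:37338 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:43.748 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:43] INFO:     127.0.0.1:37322 - "GET /v1/models HTTP/1.1" 401 Unauthorized\n
2025-04-07  06:35:38.741 | info | b6nvs9f9o5kthk | [2025-04-07 04:35:38] INFO:     127.0.0.1:36914
"""

# Hypothetical pattern: matches only the inner access-log portion of each line.
LINE_RE = re.compile(
    r'\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+INFO:\s+'
    r'(?P<ip>[\d.]+):(?P<port>\d+) - "(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3})'
)


def parse(log_text):
    """Return one dict per complete access-log line; skip everything else."""
    rows = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m is None:
            continue  # blank, truncated, or unrelated line
        rows.append({
            "ts": datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S"),
            "ip": m["ip"],
            "path": m["path"],
            "status": int(m["status"]),
        })
    return rows


def summarize(rows):
    """Summarize request sources, paths, statuses, and spacing in seconds."""
    rows = sorted(rows, key=lambda r: r["ts"])
    gaps = [(b["ts"] - a["ts"]).total_seconds() for a, b in zip(rows, rows[1:])]
    return {
        "count": len(rows),
        "all_loopback": all(ipaddress.ip_address(r["ip"]).is_loopback for r in rows),
        "paths": sorted({r["path"] for r in rows}),
        "statuses": sorted({r["status"] for r in rows}),
        "gaps_seconds": gaps,
    }


summary = summarize(parse(SAMPLE))
print(summary)
```

If every source is loopback and the gaps are regular, that would be consistent with something inside the worker polling the server on a timer, for example a readiness check. That part is a guess from the log pattern and not something the log itself confirms.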

Any clue on how to resolve this and prevent it from happening in the future?

hasty warren BOT

To help others find answers, you can mark your question as solved via Right click solution message -> Apps -> ✅ Mark Solution

supple oar