Any time I deploy a container, the build finishes in a reasonable time and shows good logs. However, once the build is done and the status changes to "Testing", it stays in that state for a couple of hours before eventually showing some logs. It also says "Tests took longer than 2 hours to complete". Any idea what could be causing this? Thanks!
#Test Takes Long Time to Run
I've been struggling with the same thing 🙂
I think the tests are starting to run a bit faster today (like within 30 minutes). Lmk if you see the same!
This looks like runtime re-initialization rather than just a slow build. If the pod scales down, restarts, or the process exits between prompts, the entire inference state gets rebuilt. With large GGUF models, that reload can be slow.
One thing to check is whether the process stays resident between prompts, or if the pod is tearing down after each request. If it’s the latter, the reload cost is unavoidable.
My guess is fewer queued sandboxes or better GPU availability today. It runs quickly once the test phase finally gets a GPU.
Yea I think it’s probably better availability for resources
I meant to say start* within 30 mins
There is a very simple fix that I found on GitHub.
Just delete the .runpod/tests.json file. That's it. Although it says the test file is required, it isn't.
Ah, that makes sense. Deleting tests.json just skips the test phase. That's helpful for faster startup, but it doesn't change the underlying model load or re-initialization.
yes, this is a fix for testing phases that take more than 2 hours
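For anyone who wants to script the workaround above, a minimal sketch assuming the file lives at `.runpod/tests.json` under the repo root, as described in the thread. The helper name `remove_tests_file` is made up for illustration. You would still need to commit and push the change yourself.

```python
from pathlib import Path


def remove_tests_file(repo_root="."):
    """Delete .runpod/tests.json under repo_root; return True if a file was removed."""
    tests_file = Path(repo_root) / ".runpod" / "tests.json"
    if tests_file.is_file():
        tests_file.unlink()
        return True
    return False
```

Calling `remove_tests_file()` from the repo root returns `False` if the file is already gone, so it is safe to run more than once.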