https://github.com/earetaurus/qwen3-tts-runpod
The code works, but the tests don't. I tried a lot of different ones, and they all come back with "context deadline exceeded".
I even tried a Hail Mary and added a toggle that makes every endpoint report 200; even that did not work.
Am I just writing my test wrong or something?
#my tests keep failing on load balancer with context deadline exceeded
This is my current test file:
{
  "tests": [
    {
      "name": "openai_compatible_tts",
      "input": {
        "path": "/v1/audio/speech",
        "method": "POST",
        "headers": {
          "content-type": "application/json"
        },
        "body": {
          "model": "qwen3-tts",
          "input": "Hello world!",
          "voice": "voice_one",
          "response_format": "mp3",
          "speed": 1.0
        }
      },
      "timeout": 300000
    }
  ],
  "config": {
    "gpuTypeId": "NVIDIA A40",
    "gpuCount": 1,
    "env": [
      { "key": "PORT", "value": "5000" },
      { "key": "VOICE_MAP_URL", "value": "https://raw.githubusercontent.com/earetaurus/qwen3-tts-runpod/main/voicemap/voicemap.json" },
      { "key": "USE_CACHED_MODEL", "value": "0" },
      { "key": "USE_NETWORK_VOLUME", "value": "0" },
      { "key": "MODEL_TYPE", "value": "base" },
      { "key": "MODEL_SIZE", "value": "1.7b" }
    ],
    "allowedCudaVersions": ["12.6", "12.7", "12.8"]
  }
}
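
To check whether the problem is the test file or the server, the same request can be replayed by hand. Below is a minimal sketch in Python that sends the request from the test file above to a container started locally. The host `http://localhost:5000` is an assumption (it only mirrors `PORT=5000` from the env block), the function names are made up for illustration, and the 300 second timeout assumes the `300000` in the test file is in milliseconds.

```python
import json
import urllib.request

# Assumption: the server is running locally and listening on PORT=5000.
BASE_URL = "http://localhost:5000"


def build_tts_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the same POST request that the test file describes."""
    body = {
        "model": "qwen3-tts",
        "input": "Hello world!",
        "voice": "voice_one",
        "response_format": "mp3",
        "speed": 1.0,
    }
    return urllib.request.Request(
        url=base_url + "/v1/audio/speech",
        data=json.dumps(body).encode("utf-8"),
        headers={"content-type": "application/json"},
        method="POST",
    )


def send_tts_request(timeout_s: float = 300.0) -> tuple[int, int]:
    """Send the request and return (HTTP status, response size in bytes).

    Assumption: 300000 in the test file means milliseconds, so 300 s here.
    """
    with urllib.request.urlopen(build_tts_request(), timeout=timeout_s) as resp:
        return resp.status, len(resp.read())
```

Calling `send_tts_request()` while the container is up should return the status code and the size of the audio response. If that works locally but the hosted test still times out, the request itself is probably not what is failing.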