#Can't deploy Qwen/Qwen2.5-14B-Instruct-1M on serverless


remote sundial

Steps to reproduce:

  1. Use Serverless vLLM quick deploy for Qwen/Qwen2.5-14B-Instruct-1M (image attached)
  2. Proceed with default config.
  3. Try to send a request.

Error:

2025-06-18T12:58:36.147823280Z INFO 06-18 12:58:36 [model_runner.py:1170] Starting to load model Qwen/Qwen2.5-14B-Instruct-1M...
2025-06-18T12:58:36.449947523Z engine.py:116  2025-06-18 12:58:36,449 Error initializing vLLM engine: FlashAttentionImpl.__init__() got an unexpected keyword argument 'layer_idx'

How do I fix this?

I've been trying to troubleshoot this all morning. All help appreciated 🙏

brazen slate

Probably it's not supported on RunPod's vLLM yet.

remote sundial
brazen slate

I've given the value there, try testing it again with the envs.
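
For anyone hitting the same log line: the message is an ordinary Python `TypeError`, raised when a caller passes a keyword argument that the constructor of the installed class does not declare. That would be consistent with the model's code path expecting a newer or different attention implementation than the one in the image, though that is an assumption and not something confirmed in this thread. A minimal sketch of the failure mode, where `FlashAttentionImpl` is a hypothetical stand-in and not vLLM's real class, and `try_build` and `accepts_kwarg` are helper names made up for illustration:

```python
import inspect


class FlashAttentionImpl:
    """Hypothetical stand-in for an attention backend; NOT vLLM's real class."""

    def __init__(self, num_heads: int, head_size: int) -> None:
        self.num_heads = num_heads
        self.head_size = head_size


def try_build(**kwargs):
    """Try to construct the backend; return the error text if it fails."""
    try:
        FlashAttentionImpl(**kwargs)
        return None
    except TypeError as exc:
        return str(exc)


def accepts_kwarg(cls, name: str) -> bool:
    """Check whether cls.__init__ declares `name` or a catch-all **kwargs."""
    params = inspect.signature(cls.__init__).parameters
    has_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    return name in params or has_var_kw


# The caller (assumed here to be model code written for a different backend
# signature) passes layer_idx, which this constructor does not declare.
error_message = try_build(num_heads=8, head_size=64, layer_idx=0)
print(error_message)
print(accepts_kwarg(FlashAttentionImpl, "layer_idx"))
```

The same `inspect.signature` check can be pointed at a real class in an installed package to see whether it declares a given keyword before sending requests.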