# Support vLLM embeddings (Qwen3)

sweet coral
The default behaviour of vLLM on serverless is --task generate (for normal completions), but it would be awesome if we could use embeddings via --task embed.
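
For context, here is a rough sketch of what the client side could look like once a worker runs with `--task embed`. It assumes the server exposes vLLM's OpenAI-compatible `/v1/embeddings` route; the base URL, the model name `Qwen/Qwen3-Embedding-0.6B`, and the helper names are placeholders for illustration, not anything the serverless worker is confirmed to provide. The request is only built here, not sent.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, texts, api_key=None):
    """Build (without sending) a POST request for an OpenAI-compatible
    /v1/embeddings route. base_url and model are assumptions."""
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/embeddings",
        data=body,
        headers=headers,
        method="POST",
    )


def parse_embeddings(response_json):
    """Return the embedding vectors ordered by their 'index' field,
    following the OpenAI-style response layout."""
    items = sorted(response_json["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


# Hypothetical local server, e.g. started with:
#   vllm serve Qwen/Qwen3-Embedding-0.6B --task embed
req = build_embedding_request(
    "http://localhost:8000",
    "Qwen/Qwen3-Embedding-0.6B",
    ["hello world", "serverless embeddings"],
)
```

Sending it would be `urllib.request.urlopen(req)` followed by `parse_embeddings(json.load(resp))`. How a serverless endpoint would route this call is exactly the open question in this thread.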

Relevant issue: https://github.com/runpod-workers/worker-vllm/issues/170

GitHub

Hello! We are trying to use serverless for embedders like BAAI/bge-multilingual-gemma2 that work very well on your pods. For this embedder, one has to pass explicitly the argument --task embed for ...

naive mauve BOT
wanton lava
KoboldCpp can generate embeddings if you want