The default behaviour of vLLM on serverless is --task generate (for normal completions), but it would be awesome if we could use embeddings via --task embed.
Relevant issue: https://github.com/runpod-workers/worker-vllm/issues/170
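For context, here is a rough sketch of what calling embeddings could look like from the client side. It assumes a plain vLLM server started with --task embed and reachable over its OpenAI-compatible /v1/embeddings route. It does not go through the RunPod serverless worker, which is exactly the part the issue asks for. The base URL, the model name and the helper function names are all placeholders, not anything from worker-vllm.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, texts):
    """Build a POST request for an OpenAI-style /v1/embeddings route."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_embeddings(response_body):
    """Return the embedding vectors from a parsed response, ordered by index."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(base_url, model, texts):
    """Send the texts to the server and return one vector per text."""
    request = build_embedding_request(base_url, model, texts)
    with urllib.request.urlopen(request) as response:
        return extract_embeddings(json.load(response))


if __name__ == "__main__":
    # Assumption: a vLLM server with --task embed is listening locally.
    # "my-embedding-model" is a placeholder for whatever model it serves.
    vectors = embed("http://localhost:8000", "my-embedding-model", ["hello world"])
    print(len(vectors), len(vectors[0]))
```

Only the standard library is used, so nothing extra needs installing to try it against a local server.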
KoboldCpp can generate embeddings if you want