#Can't run a 70b model, gets stuck.
1 messages · Page 1 of 1 (latest)
Do you know what this does?
--max-model-len
Model context length. If unspecified, will be automatically derived from the model config.
was urs just getting stuck?
clear the queue and launch again. sometimes it helps at start.
can't know why but i did that and it seems helpful
did u deploy llama 3.1 70b too?
yep one version of it
which one?
if you want llama vanilla version, it's cheaper to run it via groq
which one did u run?
mlabonne/Llama-3.1-70B-Instruct-lorablated
and now i m trying
neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8
i need a version where i can use image+text input
which one do u think would be best
@rugged mason it still got stuck after this: 2024-08-08 19:52:16.529
[l1foz24ligz9rd]
[info]
[1;36m(VllmWorkerProcess pid=100)[0;0m INFO 08-08 14:52:16 weight_utils.py:223] Using model weights format ['*.safetensors']
@rugged mason My sentencec also gets cutoff do you know how to fix that
not really, i'm not runpod expert 😦
ah alr