#Can't run a 70b model, gets stuck.

1 messages · Page 1 of 1 (latest)

rugged mason
#

you seems to have the same issue I got, try to add in your environment variables

paper perch
#

Do you know what this does?

rugged mason
#

--max-model-len
Model context length. If unspecified, will be automatically derived from the model config.

paper perch
#

was urs just getting stuck?

rugged mason
#

yep. i fixed with that

#

reading my log and it's the only change i made

paper perch
#

also my worker is ready but it is stuck in queue

#

do u why that can be

rugged mason
#

clear the queue and launch again. sometimes it helps at start.
can't know why but i did that and it seems helpful

paper perch
#

did u deploy llama 3.1 70b too?

rugged mason
#

yep one version of it

paper perch
#

which one?

rugged mason
#

if you want llama vanilla version, it's cheaper to run it via groq

paper perch
#

which one did u run?

rugged mason
#

mlabonne/Llama-3.1-70B-Instruct-lorablated
and now i m trying
neuralmagic/Meta-Llama-3.1-70B-Instruct-FP8

paper perch
#

i need a version where i can use image+text input

#

which one do u think would be best

#

@rugged mason it still got stuck after this: 2024-08-08 19:52:16.529
[l1foz24ligz9rd]
[info]
(VllmWorkerProcess pid=100) INFO 08-08 14:52:16 weight_utils.py:223] Using model weights format ['*.safetensors']

#

@rugged mason My sentencec also gets cutoff do you know how to fix that

rugged mason
paper perch
#

ah alr