#Mistral MOE loading?

patent wind
#

Anyone know how to get this model to load, or what this error means? I think the model should fit on my 3090 since it's only 20.1 GB.

near hatch
#

Update to the latest ooba and use llama.cpp (GGUF) or AutoGPTQ instead.

patent wind
#

Yeah, GGUF is so slow though. I was hoping to get GPTQ working for some of the 3-bit models since they are less than 24 GB. I'm guessing they take more than that to hold in memory?
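
For a rough sanity check on that guess, here is a back-of-the-envelope sketch in Python. It assumes the weights take about as much VRAM as the file on disk, plus a KV cache that grows with context length, plus a fixed overhead for the CUDA context and scratch buffers. The function names, the 1.5 GB overhead, and the 4096-token context are hypothetical placeholders, and the layer and head counts are assumed values for a Mixtral-8x7B-style model, not numbers read from the actual checkpoint.

```python
def kv_cache_gb(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Approximate KV cache size in GB (fp16 by default).

    Each layer stores two tensors (keys and values), each of shape
    context_len x n_kv_heads x head_dim.
    """
    n_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem
    return n_bytes / 1e9


def estimate_vram_gb(weights_gb, context_len, overhead_gb=1.5,
                     n_layers=32, n_kv_heads=8, head_dim=128):
    """Rough total: weights + KV cache + fixed overhead.

    overhead_gb is a guess, not a measured value. The layer/head
    defaults are assumptions for a Mixtral-8x7B-style model.
    """
    cache_gb = kv_cache_gb(n_layers, n_kv_heads, head_dim, context_len)
    return weights_gb + cache_gb + overhead_gb


if __name__ == "__main__":
    total = estimate_vram_gb(weights_gb=20.1, context_len=4096)
    print(f"estimated VRAM needed: {total:.1f} GB (card has 24 GB)")
```

Under these assumptions a 20.1 GB file lands at roughly 22 GB, so it is close to the limit before counting whatever the desktop and other programs already hold on the card. The loader may also need extra memory while loading that this sketch does not model.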

frigid chasm