# After updating I cannot run models that I have run before. It says out of memory.


coarse hollow Apr 1, 2024, 4:47 PM

I have an RTX 3070 mobile. I have been running 7B EXL2 quants at 8 bpw with 8192 context and a 4-bit cache with no issues, but after running the updater I can no longer load them because it reports an out-of-memory error.

hybrid onyx Apr 3, 2024, 6:14 AM

I had a similar problem after the EXL2 update. Try loading a GPTQ model with the Transformers loader; it may be slower, but it requires less VRAM.
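
For a rough sense of why this setup is so tight, here is a back-of-the-envelope VRAM estimate. It is only a sketch under stated assumptions: the layer count, KV-head count and head size are assumed Mistral-7B-style values (not taken from the posts above), the 7e9 parameter count is a round number, "4 bpw" stands in for a 4-bit GPTQ quant, and quantization scales, activations and CUDA overhead are ignored.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, cache_bits: float) -> float:
    """Approximate size of the key/value cache in GiB.

    The factor 2 accounts for storing both keys and values.
    """
    elements = 2 * n_layers * n_kv_heads * head_dim * context_len
    return elements * cache_bits / 8 / GIB


# Assumed values, for illustration only (Mistral-7B-style architecture).
N_PARAMS = 7e9
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128
CONTEXT_LEN = 8192

cache = kv_cache_gib(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT_LEN, cache_bits=4)
for label, bpw in [("EXL2 8 bpw", 8), ("4-bit quant (approx.)", 4)]:
    w = weights_gib(N_PARAMS, bpw)
    print(f"{label}: weights ~{w:.2f} GiB, cache ~{cache:.2f} GiB, "
          f"total ~{w + cache:.2f} GiB")
```

Under these assumptions the 8 bpw weights alone come to roughly 6.5 GiB, plus about 0.25 GiB of 4-bit cache, on a card with 8 GB of VRAM. That leaves little headroom, so even a small change in loader overhead after an update could plausibly tip it into out of memory. A 4-bit quant roughly halves the weight footprint.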