#After updating I cannot run models that I have run before. It says out of memory.


coarse hollow
#

I have an RTX 3070 mobile. I have been running 7B EXL2 quants at 8 bpw with 8192 context and a 4-bit cache with no issues, but after running the updater I can no longer load them: it says out of memory.

hybrid onyx
#

I had a similar problem after the EXL2 update. Try loading a GPTQ model with the Transformers loader instead; it may be slower, but it requires less VRAM.
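
For a rough sense of how tight the setup in the first post is, here is a back-of-the-envelope estimate in Python. It is only a sketch under stated assumptions: a Llama-2-7B-like shape (32 layers, 32 KV heads, head dimension 128, about 7 billion parameters), a 4-bit cache costing exactly half a byte per element with no overhead for scales, and an 8 GB card. The function names are made up for illustration, and nothing here measures what the loader actually allocates.

```python
GIB = 1024 ** 3

def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8

def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, cache_bits):
    """Approximate KV cache size in bytes (keys + values, hence the factor 2)."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * cache_bits / 8

# Assumed model shape: Llama-2-7B-like. A model with grouped-query attention
# (fewer KV heads) would need a smaller cache.
weights = weight_bytes(7_000_000_000, 8.0)   # 7B parameters at 8 bpw
cache = kv_cache_bytes(32, 32, 128, 8192, 4)  # 8192 context, 4-bit cache
total = weights + cache

print(f"weights: {weights / GIB:.2f} GiB")
print(f"cache:   {cache / GIB:.2f} GiB")
print(f"total:   {total / GIB:.2f} GiB")
```

Under those assumptions the weights and cache alone come to roughly 7.5 GiB, before activations, the desktop, or any other process using the GPU. With that little headroom on an 8 GB card, even a small change in how much memory a new version reserves could plausibly tip a previously working configuration into an out-of-memory error.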