#OOM Issue running FLUX1-dev-bnb-nf4 on machines with low VRAM


solid flax
#

When attempting to run the FLUX1-dev-bnb-nf4 model with the new bitsandbytes_NF4 node, I receive an OOM error message.

ComfyUI execution error: Allocation on device 0 would exceed allowed memory. (out of memory)
Currently allocated : 3.46 GiB
Requested : 13.50 MiB
Device limit : 4.00 GiB
Free (according to CUDA): 0 bytes
PyTorch limit (set by user-supplied memory fraction) : 17179869184.00 GiB
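
For anyone reading the numbers in the error above, here is a rough sanity check in Python. It is only a sketch: the values are copied from the message, while the helper name `headroom_gib` and the interpretation in the comments are assumptions, not something ComfyUI or PyTorch reports.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Values copied from the error message above
allocated = 3.46 * GIB      # "Currently allocated"
requested = 13.50 * MIB     # "Requested"
device_limit = 4.00 * GIB   # "Device limit"


def headroom_gib(allocated, requested, limit):
    """GiB left under `limit` if the request were granted (negative means over)."""
    return (limit - allocated - requested) / GIB


left = headroom_gib(allocated, requested, device_limit)
print(f"Headroom under the device limit after the request: {left:.3f} GiB")

# The "PyTorch limit" figure, read as a plain number, is exactly 2**34,
# which is the byte count of 16 GiB. Treating it as a real cap is an
# assumption; it may simply be a placeholder with a mislabeled unit.
pytorch_limit_value = 17179869184
print(pytorch_limit_value == 16 * GIB)
```

On paper the request still fits under the 4 GiB device limit, yet CUDA reports 0 bytes free. One possible reading (an assumption) is that the remaining memory is held outside this allocator's count, for example by the CUDA context, other processes, or the desktop.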

I can run the model without issues in ForgeUI, and I also didn't have problems with the FP8 version in Swarm (aside from the painfully slow times I get). From what I understand, the issue lies with the bitsandbytes_NF4 node.

I posted this in an already open issue on the node's GitHub. For now, I think we just have to wait for an update or an alternative node.

If anyone has found an alternative solution, I would appreciate the information!

fair ferry
#

The latest updates to ComfyUI have greatly improved performance on my low-end system, which seems comparable to yours. I even run t5xxl_fp16, which now takes just ~10 seconds for text encoding. I also no longer need the 16-bit dtype to get normal speeds, which seemed unusual in the first place. 8-bit e4m3fn is no longer extremely slow on my system.

My speeds are now similar to Forge. I still have issues running nf4, though; I can only run those models in Forge for now. I suggest you try the normal Flux FP8 unet and see whether it's still as slow as it was.

hybrid light
#

same here, got an OOM error running FLUX1-dev-bnb-nf4 on swarm, but it runs perfectly fine in forge.

left leaf
#

make sure swarm and comfy are staying up to date, comfy has been pushing a lot of vram management tweak updates recently

#

also, see #announcements: you might try the new GGUF models, they're potentially a bit better than nf4 anyway