Recommended tips to mitigate VRAM usage? | SwarmUI | Page 1

main fiber Dec 18, 2024, 12:09 PM

#

I recently bought a 3090 24GB, but I still seem to be running out of VRAM. I am using a quantized T5 as well as ae, clip-l, and clip-g, and I'm using about 18 GB of VRAM. Any recommendations to mitigate the VRAM usage? As soon as I throw in a 2 GB LoRA, I'm getting OOM errors.

azure cloud Dec 18, 2024, 12:17 PM

#

messing with the text encoders and vae is barking up the wrong tree

#

don't mess with them, leave them default

#

you want to swap the main model itself (presumably a Flux model from what you're describing) for a smaller variant
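To see why the main model dominates, here is a rough back-of-the-envelope sketch of weight memory at different precisions. It assumes a 12-billion-parameter Flux-class model and about 4.5 bits per weight for a Q4_K-style quant; both figures are assumptions for illustration, and real usage is higher once activations, text encoders, the VAE, and any LoRA are added.

```python
# Rough estimate of VRAM needed just to hold the diffusion model weights.
# Assumptions (not from the thread): 12e9 parameters, ~4.5 bits/weight for Q4_K.

GIB = 1024 ** 3


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Return the size of the weights in GiB for a given precision."""
    return n_params * bits_per_weight / 8 / GIB


FLUX_PARAMS = 12e9  # assumed parameter count for a Flux-class model

FORMATS = {
    "fp16/bf16": 16,
    "fp8": 8,
    "Q4_K (approx.)": 4.5,  # assumed average; actual files mix precisions
}

for name, bits in FORMATS.items():
    print(f"{name:>16}: {weight_gib(FLUX_PARAMS, bits):5.1f} GiB")
```

Under these assumptions, the 16-bit weights alone come close to filling a 24 GB card, which is why shrinking the main model frees far more memory than tuning the text encoders or VAE.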

main fiber Dec 18, 2024, 12:39 PM

#

the model i made is a kohya dreambooth model of myself, is it possible to quantize it to make it smaller?

azure cloud Dec 18, 2024, 12:42 PM

#

well, A, if it's a small Flux training run, use a lora, not a full checkpoint omg

main fiber Dec 18, 2024, 12:43 PM

#

yeah i could do that, but the lora doesn't really capture styles nearly as well as the dreambooth version

azure cloud Dec 18, 2024, 12:43 PM

#

B, there are gguf quant scripts in https://github.com/city96/ComfyUI-GGUF/tree/main/tools

main fiber Dec 18, 2024, 12:44 PM

#

Thank you

main fiber Dec 18, 2024, 1:54 PM

#

ok so i quantized my model, but when i try to load it, i'm getting "All available backends failed to load the model '/mnt/Kodi_Backup/Applications/StableSwarmUI/SwarmUI/Models/diffusion_models/Flux/Zono/Zono_1-000150-Q4_K_S.gguf'."

azure cloud Dec 18, 2024, 1:56 PM

#

post the output of server->logs->pastebin button

main fiber Dec 18, 2024, 1:58 PM

#

https://paste.denizenscript.com/View/129159

SwarmUI v0.9.4.0 Server Log - 2024-12-18 08:58:26 | Paste #129159 |...

#

i think i got it, i remember a long time ago with gguf you had to set the metadata for it to flux.dev

azure cloud Dec 18, 2024, 2:01 PM

#

yes

#

which is what it tells you there: "backend #0 failed to load model with error: Model loader for Flux/Zono/Zono_1-000150-Q4_K_S.gguf didn't work - architecture ID is missing. Please click Edit Metadata on the model and apply a valid architecture ID."

main fiber Dec 18, 2024, 2:04 PM

#

yeah the quantization didn't work, i'm getting a black box at inference time. i'm also using arch linux. i did find some commands that convert to fp16 and then to fp8, but the result doesn't work for inference either. I'll keep looking
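When a converted file misbehaves, one cheap first check is whether it is a well-formed GGUF file at all. The sketch below reads only the fixed-size GGUF header (magic, version, tensor count, metadata key/value count) using the Python standard library. The function and class names are hypothetical, it assumes a GGUF version 2 or later layout, and a valid header will not by itself explain a black output; it only rules out a truncated or mislabeled file.

```python
# Minimal GGUF header check (stdlib only). Names here are illustrative,
# not part of SwarmUI or ComfyUI-GGUF. Assumes GGUF v2+ (64-bit counts).
import struct
from typing import NamedTuple


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(path: str) -> GGUFHeader:
    """Read the 24-byte GGUF header and return its fields."""
    with open(path, "rb") as f:
        raw = f.read(24)
    # The file must start with the 4-byte magic "GGUF".
    if len(raw) < 24 or raw[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # Little-endian: uint32 version, uint64 tensor count, uint64 KV count.
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return GGUFHeader(version, tensor_count, kv_count)
```

Calling `read_gguf_header` on the quantized file and seeing a tensor count of zero, or a `ValueError`, would point at the conversion step rather than the backend.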
