4bit model creation | Unsloth AI | Page 1

jaunty saddle May 27, 2024, 9:21 PM

Hi! Is there an easy way to convert any model to 4-bit NF4 (bitsandbytes) format for Unsloth, to save disk space?

I want to try out some Llama 3 and Mistral finetunes that aren't provided in 4-bit form, and I'd like to not have to store the full-precision models on disk.

I know I could write some code using the bitsandbytes (bnb) and safetensors libraries to do it, but I'd prefer something easier and more plug-and-play, preferably something that wouldn't require changes to the Unsloth code itself.

If that isn't supported, I think it'd be a neat feature for Unsloth to save the converted 4-bit model to disk (and replace the full-precision model), via a function or an optional argument or whatever.
Perhaps I could contribute it myself with some direction.

Thanks!
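For reference, the manual route mentioned in the question can be sketched roughly as follows. This is a minimal sketch, not an Unsloth feature: it assumes recent `transformers`, `bitsandbytes` and `accelerate` versions that support saving 4-bit checkpoints, plus a CUDA GPU. The helper names `nf4_config_kwargs` and `convert_to_nf4`, the bfloat16 compute dtype, and the example paths are illustrative assumptions.

```python
def nf4_config_kwargs():
    """Settings passed to transformers.BitsAndBytesConfig for NF4 quantization."""
    return {
        "load_in_4bit": True,               # quantize linear layers to 4 bits on load
        "bnb_4bit_quant_type": "nf4",       # NormalFloat4 rather than plain FP4
        "bnb_4bit_use_double_quant": True,  # also quantize the per-block scales
    }


def convert_to_nf4(model_id, out_dir):
    """Load `model_id` in 4-bit NF4 and write the quantized weights to `out_dir`.

    Hypothetical helper: needs a CUDA GPU and bitsandbytes installed.
    """
    # Heavy imports are kept local so the module can be imported without a GPU.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute
        **nf4_config_kwargs(),
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Saves the already-quantized weights plus the quantization config.
    model.save_pretrained(out_dir)
    tokenizer.save_pretrained(out_dir)
    return out_dir


# Example call (placeholder names, not run here):
# convert_to_nf4("some-org/some-llama-3-finetune", "./my-finetune-bnb-4bit")
```

Note that the full-precision weights are still downloaded into the Hugging Face cache first, so they have to be deleted from the cache afterwards to actually reclaim the space. Whether Unsloth loads the resulting local directory directly is an assumption worth testing, although its own pre-quantized `bnb-4bit` repos use the same bitsandbytes format.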

chrome root May 28, 2024, 1:17 PM

Wait, as in: once the model finishes downloading, remove the 16-bit model

and keep the 4bit one?

jaunty saddle May 29, 2024, 6:33 PM

(replying to chrome root: "and keep the 4bit one?")

yep

Makes sense, no? I, and I assume others, might only ever want to train 4-bit QLoRAs (hardware constraints being a common reason). There's no reason to waste many GBs of disk space (and, I assume, have the models load slower).
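To put rough numbers on the disk-space point, here is a back-of-the-envelope estimate. It is only a sketch: the 8-billion parameter count and the 4.5 bits per weight (4-bit values plus one 32-bit scale per 64-weight block, without double quantization) are assumptions, and real checkpoints come out somewhat larger because embeddings and a few other layers stay in 16-bit.

```python
def checkpoint_size_gb(n_params, bits_per_weight):
    """Approximate on-disk size in GB (10^9 bytes) for a given weight width."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 8e9  # assumption: roughly an 8B-parameter model

# 16-bit weights: 2 bytes per parameter.
size_16bit = checkpoint_size_gb(N_PARAMS, 16)

# NF4: 4 bits per weight plus a 32-bit scale shared by each 64-weight block,
# i.e. 4 + 32/64 = 4.5 bits per weight (assumed block size, no double quant).
size_nf4 = checkpoint_size_gb(N_PARAMS, 4 + 32 / 64)

print(f"16-bit: {size_16bit:.1f} GB, NF4: {size_nf4:.1f} GB")
```

Under these assumptions the 4-bit copy takes a bit under a third of the space of the 16-bit one.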

jaunty saddle May 30, 2024, 2:21 AM

There are Q4-quantized models on HF, but these are GGUF, which I don't think Unsloth can read.

chrome root May 30, 2024, 8:23 AM

Yeah, sadly GGUF doesn't work yet.
