Hello, I am on Linux and have been able to fine-tune the models, but I have not been able to quantize the safetensors. Whenever I try, I get the following error:
The output location will be ./model/unsloth.BF16.gguf
This will take 3 minutes...
/bin/sh: 1: python: not found
Traceback (most recent call last):
File "/home/alowtron/Documents/Coding/AI/Test1.py", line 122, in <module>
if True: model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")
File "/home/alowtron/.local/lib/python3.10/site-packages/unsloth/save.py", line 1630, in unsloth_save_pretrained_gguf
all_file_locations = save_to_gguf(model_type, model_dtype, is_sentencepiece_model,
File "/home/alowtron/.local/lib/python3.10/site-packages/unsloth/save.py", line 1111, in save_to_gguf
raise RuntimeError(
RuntimeError: Unsloth: Quantization failed for ./model/unsloth.BF16.gguf
You might have to compile llama.cpp yourself, then run this again.
You do not need to close this Python program. Run the following commands in a new terminal:
You must run this in the same folder as you're saving your model.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && make all -j
Once that's done, redo the quantization.
I ran those commands in the output folder, but when I try to quantize again I get the same error.
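
In case it helps with diagnosis: the log contains the line `/bin/sh: 1: python: not found`, which makes me think the shell cannot find a command named `python` (as opposed to `python3`). That is only my guess at the cause, not something I have confirmed. A minimal sketch of how to check which interpreter names are on PATH, using a helper name (`find_interpreters`) that I made up for illustration:

```python
import shutil


def find_interpreters(names=("python", "python3")):
    """Map each command name to its resolved path on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print where each interpreter name resolves, if anywhere.
    for name, path in find_interpreters().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If `python` shows as not found while `python3` resolves, that would be consistent with the `python: not found` line in the log above.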