Llama 3.1:8b notebook error | Unsloth AI | Page 1

fresh briar Aug 12, 2025, 7:02 PM

Suddenly seeing this error in the classic Llama3.1_(8B)-Alpaca.ipynb notebook. I'm on an L4 instance and uploaded a JSON dataset for finetuning. This process worked within the past week but suddenly fails today. I've run the recommended code and tried to re-quantize, but no dice. I even changed the code to use CMake instead, and still no dice. Suggestions?

This might take 3 minutes...
Traceback (most recent call last):
File "/content/llama.cpp/convert_hf_to_gguf.py", line 32, in <module>
from mistral_common.tokens.tokenizers.base import TokenizerVersion
ModuleNotFoundError: No module named 'mistral_common'

RuntimeError Traceback (most recent call last)
/tmp/ipython-input-2419798222.py in <cell line: 0>()
10
11 # Save to q4_k_m GGUF
---> 12 if True: model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")
13 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "q4_k_m", token = "")
14

1 frames
/usr/local/lib/python3.11/dist-packages/unsloth/save.py in save_to_gguf(model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, _run_installer)
1218 )
1219 else:
-> 1220 raise RuntimeError(
1221 f"Unsloth: Quantization failed for {final_location}\n"
1222 "You might have to compile llama.cpp yourself, then run this again.\n"\

RuntimeError: Unsloth: Quantization failed for /content/model/unsloth.BF16.gguf
You might have to compile llama.cpp yourself, then run this again.
You do not need to close this Python program. Run the following commands in a new terminal:
You must run this in the same folder as you're saving your model.
git clone --recursive https://github.com/ggerganov/llama.cpp
cd llama.cpp && make clean && make all -j
Once that's done, redo the quantization.

mental cypress Aug 12, 2025, 7:58 PM

save_to_gguf is currently out of service and being revised.

There is a notebook here shared by Lee. Use the code from that notebook.

https://discord.com/channels/1179035537009545276/1399288019223318548

fresh briar Aug 12, 2025, 7:59 PM

I just added

!pip install "mistral_common>=0.0.8"

and it seems to be working now.
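
For anyone hitting the same ModuleNotFoundError, the fix above can also be written as a guarded cell that only installs the package when it is missing. This is a minimal sketch, not part of the Unsloth notebook: the helper names `is_installed` and `ensure_installed` are made up for illustration, and the version pin is simply the one quoted in the message above.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(module_name: str, pip_spec: str) -> bool:
    """Install pip_spec with pip only when module_name is missing.

    Returns True if an install was attempted, False if nothing was needed.
    Both function names are hypothetical helpers, not Unsloth or llama.cpp APIs.
    """
    if is_installed(module_name):
        return False
    # Use the interpreter that is running the notebook so the package
    # lands in the same environment that convert_hf_to_gguf.py will see.
    subprocess.check_call([sys.executable, "-m", "pip", "install", pip_spec])
    importlib.invalidate_caches()
    return True
```

Calling `ensure_installed("mistral_common", "mistral_common>=0.0.8")` in a cell before the GGUF export cell would then be a no-op on runtimes where the package is already present.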

mental cypress Aug 12, 2025, 8:00 PM

slothfire jesus ok

fresh briar Aug 12, 2025, 8:01 PM

not jesus. gemini 🙂

mental cypress Aug 12, 2025, 8:06 PM

hahaha. good one

😄

lethal viper Aug 13, 2025, 8:30 AM

fresh briar I just added !pip install "mistral_common>=0.0.8" and it seems to be working...

Can you use the fine-tuned model on your computer? I used the following code, thanks to ChatGPT, and it converts to GGUF on Hugging Face, but I can't use the model on my local computer.

!pip install -U sentencepiece huggingface_hub numpy mistral_common
!git clone --recursive https://github.com/ggerganov/llama.cpp
%cd llama.cpp
!make clean && make all -j
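
When a converted model will not load locally, one cheap first check is whether the downloaded file is a GGUF file at all, since a truncated download or an HTML error page saved under a .gguf name fails the same way. The sketch below only checks that the file starts with the four-byte `GGUF` magic; it does not validate the rest of the file, and the helper name `looks_like_gguf` is hypothetical.

```python
from pathlib import Path

# GGUF files begin with these four ASCII bytes.
GGUF_MAGIC = b"GGUF"


def looks_like_gguf(path: str) -> bool:
    """Return True if the file exists and starts with the GGUF magic bytes.

    This is only a sanity check on the header, not a full validation.
    """
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC
```

For example, `looks_like_gguf("/content/model/unsloth.BF16.gguf")` could be run on the path from the error message earlier in the thread, or on whatever path the file was downloaded to locally.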
