Can't convert finetuned llm to gguf file | Unsloth AI | Page 1

fiery anchor May 4, 2024, 10:54 AM

I finetuned Mistral with Unsloth, and these are the files in the Hugging Face repo. I'm using llama.cpp to convert it to a GGUF file, and when I run convert.py I get this error:

  File "E:\SuperServer.AI\docker-llm\llama.cpp\convert.py", line 1567, in <module>
    main()
  File "E:\SuperServer.AI\docker-llm\llama.cpp\convert.py", line 1499, in main
    model_plus = load_some_model(args.model)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\SuperServer.AI\docker-llm\llama.cpp\convert.py", line 1369, in load_some_model
    raise FileNotFoundError(f"Can't find model in directory {path}")
FileNotFoundError: Can't find model in directory E:\SuperServer.AI\docker-llm\docker-llm-v1-hf

It seems my repo doesn't have valid files. What should I do?

blazing fiber May 4, 2024, 3:42 PM

it looks like the model is missing config.json. copy it from your base model into your fine-tuned model folder

supple furnace May 5, 2024, 12:26 PM

@fiery anchor oh u need to use `model.push_to_hub_merged` to merge to 16bit

fiery anchor May 5, 2024, 2:29 PM

what's the exact reason for this

supple furnace May 5, 2024, 3:27 PM

ur using QLoRA itself, so the repo only has the LoRA adapters and not the full merged weights, hence the error
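
For context on why convert.py reports "Can't find model": a repo pushed straight after QLoRA training usually holds only adapter files, not full weights. Below is a minimal sketch that tells the two apart by file name. It assumes the usual PEFT adapter names and the usual Hugging Face weight names. `classify_model_dir` and `missing_for_conversion` are hypothetical helpers, not part of llama.cpp or Unsloth, and the patterns only approximate what convert.py searches for.

```python
from pathlib import Path

# File-name patterns for full (merged) weights in a Hugging Face style folder.
# Assumption: these approximate what llama.cpp's convert.py searches for.
FULL_WEIGHT_PATTERNS = [
    "model.safetensors",
    "model-*-of-*.safetensors",
    "pytorch_model.bin",
    "pytorch_model-*-of-*.bin",
]
# Files PEFT writes when only the LoRA adapter is saved.
ADAPTER_PATTERNS = ["adapter_model.safetensors", "adapter_model.bin"]


def classify_model_dir(path):
    """Return 'merged', 'adapter-only', or 'no-weights' for a local model folder."""
    folder = Path(path)
    has_full = any(any(folder.glob(p)) for p in FULL_WEIGHT_PATTERNS)
    has_adapter = any(any(folder.glob(p)) for p in ADAPTER_PATTERNS)
    if has_full:
        return "merged"
    if has_adapter:
        return "adapter-only"
    return "no-weights"


def missing_for_conversion(path):
    """List what a folder still lacks before a GGUF conversion can be attempted."""
    folder = Path(path)
    missing = []
    if not (folder / "config.json").is_file():
        missing.append("config.json")
    if classify_model_dir(folder) != "merged":
        missing.append("full model weights (merge the LoRA adapter first)")
    return missing
```

Pointed at a local copy of the repo, `missing_for_conversion(...)` lists what is absent. In the Unsloth notebooks the merge itself is done with `model.push_to_hub_merged(...)` (or the local `model.save_pretrained_merged(...)`) with `save_method = "merged_16bit"`.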

fiery anchor May 6, 2024, 6:56 AM

yes got it, i realized i didn't run that cell completely, so now im training all over again

but now when im running the `trainer_stats=trainer.train()` cell, this error is coming:
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
     query       : shape=(2, 173, 8, 4, 128) (torch.float16)
     key         : shape=(2, 173, 8, 4, 128) (torch.float16)
     value       : shape=(2, 173, 8, 4, 128) (torch.float16)
     attn_bias   : <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
     p           : 0.0
`[email protected]` is not supported because:
    xFormers wasn't build with CUDA support
    requires device with capability > (8, 0) but your GPU has capability (7, 5) (too old)
    operator wasn't built - see `python -m xformers.info` for more info
`cutlassF` is not supported because:
    xFormers wasn't build with CUDA support
    operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
    max(query.shape[-1] != value.shape[-1]) > 32
    xFormers wasn't build with CUDA support
    dtype=torch.float16 (supported: {torch.float32})
    attn_bias type is <class 'xformers.ops.fmha.attn_bias.LowerTriangularMask'>
    operator wasn't built - see `python -m xformers.info` for more info
    operator does not support BMGHK format
    unsupported embed per head: 128

using T4 gpu, is it not compatible with xFormers?
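
The capability part of that message can be checked directly. Below is a small sketch, assuming the (8, 0) bar quoted in the error applies to the flash-attention kernel and is inclusive. `meets_flash_requirement` and `current_capability` are hypothetical helpers. The error also says xFormers was built without CUDA support, which rules out every kernel regardless of the GPU, so the T4 on its own is not the whole story.

```python
# Compute capability the error message quotes for the flash-attention kernel.
FLASH_MIN_CAPABILITY = (8, 0)


def meets_flash_requirement(capability):
    """True if a (major, minor) capability reaches the (8, 0) bar (assumed inclusive)."""
    return tuple(capability) >= FLASH_MIN_CAPABILITY


def current_capability():
    """Return the active GPU's (major, minor) capability, or None if unavailable."""
    try:
        import torch  # optional: only needed to query a real GPU
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    cap = current_capability()
    if cap is None:
        print("No CUDA GPU visible (or torch is not installed)")
    else:
        print(cap, "flash kernel eligible:", meets_flash_requirement(cap))
```

A T4 reports (7, 5), so `meets_flash_requirement((7, 5))` is `False`. The `cutlassF` entry in the error lists no capability complaint, only the missing CUDA build.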

blazing fiber May 6, 2024, 7:23 AM

it works for me, try `xformers<0.0.26`

supple furnace May 6, 2024, 7:34 AM

ye u have to use the exact new instructions

%%capture
# Installs Unsloth, Xformers (Flash Attention) and all other packages!
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers<0.0.26" trl peft accelerate bitsandbytes
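
After running the install cell it can help to confirm which xformers version actually got installed. Below is a rough sketch using only the standard library. `version_tuple` and `satisfies_pin` are hypothetical helpers, and the parsing is deliberately simplistic (it ignores pre-release ordering), so it is not a replacement for pip's resolver.

```python
from importlib import metadata


def version_tuple(version):
    """Turn '0.0.25.post1' into (0, 0, 25): leading numeric parts only."""
    parts = []
    for piece in version.split("+")[0].split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def satisfies_pin(version, upper=(0, 0, 26)):
    """True if the version is below the upper bound from `xformers<0.0.26`."""
    return version_tuple(version) < upper


if __name__ == "__main__":
    try:
        installed = metadata.version("xformers")
    except metadata.PackageNotFoundError:
        print("xformers is not installed")
    else:
        print(installed, "within pin:", satisfies_pin(installed))
```

The `python -m xformers.info` command mentioned in the error message is the other quick check, since it reports which operators were built.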

fiery anchor May 6, 2024, 8:21 AM

used this only
