Attempt unsloth library on Mamba-2.8b | Unsloth AI | Page 1

halcyon sandal Mar 15, 2024, 10:55 AM

#

I've read the introduction blog post and the blog post about applying Unsloth to Gemma. Where do I start 🧐?

tropic stratus Mar 15, 2024, 10:57 AM

#

Hi, have you tried using our free Google Colab notebooks? https://github.com/unslothai/unsloth#-finetune-for-free

#

Unfortunately, if you want to use Mamba, it will not work with Unsloth; you'll need to make a lot of changes.

halcyon sandal Mar 16, 2024, 11:36 PM

#

tropic stratus Unfortunately if you want to use Mamba, it will not work with unsloth, you'll ne...

Is it precisely because of the architecture difference that I would have to redo the manual autograd?

tropic stratus Mar 17, 2024, 12:56 AM

#

halcyon sandal is it precisely due to architecture difference so I have to redo the manual auto...

Yes, unfortunately. Mamba is quite different from transformers.

mint viper Mar 17, 2024, 2:44 AM

#

Had a look yesterday - it's extremely different, a whole different beast from transformers 😦

halcyon sandal Mar 17, 2024, 3:55 PM

#

I'm happy to do it as a personal project, but I kinda need to know how, code-wise, you approached manual autograd with transformers in the first place.

P.S. I'm bad at Jacobian derivatives.
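
For readers wondering what "manual autograd" can look like in practice, here is a minimal sketch, not Unsloth's actual code. It assumes a LoRA-adapted linear layer y = W x + scale * B (A x) with W frozen and a toy loss L = 0.5 * ||y||^2, derives the gradients for A and B by hand, and checks them against finite differences. All function names are hypothetical and only the Python standard library is used.

```python
import random

def matvec(M, v):
    # M @ v for a list-of-lists matrix
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matTvec(M, v):
    # M^T @ v
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def lora_forward(W, A, B, x, scale):
    h = matvec(A, x)                       # low-rank projection, shape (rank,)
    base = matvec(W, x)                    # frozen base weight
    update = matvec(B, h)                  # low-rank update
    y = [b + scale * u for b, u in zip(base, update)]
    return y, h

def loss_fn(y):
    return 0.5 * sum(v * v for v in y)     # toy loss, so dL/dy = y

def lora_backward(B, x, scale, y, h):
    # Hand-derived chain rule; W is frozen, so only A and B get gradients.
    dy = list(y)
    dB = [[scale * g for g in row] for row in outer(dy, h)]  # dL/dB = scale * dy h^T
    dh = [scale * g for g in matTvec(B, dy)]                 # dL/dh = scale * B^T dy
    dA = outer(dh, x)                                        # dL/dA = dh x^T
    return dA, dB

def numeric_grad(f, M, eps=1e-6):
    # Central finite differences, perturbing M in place one entry at a time.
    G = [[0.0] * len(M[0]) for _ in M]
    for i in range(len(M)):
        for j in range(len(M[0])):
            old = M[i][j]
            M[i][j] = old + eps
            up = f()
            M[i][j] = old - eps
            down = f()
            M[i][j] = old
            G[i][j] = (up - down) / (2 * eps)
    return G

def max_abs_diff(P, Q):
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def check_gradients(seed=0, d_in=4, d_out=3, rank=2, scale=0.5):
    rng = random.Random(seed)
    rand = lambda r, c: [[rng.uniform(-1, 1) for _ in range(c)] for _ in range(r)]
    W, A, B = rand(d_out, d_in), rand(rank, d_in), rand(d_out, rank)
    x = [rng.uniform(-1, 1) for _ in range(d_in)]
    y, h = lora_forward(W, A, B, x, scale)
    dA, dB = lora_backward(B, x, scale, y, h)
    f = lambda: loss_fn(lora_forward(W, A, B, x, scale)[0])
    return max_abs_diff(dA, numeric_grad(f, A)), max_abs_diff(dB, numeric_grad(f, B))

if __name__ == "__main__":
    print(check_gradients())
```

The same habit of checking a hand-written backward pass against a numerical or reference gradient would apply to any new architecture, whatever the layer looks like.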

mint viper Mar 17, 2024, 4:10 PM

#

oh ye sadly derivatives

#

but as a start, just try coding up Mamba from start to end

#

and compare losses 🙂
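
As a rough illustration of "code it up and compare losses", the sketch below implements a heavily simplified, single-channel selective state-space recurrence twice: once step by step and once as an unrolled sum, then checks that both give the same loss. This is not the real Mamba block (no convolution, gating, or multi-channel projections); the parameter names (`w_delta`, `w_B`, `w_C`) and the toy values are assumptions made for the example.

```python
import math

def softplus(z):
    return math.log1p(math.exp(z))

def selective_params(x_t, p):
    # Input-dependent step size and B/C vectors (the "selective" part).
    delta = softplus(p["w_delta"] * x_t + p["b_delta"])
    B = [w * x_t for w in p["w_B"]]
    C = [w * x_t for w in p["w_C"]]
    return delta, B, C

def scan_recurrent(xs, p):
    # h_t = exp(delta_t * A) * h_{t-1} + delta_t * B_t * x_t ;  y_t = C_t . h_t
    h = [0.0] * len(p["A"])
    ys = []
    for x_t in xs:
        delta, B, C = selective_params(x_t, p)
        h = [math.exp(delta * a) * h_n + delta * b * x_t
             for a, h_n, b in zip(p["A"], h, B)]
        ys.append(sum(c * h_n for c, h_n in zip(C, h)))
    return ys

def scan_unrolled(xs, p):
    # Reference: expand the recurrence into an explicit sum over past steps.
    params = [selective_params(x_t, p) for x_t in xs]
    ys = []
    for t in range(len(xs)):
        C_t = params[t][2]
        y = 0.0
        for n, a in enumerate(p["A"]):
            h_n = 0.0
            for s in range(t + 1):
                delta_s, B_s, _ = params[s]
                # Accumulated decay from step s+1 up to step t.
                decay = math.exp(a * sum(params[k][0] for k in range(s + 1, t + 1)))
                h_n += decay * delta_s * B_s[n] * xs[s]
            y += C_t[n] * h_n
        ys.append(y)
    return ys

def mse(ys, targets):
    return sum((y - t) ** 2 for y, t in zip(ys, targets)) / len(ys)

def demo():
    p = {"A": [-1.0, -0.5, -2.0],          # negative diagonal state matrix
         "w_delta": 0.3, "b_delta": 0.1,
         "w_B": [0.5, -0.2, 0.8],
         "w_C": [1.0, 0.4, -0.6]}
    xs = [0.5, -1.0, 0.25, 2.0, -0.75]
    targets = [0.0, 0.1, -0.1, 0.3, 0.2]
    return mse(scan_recurrent(xs, p), targets), mse(scan_unrolled(xs, p), targets)

if __name__ == "__main__":
    print(demo())
```

If the two losses disagree beyond floating-point noise, one of the implementations has a bug. The same check applies when comparing a from-scratch version against a reference model.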

halcyon sandal Mar 18, 2024, 12:34 AM

#

mint viper but as a start, just try coding up Mamba from start to end

My interpretation of your suggestion so far is: go to fast_lora.py and write manual autograd for the Mamba architecture.

mint viper Mar 18, 2024, 2:36 AM

#

Ohh no, so the issue is that Mamba isn't doing attention or MLP anymore

#

fast lora will not work

halcyon sandal Mar 18, 2024, 7:03 AM

#

mint viper Ohh no so the issue is Mamba isnt doing attention nor MLP anymore

https://huggingface.co/state-spaces/mamba-2.8b-hf this Mamba checkpoint can be used with LoRA, so at least your def get_lora_parameters can work

#

on the other hand, their philosophy and yours seem awfully similar already

mint viper Mar 18, 2024, 7:45 AM

#

Oh maybe https://github.com/huggingface/transformers/blob/main/src/transformers/models/mamba/modeling_mamba.py will be useful

halcyon sandal Mar 22, 2024, 1:54 PM

#

bump

mint viper Mar 22, 2024, 2:28 PM

#

hmmm Mamba will sadly have to wait - it looks overly complex to optimize

#

I'll have to get myself better versed in it
