#Attempt unsloth library on Mamba-2.8b
Hi, have you tried using our free Google Colab notebooks? https://github.com/unslothai/unsloth#-finetune-for-free
Unfortunately, if you want to use Mamba, it will not work with unsloth. You'll need to make a lot of changes
Is it precisely because of the architecture difference, so I'd have to redo the manual autograd?
Yes, unfortunately. Mamba is quite different from transformers
Had a look yesterday - it's extremely different, like a whole new beast compared to transformers 😦
I'm happy to do it as a personal project, but I kinda need to know, code-wise, how you approached manual autograd with transformers in the first place.
P.S. I'm bad at Jacobian derivatives
oh ye sadly derivatives
but as a start, just try coding up Mamba from start to end
and compare losses 🙂
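To make the "write it by hand and compare" idea concrete, here is a minimal sketch in plain Python. It is an assumption-heavy toy: a scalar linear recurrence, not Mamba's selective scan and not unsloth's code. All names (`forward`, `manual_backward`, `finite_difference`) are made up for illustration. It hand-derives the backward pass through the recurrence and checks it against central finite differences.

```python
# Toy scalar linear recurrence (NOT Mamba's selective scan):
#   h_t = a * h_{t-1} + b * x_t,   y_t = c * h_t
#   loss = 0.5 * sum_t (y_t - target_t)^2

def forward(a, b, c, xs, targets):
    """Run the recurrence; return the loss and hidden states h_0..h_T."""
    hs = [0.0]  # h_0 = 0
    loss = 0.0
    for x, tgt in zip(xs, targets):
        h = a * hs[-1] + b * x
        hs.append(h)
        loss += 0.5 * (c * h - tgt) ** 2
    return loss, hs


def manual_backward(a, b, c, xs, targets, hs):
    """Hand-derived gradients w.r.t. a, b, c (backprop through time)."""
    da = db = dc = 0.0
    dh_next = 0.0  # gradient reaching h_t from step t+1
    for t in range(len(xs), 0, -1):
        err = c * hs[t] - targets[t - 1]  # dloss/dy_t
        dc += err * hs[t]
        dh = err * c + dh_next            # total dloss/dh_t
        da += dh * hs[t - 1]
        db += dh * xs[t - 1]
        dh_next = dh * a                  # pass gradient back to h_{t-1}
    return da, db, dc


def finite_difference(a, b, c, xs, targets, eps=1e-6):
    """Central-difference gradients, used only as a reference."""
    def loss(a_, b_, c_):
        return forward(a_, b_, c_, xs, targets)[0]

    da = (loss(a + eps, b, c) - loss(a - eps, b, c)) / (2 * eps)
    db = (loss(a, b + eps, c) - loss(a, b - eps, c)) / (2 * eps)
    dc = (loss(a, b, c + eps) - loss(a, b, c - eps)) / (2 * eps)
    return da, db, dc


xs = [0.5, -1.0, 2.0, 0.3]
targets = [0.1, 0.0, 0.7, -0.2]
a, b, c = 0.9, 0.4, 1.5

loss, hs = forward(a, b, c, xs, targets)
manual = manual_backward(a, b, c, xs, targets, hs)
numeric = finite_difference(a, b, c, xs, targets)
scan_max_abs_diff = max(abs(m - n) for m, n in zip(manual, numeric))

print("manual :", manual)
print("numeric:", numeric)
print("max abs diff:", scan_max_abs_diff)
```

The same pattern should carry over to a real port: keep a slow reference (framework autograd or finite differences), and only trust the hand-written backward once losses and gradients agree.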
My interpretation of your suggestion so far is to go to fast_lora.py and write manual autograd for the Mamba architecture
Ohh no, so the issue is Mamba isn't doing attention or MLP anymore
fast lora will not work
https://huggingface.co/state-spaces/mamba-2.8b-hf this Mamba can be used with LoRA, so at least your def get_lora_parameters can work
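For context on why LoRA itself can carry over: a LoRA adapter wraps a single linear projection, so its gradients depend only on that layer's input and output gradient, not on whether the layer sits in an attention block or a Mamba block. Below is a rough sketch in plain Python of those hand-derived gradients, checked against finite differences. It assumes the usual LoRA form y = W x + s * B (A x) with W frozen. The helper names are hypothetical and this is not the code in fast_lora.py.

```python
def matvec(M, v):
    """Matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_forward(W, A, B, s, x):
    """y = W x + s * B (A x); also return A x for reuse in backward."""
    ax = matvec(A, x)
    base = matvec(W, x)
    delta = matvec(B, ax)
    return [bi + s * di for bi, di in zip(base, delta)], ax


def lora_loss(W, A, B, s, x, target):
    y, _ = lora_forward(W, A, B, s, x)
    return 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, target))


def lora_manual_grads(W, A, B, s, x, target):
    """Hand-derived gradients of the loss w.r.t. A and B (W is frozen)."""
    y, ax = lora_forward(W, A, B, s, x)
    dy = [yi - ti for yi, ti in zip(y, target)]           # dloss/dy
    dB = [[s * dyi * axk for axk in ax] for dyi in dy]    # s * dy (A x)^T
    bt_dy = [sum(B[i][k] * dy[i] for i in range(len(B)))  # B^T dy
             for k in range(len(A))]
    dA = [[s * g * xj for xj in x] for g in bt_dy]        # s * (B^T dy) x^T
    return dA, dB


def numeric_grad(M, loss_of_current_state, eps=1e-6):
    """Central differences over every entry of M, perturbed in place."""
    grad = [[0.0] * len(M[0]) for _ in M]
    for i in range(len(M)):
        for j in range(len(M[0])):
            old = M[i][j]
            M[i][j] = old + eps
            up = loss_of_current_state()
            M[i][j] = old - eps
            down = loss_of_current_state()
            M[i][j] = old
            grad[i][j] = (up - down) / (2 * eps)
    return grad


# Arbitrary small example: in_features=3, out_features=2, rank=2.
W = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.5]]
A = [[0.1, 0.2, -0.3], [0.05, -0.1, 0.2]]
B = [[0.3, -0.2], [0.1, 0.4]]  # nonzero so the check on dA is meaningful
s = 2.0                        # stands in for alpha / rank
x = [1.0, -2.0, 0.5]
target = [0.5, -1.0]

dA, dB = lora_manual_grads(W, A, B, s, x, target)
current_loss = lambda: lora_loss(W, A, B, s, x, target)
dA_num = numeric_grad(A, current_loss)
dB_num = numeric_grad(B, current_loss)

lora_max_abs_diff = max(
    abs(m - n)
    for man, num in ((dA, dA_num), (dB, dB_num))
    for row_m, row_n in zip(man, num)
    for m, n in zip(row_m, row_n)
)
print("max abs diff:", lora_max_abs_diff)
```

The parts that would need new derivations for Mamba are the pieces around the projections (the scan and convolution), which this sketch does not touch.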
on the other hand, their philosophy and yours seem awfully similar already
Oh maybe https://github.com/huggingface/transformers/blob/main/src/transformers/models/mamba/modeling_mamba.py will be useful
bump