Hi, appreciate this fantastic community so far.
I published a model to https://huggingface.co/pacozaa/lora_model_tinyllama_alpaca_first. Now, if I want to deploy it to, say, "production" for inference API calls, what do I do next? I see some models have a deploy button and other options as well. Maybe not related to Unsloth directly, but any recommended documentation would be perfect!
#How to deploy an Unsloth-trained model from a Huggingface repo?
You have only pushed adapters by the way.
It would be easier if you quantised it to GGUF or merged it to 16-bit.
Loading adapters onto the base model requires a lot of RAM, and if you don't have good hardware on HuggingFace, it is not possible.
@regal geyser so we need a GGUF or a 16-bit PyTorch model to deploy? Will try.
Yes.
Unless you have good hardware and can load the model with LoRA. But you need good code.
If you convert to GGUF, then I can deploy it, then you can clone my Space.
Actually, don't do 16-bit; you need good hardware for that.
For free Spaces, do GGUF.
alright will try!
Cool.
Just send the link to the GGUF model and I will make you an inference Space that you can duplicate!
When I have time.
I just pay for "pay as you go" in Google Colab, maybe it will do.
No, for deploying on Spaces.
Colab is fine for converting to 16-bit.
I meant for deploying on Spaces lol.
but I can push GGUF from the Unsloth example Colab, right?
Yes.
Both 16-bit and GGUF are possible to push from Colab, it is just that free Spaces from HuggingFace don't have enough RAM for 16-bit.
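For reference, a minimal sketch of the two export routes mentioned here, assuming `model` and `tokenizer` are the objects from an Unsloth notebook. `push_to_hub_gguf` and `push_to_hub_merged` are the method names used in Unsloth's example notebooks; the `export_for_deploy` helper itself, the repo id, and the token are hypothetical placeholders, not part of Unsloth.

```python
def export_for_deploy(model, tokenizer, repo_id, token, target="gguf"):
    """Push a deployable copy of an Unsloth fine-tune to the Hub.

    `model` is assumed to expose push_to_hub_gguf and push_to_hub_merged,
    as the models in the Unsloth example notebooks do.
    """
    if target == "gguf":
        # Quantised GGUF: the smaller option, suggested for free Spaces.
        model.push_to_hub_gguf(
            repo_id, tokenizer, quantization_method="q4_k_m", token=token
        )
    elif target == "16bit":
        # LoRA adapters merged into the base weights and saved as 16-bit.
        model.push_to_hub_merged(
            repo_id, tokenizer, save_method="merged_16bit", token=token
        )
    else:
        raise ValueError(f"unknown target: {target!r}")
    return repo_id
```

In Colab this would be called after training, for example `export_for_deploy(model, tokenizer, "your-username/your-model-gguf", token)` with your own repo id and a write token.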
okok lol
Hahahah... Q8_0 is way too big for Spaces.
ok q4_k_m right now
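A rough back-of-envelope comparison of the formats discussed here. The parameter count (about 1.1 billion for TinyLlama) and the bits-per-weight figures for `q8_0` and `q4_k_m` are assumed approximations, not measured values, and the estimate covers only the weights, not the KV cache or runtime overhead.

```python
# Approximate bits stored per weight for each format (assumed values;
# real GGUF files vary a little because some tensors use other types).
BITS_PER_WEIGHT = {"f16": 16.0, "q8_0": 8.5, "q4_k_m": 4.85}


def estimate_weight_gb(n_params, fmt):
    """Return the approximate size of the weights in decimal gigabytes."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


TINYLLAMA_PARAMS = 1.1e9  # assumed: TinyLlama has about 1.1B parameters

for fmt in BITS_PER_WEIGHT:
    size_gb = estimate_weight_gb(TINYLLAMA_PARAMS, fmt)
    print(f"{fmt:>7}: ~{size_gb:.2f} GB")
```

The ordering is the useful part: `q4_k_m` comes out smallest of the three, so it is the safest pick when RAM is tight.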
will have to go in a bit tho, moving house with two newborns, maybe see you in like 12 hours
Nah man, just my first try of the whole pipeline
I am only 14, no hassle of children lol.
Same.
@silk heart See ya!
Done done
Hopefully resolved guys? 👏
Not yet.
The link to the model?
Here you go!
Duplicate it.
By the way TinyLlama isn't that smart. So the responses are really bad.
@silk heart
Thanks. Let me close this.
Cool.
yeah it's really bad lol
Yep...