How to deploy an Unsloth-trained model from a Hugging Face repo? | Unsloth AI

silk heart Mar 20, 2024, 6:46 AM

#

Hi, appreciate this fantastic community so far.
I published a model to https://huggingface.co/pacozaa/lora_model_tinyllama_alpaca_first. Now, if I want to deploy it to, say, "production" for inference API calls, what do I do next? I see some models have a deploy button and other options as well. Maybe not related to Unsloth directly, but any recommended documentation would be perfect!


regal geyser Mar 20, 2024, 6:58 AM

#

You have only pushed adapters by the way.

#

It would be easier if you quantised it to GGUF or merged it to 16 bit.

#

Loading adapters onto the base model requires a lot of RAM, and if you don't have good hardware on Hugging Face, it is not possible.
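For reference, loading adapters onto a base model can be sketched with the `peft` library as below. This is a minimal sketch, not what either poster ran: it assumes the adapter repo's config names a base model that `transformers` can load, and the helper name `load_adapter_on_base` is hypothetical. If the base is a pre-quantised 4-bit checkpoint, `bitsandbytes` and a GPU would also be needed.

```python
def load_adapter_on_base(adapter_id: str = "pacozaa/lora_model_tinyllama_alpaca_first"):
    """Load the base model named in the adapter config, then attach the LoRA weights.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from peft import PeftConfig, PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # The adapter repo stores only the LoRA weights plus a pointer to the base model.
    config = PeftConfig.from_pretrained(adapter_id)
    base_id = config.base_model_name_or_path

    # This is the step that needs the RAM: the full base model is loaded first.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # Attach the adapter weights on top of the base model.
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer
```

Calling `load_adapter_on_base()` downloads both the base model and the adapter, which is why merged or GGUF weights are simpler to host.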

silk heart Mar 20, 2024, 8:27 AM

#

@regal geyser so we need a GGUF or a 16-bit PyTorch model to deploy? Will try

regal geyser Mar 20, 2024, 8:39 AM

#

silk heart @regal geyser so we need a GGUF or a 16-bit PyTorch model to deploy? Will try

Yes.

regal geyser Mar 20, 2024, 8:42 AM

#

silk heart @regal geyser so we need a GGUF or a 16-bit PyTorch model to deploy? Will try

Unless you have good hardware and can load the model with the LoRA adapters. But you need good code.

#

If you convert to GGUF, then I can deploy it, then you can clone my Space.

regal geyser Mar 20, 2024, 8:43 AM

#

silk heart @regal geyser so we need a GGUF or a 16-bit PyTorch model to deploy? Will try

Actually, don't do 16 Bit, you need good hardware.

#

For free Spaces, do GGUF.
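As a rough illustration of why GGUF suits CPU-only hosting, the sketch below runs a GGUF file with `llama-cpp-python`. Several things here are assumptions: the Alpaca-style prompt (guessed from the model name), the `repo_id` and `gguf_filename` arguments (the thread never names the GGUF file), and the helper names `build_alpaca_prompt` and `generate`.

```python
# Assumed prompt format: the model name mentions "alpaca", so an Alpaca-style
# template is used here. Adjust it to whatever template the model was trained on.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, context: str = "") -> str:
    """Fill the template, leaving the response section empty for the model."""
    return ALPACA_TEMPLATE.format(instruction=instruction, context=context)


def generate(repo_id: str, gguf_filename: str, instruction: str, max_tokens: int = 128) -> str:
    """Download a GGUF file from the Hub and run one completion on CPU.

    repo_id and gguf_filename are placeholders supplied by the caller.
    """
    # Imported lazily so the prompt helper works without these packages installed.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=repo_id, filename=gguf_filename)
    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)
    out = llm(
        build_alpaca_prompt(instruction),
        max_tokens=max_tokens,
        stop=["### Instruction:"],  # stop before the model starts a new turn
    )
    return out["choices"][0]["text"].strip()
```

A Space would typically wrap something like `generate` in a small web UI; the same function also works on a laptop or a plain server.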

silk heart Mar 20, 2024, 8:45 AM

#

alright will try!

regal geyser Mar 20, 2024, 8:45 AM

#

Cool.

#

Just send the link to the GGUF model and I will make you an inference Space that you can duplicate!

#

When I have time.

silk heart Mar 20, 2024, 8:47 AM

#

I just pay for "pay as you go" in Google Colab, maybe it will do

regal geyser Mar 20, 2024, 8:50 AM

#

No, for deploying on Spaces.

regal geyser Mar 20, 2024, 8:50 AM

#

regal geyser Actually, don't do 16 Bit, you need good hardware.

Colab is fine for converting to 16 bit.

regal geyser Mar 20, 2024, 8:50 AM

#

silk heart I just pay for "pay as you go" in Google Colab, maybe it will do

I meant for deploying on Spaces lol.

silk heart Mar 20, 2024, 8:51 AM

#

but I can push GGUF from the Unsloth example Colab, right?

regal geyser Mar 20, 2024, 8:52 AM

#

Yes.

regal geyser Mar 20, 2024, 8:55 AM

#

silk heart but I can push GGUF from the Unsloth example Colab, right?

Both 16 bit and GGUF are possible to push from Colab; it is just that free Spaces from Hugging Face don't have enough RAM for 16 bit.
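The two export routes mentioned here can be sketched with the calls used in the Unsloth example notebooks. Treat this as a sketch under assumptions: `target_repo` and `hf_token` are placeholders, `max_seq_length=2048` is simply the notebook default, the wrapper name `export_from_unsloth` is hypothetical, and Unsloth itself needs a GPU runtime such as Colab.

```python
def export_from_unsloth(adapter_id: str, target_repo: str, hf_token: str,
                        fmt: str = "gguf", quantization: str = "q4_k_m") -> None:
    """Reload saved LoRA adapters with Unsloth and push merged or GGUF weights.

    Hypothetical wrapper: fmt is either "gguf" or "merged_16bit".
    """
    if fmt not in ("gguf", "merged_16bit"):
        raise ValueError(f"unsupported export format: {fmt!r}")

    # Imported lazily: Unsloth only installs and runs on a GPU machine.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=adapter_id,   # the repo that currently holds only the adapters
        max_seq_length=2048,     # assumption: same value as the example notebook
        dtype=None,              # let Unsloth pick float16 or bfloat16
        load_in_4bit=True,
    )

    if fmt == "merged_16bit":
        # Merge the LoRA weights into the base model and upload 16-bit weights.
        model.push_to_hub_merged(target_repo, tokenizer,
                                 save_method="merged_16bit", token=hf_token)
    else:
        # Convert to GGUF at the requested quantisation and upload the file.
        model.push_to_hub_gguf(target_repo, tokenizer,
                               quantization_method=quantization, token=hf_token)
```

For example, `export_from_unsloth("pacozaa/lora_model_tinyllama_alpaca_first", "your-name/your-gguf-repo", token)` would follow the GGUF route, where the target repo name is made up.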

silk heart Mar 20, 2024, 8:55 AM

#

I see.

#

Convertingggg

regal geyser Mar 20, 2024, 8:57 AM

#

Oh no.

#

Stop it.

#

Convert it to q4_k_m!

#

@silk heart

#

@silk heart

silk heart Mar 20, 2024, 8:58 AM

#

okok lol

regal geyser Mar 20, 2024, 8:59 AM

#

Hahahah... Q8_0 is way too big for Spaces.
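To put rough numbers on the size difference, the arithmetic below estimates weight size from bits per weight. The figures are assumptions for illustration: TinyLlama is taken as 1.1 billion parameters, Q8_0 stores 8.5 bits per weight, and 4.85 bits per weight for Q4_K_M is an approximate average, since that format mixes block types. Only the weights are counted, not the context cache or runtime overhead.

```python
# Approximate storage cost per weight for each format, in bits.
BITS_PER_WEIGHT = {
    "f16": 16.0,
    "q8_0": 8.5,     # 32 int8 weights plus one fp16 scale per block
    "q4_k_m": 4.85,  # rough average; the exact value varies by model
}


def estimate_weight_gb(n_params: float, fmt: str) -> float:
    """Return the estimated size of the weights alone, in gigabytes (1e9 bytes)."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


TINYLLAMA_PARAMS = 1.1e9  # assumption: the 1.1B-parameter TinyLlama

for fmt in BITS_PER_WEIGHT:
    print(f"{fmt}: ~{estimate_weight_gb(TINYLLAMA_PARAMS, fmt):.2f} GB")
```

On these assumptions q4_k_m comes out at a bit under a third of the 16-bit size and a bit over half of Q8_0. The same ratios apply to larger models, where they can decide whether the weights fit in RAM at all.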

silk heart Mar 20, 2024, 8:59 AM

#

ok q4_k_m right now

regal geyser Mar 20, 2024, 8:59 AM

#

Cool!

#

By the way, is this a chatbot?

silk heart Mar 20, 2024, 9:00 AM

#

will have to go in a bit tho, moving house with two newborns, maybe see you in like 12 hours

silk heart Mar 20, 2024, 9:00 AM

#

regal geyser By the way, is this a chatbot?

Nah man, just my first try of the whole pipeline

regal geyser Mar 20, 2024, 9:00 AM

#

I am only 14, no hassle of children lol.

regal geyser Mar 20, 2024, 9:00 AM

#

silk heart will have to go in a bit tho, moving house with two newborns, maybe see you in ...

Same.

silk heart Mar 20, 2024, 9:00 AM

#

just followed the Unsloth example

#

alright later man Thankss

regal geyser Mar 20, 2024, 9:01 AM

#

@silk heart See ya!

silk heart Mar 20, 2024, 4:30 PM

#

Done done

rotund hazel Mar 20, 2024, 5:14 PM

#

Hopefully resolved guys? 👏

regal geyser Mar 20, 2024, 5:14 PM

#

Not yet.

regal geyser Mar 20, 2024, 5:15 PM

#

silk heart Done done

The link to the model?

silk heart Mar 20, 2024, 8:43 PM

#

https://huggingface.co/pacozaa/lora_model_tinyllama_alpaca_first


regal geyser Mar 21, 2024, 4:15 AM

#

https://huggingface.co/spaces/mahiatlinux/lora_model_tinyllama_alpaca_first

Custom TinyLlama-GGUF - a Hugging Face Space by mahiatlinux

#

Here you go!

#

Duplicate it.

#

By the way TinyLlama isn't that smart. So the responses are really bad.

#

@silk heart

silk heart Mar 21, 2024, 4:36 AM

#

Thankss. Let me close this.

regal geyser Mar 21, 2024, 4:47 AM

#

Cool.

silk heart Mar 21, 2024, 5:16 AM

#

yeah it's really bad lol

regal geyser Mar 21, 2024, 6:07 AM

#

Yep...
