#Hosting model on Huggingface's Inference Endpoints?


golden bough
#

I am using https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp

I've trained the model and pushed it to my Hugging Face account with this block:

model.save_pretrained("lora_model") # Local saving
tokenizer.save_pretrained("lora_model")
model.push_to_hub("Sera-Network/sera-llama", token = "hf_token") # Online saving

When I try to set up the API endpoint I get:

ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 8.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}

Application startup failed. Exiting.

Have any of you experienced success with this?

wind scarab
#

I have experience with that.

limpid mirage
#

I had the same issue. Did you find a solution?

limpid mirage
#

@golden bough hey, did you find a way to resolve the problem?

regal crest
#

same happened to me

golden bough
#

No 😦

fading pilot
#

what was the base model you fine-tuned? What does your adapter_config.json look like?

limpid mirage
#

In my case the base model is unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit

And here is adapter_config:

limpid mirage
#

I tried once more with another fine-tuned model. The base model is also unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit, but the result is the same:

ValueError: `rope_scaling` must be a dictionary with with two fields, `type` and `factor`, got {'factor': 8.0, 'high_freq_factor': 4.0, 'low_freq_factor': 1.0, 'original_max_position_embeddings': 8192, 'rope_type': 'llama3'}

fading pilot
gilded robin
#

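# Workaround: reduce rope_scaling to the older two-field form
config = model.config.to_dict()
if config.get("rope_scaling"):  # skip when the key is missing or None
    config["rope_scaling"] = {"type": "linear", "factor": config["rope_scaling"]["factor"]}
model.config.update(config)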

#

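The rope_scaling setting used to contain only type and factor, but your config has additional fields: high_freq_factor, low_freq_factor, original_max_position_embeddings, and rope_type. That is the Llama 3.1 format. Transformers releases older than 4.43 don't recognize it and validate rope_scaling as a two-field dictionary, which is the error you're seeing. The snippet above rewrites the value into the older form. Keep in mind that "linear" scaling is not the same as the "llama3" scheme, so long-context behavior may differ; upgrading Transformers in the serving environment, where that is possible, avoids the rewrite.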
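If you want to try the rewrite on a plain dictionary first (for example the contents of a config.json), here is a minimal sketch. The helper names is_legacy_rope_scaling and to_legacy_rope_scaling are made up for illustration and are not part of Transformers. The check only compares the key set, which is an approximation of what the older validation does, and "linear" is an assumed fallback type.

```python
from typing import Any, Dict, Optional

# Hypothetical helpers for illustration only; not a Transformers API.
LEGACY_KEYS = {"type", "factor"}


def is_legacy_rope_scaling(rope_scaling: Optional[Dict[str, Any]]) -> bool:
    """Return True if the value is unset or has exactly the keys type and factor."""
    if rope_scaling is None:
        return True
    return isinstance(rope_scaling, dict) and set(rope_scaling) == LEGACY_KEYS


def to_legacy_rope_scaling(
    config: Dict[str, Any], scaling_type: str = "linear"
) -> Dict[str, Any]:
    """Return a copy of config with rope_scaling reduced to {type, factor}.

    Assumption: "linear" is an acceptable stand-in for the llama3 scheme.
    It is not equivalent, so long-context outputs may change.
    """
    patched = dict(config)  # shallow copy; the caller's dict is left untouched
    rope_scaling = patched.get("rope_scaling")
    if rope_scaling and not is_legacy_rope_scaling(rope_scaling):
        patched["rope_scaling"] = {
            "type": scaling_type,
            "factor": float(rope_scaling["factor"]),
        }
    return patched


# The rope_scaling value from the error message in this thread.
llama31_config = {
    "rope_scaling": {
        "factor": 8.0,
        "high_freq_factor": 4.0,
        "low_freq_factor": 1.0,
        "original_max_position_embeddings": 8192,
        "rope_type": "llama3",
    }
}

patched_config = to_legacy_rope_scaling(llama31_config)
print(patched_config["rope_scaling"])
```

The function returns a copy rather than editing in place, so you can compare the original and patched values before writing anything back.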

fading pilot