#Fine-Tuning Llama3.1 Issues
You mean like the Alpaca prompt? Is there a way to use a custom template other than the input/output one? (I am using Ollama on my machine, btw.)
Ollama? Then it's quite likely a chat template issue, especially if you modified the one you used for fine-tuning. I'm not familiar with Ollama, so you can check with their server.
So you think closing this issue and going to ollama's server would be better?
You can leave it open. Yea, ask in the Ollama server or just google how Ollama deals with templates.
Sadly, I didn't get any responses from Ollama's or Hugging Face's servers. But I found this, and I believe you can write a custom template for your model too... I just don't really know how to do it, because my dataset is just basic message-and-response pairs, but I added the scenario to teach the model how it would react in a situation, not just to the input. I may be biased by chatbots like the ones on Character AI, because I thought the AI could interpret the current scenario by itself.
I don't know what you did with the template but can you use the "standard" ones? Like use Alpaca? Your "scenarios" most likely can fit in the templates.
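To make the suggestion above concrete, here is a minimal sketch of how a "scenario" could be slotted into the standard Alpaca layout. Putting the scenario in the instruction field and the incoming message in the input field is an assumption made for illustration, not something confirmed in this thread, and `format_example` is a hypothetical helper name.

```python
# Standard Alpaca prompt layout (the variant with an "Input" section).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{response}"
)


def format_example(scenario: str, message: str, reply: str = "") -> str:
    """Build one prompt string (hypothetical helper).

    Assumption: the scenario goes in the instruction slot and the
    incoming message goes in the input slot.
    """
    return ALPACA_TEMPLATE.format(
        instruction=scenario.strip(),
        input=message.strip(),
        response=reply.strip(),
    )


if __name__ == "__main__":
    # Leave the reply empty at inference time so the model completes it.
    print(format_example("You run into an old friend at a cafe.", "Hey, long time no see!"))
```

Whatever mapping is chosen, the same layout would need to be used both for fine-tuning and at inference time.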
The current template that ended up being broken was the default one
the one in the picture
I can try to recreate the template I used during finetuning tomorrow if I understand everything correctly
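For reference, recreating the fine-tuning template in Ollama is usually done through the `TEMPLATE` instruction of a Modelfile. The sketch below only generates such a file as text. The GGUF path is a placeholder, the Alpaca-style layout is assumed to match what was used in fine-tuning, and mapping Ollama's `.System` and `.Prompt` variables to the instruction and input slots is an illustrative choice, not a confirmed recipe.

```python
def build_modelfile(gguf_path: str) -> str:
    """Return Modelfile text with an Alpaca-style TEMPLATE (hypothetical helper)."""
    # Ollama templates use Go template syntax; .System and .Prompt are
    # filled in by Ollama at request time.
    template = (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.\n\n"
        "### Instruction:\n{{ .System }}\n\n"
        "### Input:\n{{ .Prompt }}\n\n"
        "### Response:\n"
    )
    lines = [
        f"FROM {gguf_path}",
        f'TEMPLATE """{template}"""',
        # Assumed stop sequence so generation ends before a new turn starts.
        'PARAMETER stop "### Instruction:"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my-finetune.gguf" is a placeholder path, not one from this thread.
    print(build_modelfile("./my-finetune.gguf"))
```

The generated text could be saved as `Modelfile` and registered with `ollama create <name> -f Modelfile`, where `<name>` is whatever you want to call the model.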
But before that: the very slow response time isn't connected to that, is it?
Why would it be?
Maybe Ollama used CPU instead of GPU for inference
The thing is that if I switch to the regular llama3.1 it responds fine
it's just the fine-tuned model
If I remember correctly, Ollama will fall back to the CPU if there isn't enough VRAM. How big is your fine-tuned model?
The fine-tuned model is 16 GB, so kind of big.
How much VRAM do you have?
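One way to check where a loaded model is actually running is the `ollama ps` command, which lists loaded models along with the processor they are using. A minimal sketch that just calls it and prints the raw output is below. `run_ollama_ps` is a hypothetical helper, and it assumes the `ollama` binary is on PATH and the model has been loaded by a recent request.

```python
import shutil
import subprocess
from typing import Optional


def run_ollama_ps(binary: str = "ollama") -> Optional[str]:
    """Return the raw output of `ollama ps`, or None if the binary is not on PATH."""
    if shutil.which(binary) is None:
        return None
    # check=False: do not raise if the Ollama server is not running.
    result = subprocess.run(
        [binary, "ps"], capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    output = run_ollama_ps()
    print(output if output is not None else "ollama not found on PATH")
```

Running it while the fine-tuned model is loaded, and again with the stock llama3.1, would show whether the two are placed differently.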
32