#[INST]


silent adder
#

My fine-tuned Phi-3.5-Mini-Instruct model does not stop generating. Example:

The Eiffel Tower is a famous tall tower in Paris. It is an iconic symbol of France and one of the most recognizable structures in the world. Designed by Gustave Eiffel and completed in 1889, it stands at approximately 324 meters (1,063 feet) tall, not including antennas, and was the tallest man-made structure in the world until the completion of the Chrysler Building in New York in 1930.

[INST] What is the name of the tallest tower in the world, its location, and...

I used the mistral format.

echo ermine
#

First, you should use the Phi chat template. Second, I would recommend other fine-tuning-friendly models such as Qwen 2.5. Phi is not that good.

silent adder
#

i couldn't find the phi chat template because there were so many variations

#

and the search results weren't clear
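
For reference, a minimal sketch of the Phi-3 style layout, assuming the format shown on the Phi-3.5-Mini-Instruct model card (`<|user|>`, `<|assistant|>`, `<|system|>` role markers, each turn closed with `<|end|>`). The helper name `render_phi3` is hypothetical, and the tokenizer's own chat template remains the authoritative version. Closing every assistant turn with the end token matters here, since training text without one is a common reason a fine-tuned model keeps generating.

```python
def render_phi3(messages, add_generation_prompt=False):
    """Render {"role", "content"} dicts in a Phi-3 style layout (assumed from the model card)."""
    parts = []
    for message in messages:
        # Each turn looks like "<|role|>\ncontent<|end|>\n".
        parts.append(f"<|{message['role']}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # At inference time, leave an open assistant turn for the model to fill in.
        parts.append("<|assistant|>\n")
    return "".join(parts)


example = render_phi3([
    {"role": "user", "content": "What is the Eiffel Tower?"},
    {"role": "assistant", "content": "A famous tall tower in Paris."},
])
print(example)
```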

echo ermine
#

Use other better models;)

#

Any one that fits your needs is good

silent adder
#

because im specifically looking for smaller qwen 2.5 models

#

well, smaller models

#

hmm okay i found one thanks bro

#

i appreciate it

echo ermine
#

I don't know which one is better since I don't use small ones

#

Np

silent adder
#

what about for larger ones?

#

the ones that you use

echo ermine
#

You should try base models first instead of instruct ones

echo ermine
#

8b 14b

silent adder
#

aight

#

thanks

echo ermine
#

Np

silent adder
#

is it possible to use mistral format instead of alpaca?

#

just for future use

midnight torrent
#

When it comes to formats, you can basically use anything you want, unless the format uses special tokens that only specific models support

#

You can still use those

silent adder
#

well, is mistral good?

#

cause im not sure if its still going to keep saying [INST]

midnight torrent
#

But the model tuned with that token will be better

echo ermine
#

For beginners, I would suggest the format that the instruct model already uses

silent adder
#

does alpaca support multi turn conversations?

echo ermine
#

I recommend ShareGPT for that

silent adder
#

hmm alright

silent adder
#

because im worried about the system prompt

#

as i may have to use a default "You are a helpful assistant"

#

because manually going through the dataset at this point would take ages

echo ermine
#

You can convert anything to anything. System prompt is just like a regular prompt, you can put it in any prompt.
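
As a rough illustration of that conversion, here is a sketch that turns an Alpaca-style record (`instruction`, `input`, `output`) into a ShareGPT-style conversation with an optional system turn. The function name `alpaca_to_sharegpt` and the `DEFAULT_SYSTEM` constant are made up for this example, and the `from`/`value` keys follow the commonly used ShareGPT layout, so check what your training framework expects.

```python
DEFAULT_SYSTEM = "You are a helpful assistant"  # the default prompt mentioned above


def alpaca_to_sharegpt(record, system_prompt=DEFAULT_SYSTEM):
    """Convert one Alpaca-style record into a ShareGPT-style conversation dict."""
    user_text = record["instruction"]
    if record.get("input"):
        # Alpaca keeps optional extra context in a separate "input" field.
        user_text += "\n\n" + record["input"]

    conversation = []
    if system_prompt:
        # An empty system prompt is simply left out.
        conversation.append({"from": "system", "value": system_prompt})
    conversation.append({"from": "human", "value": user_text})
    conversation.append({"from": "gpt", "value": record["output"]})
    return {"conversations": conversation}


sample = alpaca_to_sharegpt(
    {"instruction": "Name a tower in Paris.", "input": "", "output": "The Eiffel Tower."}
)
print(sample)
```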

silent adder
#

ChatML uses this format:

<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}

should i leave the system prompt blank?

#

or a default "You are a helpful assistant"

echo ermine
#

Either

silent adder
#

i guess i could just do a random chance and assign the system prompt to either blank or the assistant one.
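
That random assignment could look something like the sketch below. The helper `pick_system_prompt` and the 50/50 split are assumptions for illustration; a seeded `random.Random` keeps the assignment reproducible across preprocessing runs.

```python
import random


def pick_system_prompt(rng, default="You are a helpful assistant", blank_probability=0.5):
    """Return either a blank system prompt or the default one, chosen at random."""
    return "" if rng.random() < blank_probability else default


rng = random.Random(0)  # fixed seed so the same dataset is produced every run
prompts = [pick_system_prompt(rng) for _ in range(1000)]
print(sum(p == "" for p in prompts), "blank out of", len(prompts))
```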

echo ermine
#

Np

silent adder
#

hey, quick question, can ChatML handle multi-turn conversations?

#

like

#

<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
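
A multi-turn conversation in ChatML is just the same turn structure repeated. Here is a minimal rendering sketch, assuming the common layout where `<|im_end|>` directly follows the turn content and every turn, including each assistant turn, is closed. The helper name `render_chatml` is hypothetical.

```python
def render_chatml(messages, add_generation_prompt=False):
    """Render {"role", "content"} dicts as ChatML text, closing every turn."""
    parts = []
    for message in messages:
        parts.append(f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open assistant turn for inference; the model is expected to emit <|im_end|> itself.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


chat = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "Bye"},
    {"role": "assistant", "content": "See you."},
]
text = render_chatml(chat)
print(text)
```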
silent adder
#

Also can't you just convert mistral format to chatml during preprocessing?
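
On the data side, that conversion is mostly string handling. Below is a sketch that parses Mistral-style text (`[INST] ... [/INST] answer`, optionally wrapped in `<s>` and `</s>`) into a role/content message list, which can then be rendered in any other format. The function name `mistral_to_messages` is made up for this example, and it assumes the literal strings `[INST]` and `[/INST]` never appear inside message content.

```python
import re

# One turn: "[INST] user text [/INST] assistant text", up to the next "[INST]" or end of text.
TURN = re.compile(r"\[INST\](.*?)\[/INST\](.*?)(?=\[INST\]|$)", re.DOTALL)


def mistral_to_messages(text):
    """Parse Mistral-style instruct text into a list of {"role", "content"} dicts."""
    # Drop the sequence markers; they carry no message content.
    text = text.replace("<s>", "").replace("</s>", "")
    messages = []
    for user, assistant in TURN.findall(text):
        messages.append({"role": "user", "content": user.strip()})
        if assistant.strip():
            # A trailing [INST] block may have no answer yet.
            messages.append({"role": "assistant", "content": assistant.strip()})
    return messages


parsed = mistral_to_messages("<s>[INST] Hi [/INST] Hello!</s>[INST] Bye [/INST] See you.</s>")
print(parsed)
```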

high topaz
#

mistral instruct is trained on its own instruct format

#

if you want to switch that to chatml you will need way more data

#

and train the base model

#

also train the embeddings and heads

#

aka continued pretraining

#

model makers don't pick the format blindly

#

labs do test what works best

silent adder
#

what format does qwen use

#

?