Chat Template for Command A | Cohere | Page 1

molten knot Apr 16, 2025, 11:20 AM

Hi, I'm a contributor to a frontend project for running open-weights AI models locally. I'm trying to work out how Command A differs from Command R in terms of the chat template. The model seems to work fine with the Command R template, but maybe we're missing out on the best it has to offer.

I can see that there are <|START_RESPONSE|> and <|END_RESPONSE|> tokens, as well as tool-use-specific and reasoning-specific tokens. As far as I know, Command R has none of these.

A) My first question is, should all assistant messages be wrapped with <|START_RESPONSE|> and <|END_RESPONSE|> like:

<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_RESPONSE|>My response here<|END_RESPONSE|><|END_OF_TURN_TOKEN|>

Or does that only happen when there are reasoning or tool-use calls, with the Command R style template used otherwise:

<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>My response here<|END_OF_TURN_TOKEN|>

B) And how does this interact with reasoning? Presumably it's something like:

<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_THINKING|>My reasoning here<|END_THINKING|><|START_RESPONSE|>My response here<|END_RESPONSE|><|END_OF_TURN_TOKEN|>

C) Finally, am I right in thinking that these new tokens only apply to assistant messages and not system/user messages?
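For comparison, the candidate formats from A and B can be written out as a small helper. This is only a sketch of the guesses above, not a confirmed template: the function name `format_assistant_turn` and its parameters are made up for illustration, and only the token strings come from the post.

```python
from typing import Optional

# Special tokens quoted in the post above.
START_TURN = "<|START_OF_TURN_TOKEN|>"
END_TURN = "<|END_OF_TURN_TOKEN|>"
CHATBOT = "<|CHATBOT_TOKEN|>"
START_THINKING = "<|START_THINKING|>"
END_THINKING = "<|END_THINKING|>"
START_RESPONSE = "<|START_RESPONSE|>"
END_RESPONSE = "<|END_RESPONSE|>"


def format_assistant_turn(
    response: str,
    thinking: Optional[str] = None,
    wrap_response: bool = True,
) -> str:
    """Build one assistant turn in one of the hypothesised layouts.

    Hypothetical helper: it reproduces the candidate strings from
    questions A and B, it does not assert which one is correct.
    """
    parts = [START_TURN, CHATBOT]
    if thinking is not None:
        # Question B: reasoning block placed before the response.
        parts += [START_THINKING, thinking, END_THINKING]
    if wrap_response:
        # Question A, first option: response wrapped in its own tokens.
        parts += [START_RESPONSE, response, END_RESPONSE]
    else:
        # Question A, second option: Command R style, no wrapper.
        parts.append(response)
    parts.append(END_TURN)
    return "".join(parts)
```

Calling it with `wrap_response=False` gives the Command R style turn, and passing `thinking=...` gives the layout guessed in B, so the variants can be diffed against whatever the real template produces.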

ornate grove Apr 17, 2025, 5:01 AM

You can try hitting the chat API with return_prompt=true to see how different requests map to the prompt.

molten knot Apr 17, 2025, 11:01 AM

ornate grove You can try hitting the chat API with return_prompt=true to see how different re...

Would using .apply_chat_template from transformers work?

I just want to make sure I'm hitting all the possible combinations
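A minimal sketch of what that check might look like, assuming the model's tokenizer ships a chat template. The repo id `CohereLabs/c4ai-command-a-03-2025` is an assumption (substitute whichever id you actually use), and the names `render_prompts`, `load_tokenizer` and `CONVERSATIONS` are made up for illustration. It only covers plain chat; tool-use and reasoning cases would need extra arguments not shown here.

```python
from typing import Any, Dict, List

Message = Dict[str, str]

# Assumed Hugging Face repo id; replace with the one you have access to.
MODEL_ID = "CohereLabs/c4ai-command-a-03-2025"

# A few plain-chat combinations to render (illustrative, not exhaustive).
CONVERSATIONS: Dict[str, List[Message]] = {
    "user_only": [
        {"role": "user", "content": "Hello"},
    ],
    "system_and_user": [
        {"role": "system", "content": "You are concise."},
        {"role": "user", "content": "Hello"},
    ],
    "multi_turn": [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi, how can I help?"},
        {"role": "user", "content": "Tell me a joke."},
    ],
}


def load_tokenizer(model_id: str = MODEL_ID) -> Any:
    """Load the tokenizer (needs transformers and access to the repo)."""
    from transformers import AutoTokenizer  # imported lazily on purpose

    return AutoTokenizer.from_pretrained(model_id)


def render_prompts(
    tokenizer: Any, conversations: Dict[str, List[Message]]
) -> Dict[str, str]:
    """Render each conversation to its raw prompt string."""
    rendered: Dict[str, str] = {}
    for name, messages in conversations.items():
        # tokenize=False returns the prompt text instead of token ids;
        # add_generation_prompt=True appends the opening of the next
        # assistant turn, as would happen before generation.
        rendered[name] = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )
    return rendered
```

With transformers installed, `render_prompts(load_tokenizer(), CONVERSATIONS)` returns one prompt string per case, which can be printed and compared against what the API returns with return_prompt=true.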