Hi, I'm a contributor to a frontend project for running open-weights AI models locally. I'm trying to work out how the Command A chat template differs from the Command R one. The model seems to work fine with the Command R template, but we may be missing out on the best it has to offer.
I can see that there are <|START_RESPONSE|> and <|END_RESPONSE|> tokens, as well as tool-use-specific and reasoning-specific tokens. As far as I know, Command R has none of these.
A) My first question is, should all assistant messages be wrapped with <|START_RESPONSE|> and <|END_RESPONSE|> like:
<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_RESPONSE|>My response here<|END_RESPONSE|><|END_OF_TURN_TOKEN|>
Or does that only happen when there is reasoning or a tool call, with everything else using the Command R style template:
<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>My response here<|END_OF_TURN_TOKEN|>
B) And how does this interact with reasoning? Presumably it's something like:
<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_THINKING|>My reasoning here<|END_THINKING|><|START_RESPONSE|>My response here<|END_RESPONSE|><|END_OF_TURN_TOKEN|>
C) Finally, am I right in thinking that these new tokens only apply to assistant messages and not system/user messages?
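To make A and B concrete, here is a minimal sketch of what I would implement if my guesses are right. The token strings are copied from above; the layout itself is my assumption, not confirmed behaviour, and `render_assistant_turn` and its `wrap_response` flag are hypothetical names of mine, not part of any library.

```python
# Token strings as quoted in the question. How they are combined is a guess.
START_TURN = "<|START_OF_TURN_TOKEN|>"
END_TURN = "<|END_OF_TURN_TOKEN|>"
CHATBOT = "<|CHATBOT_TOKEN|>"
START_THINKING = "<|START_THINKING|>"
END_THINKING = "<|END_THINKING|>"
START_RESPONSE = "<|START_RESPONSE|>"
END_RESPONSE = "<|END_RESPONSE|>"


def render_assistant_turn(response, reasoning=None, wrap_response=True):
    """Render one assistant turn (hypothetical helper, layout unconfirmed).

    wrap_response=True  -> response wrapped in the response tokens (first option in A)
    wrap_response=False -> Command R style, no response tokens (second option in A)
    reasoning           -> optional thinking block placed before the response (B)
    """
    parts = [START_TURN, CHATBOT]
    if reasoning is not None:
        parts += [START_THINKING, reasoning, END_THINKING]
    if wrap_response:
        parts += [START_RESPONSE, response, END_RESPONSE]
    else:
        parts.append(response)
    parts.append(END_TURN)
    return "".join(parts)


if __name__ == "__main__":
    print(render_assistant_turn("My response here"))
    print(render_assistant_turn("My response here", wrap_response=False))
    print(render_assistant_turn("My response here", reasoning="My reasoning here"))
```

The three calls at the bottom reproduce the three example strings from A and B, so the open question is really just which of them the model expects to see in its history.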