I can see on my activity page that OR is clearly charging me for input tokens as if I was including the reasoning tokens from all requests in each call. But here's the thing...I'm not.
model=TESTEE_MODEL,
messages=messages,
temperature=TESTEE_TEMPERATURE,
max_tokens=32000,
extra_body={
"reasoning": {
"enabled": True,
# "max_tokens": 8000,
"effort": "low"
}
# "provider": {
# "only": ["cerebras"]
# }
}
)
question = completion.choices[0].message.content.strip()
This last line is the ONLY time I reference the returned answer from the model directly, and it is clearly getting message.content, which indeed is logged by my program as the short text answer, no reasoning. IE, nothing else can possibly reference the reasoning tokens. Like, the function ends immediately after this, returning only "question". Here is an example message history that showed as 8000+ tokens in Activity page.
