Grok Code Fast 1 auto-including reasoning tokens in input? | OpenRouter | Page 1

keen tendon Sep 13, 2025, 2:25 AM

#

I can see on my activity page that OR is clearly charging me for input tokens as if I were including the reasoning tokens from all previous requests in each call. But here's the thing... I'm not.

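# NOTE: the opening of the call was cut off in the paste; `client` is assumed to be
# an OpenAI-SDK client pointed at OpenRouter.
completion = client.chat.completions.create(
    model=TESTEE_MODEL,
    messages=messages,
    temperature=TESTEE_TEMPERATURE,
    max_tokens=32000,
    extra_body={
        "reasoning": {
            "enabled": True,
            # "max_tokens": 8000,
            "effort": "low",
        },
        # "provider": {
        #     "only": ["cerebras"]
        # },
    },
)
# Only the visible answer is read here; the reasoning is never touched.
question = completion.choices[0].message.content.strip()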

This last line is the ONLY place I reference the model's returned answer directly, and it clearly reads message.content, which my program logs as the short text answer with no reasoning. I.e., nothing else can possibly reference the reasoning tokens: the function ends immediately after this, returning only "question". Here is an example message history that showed up as 8000+ tokens on the Activity page.

https://pastebin.com/QhKDuM2n
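
One rough way to test the claim above is to compare how much the billed prompt size grows between consecutive calls against the size of the visible answer that was actually appended to the history. The sketch below is only an illustration: it assumes each call's usage has been copied into a plain dict with the hypothetical keys "prompt_tokens", "completion_tokens" and "reasoning_tokens" (OpenAI-style responses usually expose similar fields, but the exact shape returned for this model is an assumption), and that "completion_tokens" includes the reasoning tokens.

```python
def unexplained_growth(usages, new_user_tokens=0):
    """For each pair of consecutive calls, return the prompt-token growth that is
    NOT explained by the previous visible answer plus the new user message.

    usages: list of dicts with the (assumed) keys
        "prompt_tokens", "completion_tokens", "reasoning_tokens".
    new_user_tokens: rough estimate of the tokens added by each new user turn.
    """
    leftovers = []
    for prev, curr in zip(usages, usages[1:]):
        # How much larger the billed input became from one call to the next.
        growth = curr["prompt_tokens"] - prev["prompt_tokens"]
        # Tokens of the previous answer that were really appended to the history.
        visible = prev["completion_tokens"] - prev["reasoning_tokens"]
        leftovers.append(growth - visible - new_user_tokens)
    return leftovers


# Made-up numbers, purely for illustration.
example = [
    {"prompt_tokens": 100, "completion_tokens": 550, "reasoning_tokens": 500},
    {"prompt_tokens": 700, "completion_tokens": 40, "reasoning_tokens": 0},
]
print(unexplained_growth(example, new_user_tokens=50))
```

If the leftover for a call is close to the previous call's reasoning token count, that is consistent with reasoning being counted as input; a leftover near zero suggests it is not. Tokenizer differences and system prompt overhead make this a heuristic, not proof.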

#

This exact code is also not having this issue with GLM-4.5, reasoning on.

#

@frank mulch

#

Moved this here from the model chat so I don't clog it up.

keen tendon Sep 13, 2025, 6:30 PM

#

Wait, is this some weird caching thing? Even though past reasoning tokens are never fed into inference?

ember frost Sep 13, 2025, 6:31 PM

#

I have a feeling it might be, since in my experience the model seems to remember what it's planning to do (in its reasoning).

keen tendon Sep 13, 2025, 6:32 PM

#

Weird... Hard to do the math on whether I'm being charged for this right now, because the requests are small and input is cheap.
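
For what it's worth, the math itself is simple once the extra token count is known. A minimal sketch, where the price is a placeholder and NOT the model's real rate (look up the actual input price per million tokens before drawing conclusions):

```python
def extra_input_cost(extra_tokens, usd_per_million_input):
    """Cost in USD of `extra_tokens` input tokens at the given per-million price."""
    return extra_tokens * usd_per_million_input / 1_000_000


# Hypothetical numbers: 8000 unexpected input tokens per request,
# a placeholder price of 1.00 USD per million input tokens, 500 requests.
per_request = extra_input_cost(8000, 1.00)
print(f"per request: ${per_request:.4f}, over 500 requests: ${per_request * 500:.2f}")
```

At small request sizes the per-call difference is a fraction of a cent, so it only becomes visible when summed over many requests.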
