Hey everyone! I’m using the new beta version with assistant 2.0 and I want to count the number of tokens my assistant consumes, for pricing reasons. I have a couple of issues/questions.
When I create a NEW thread and ask the assistant a simple question like “How much is 2+2?” and the answer is short, like “The answer is 4”, run.usage reports at least 350 tokens when I expected around 30. I suspect the extra tokens come from creating the assistant and the new thread, on top of the prompt and completion, but I’m not sure that is the reason.
When I reuse the same thread and keep asking simple things, again like “How much is 2+2?” and “How much is 2-1?”, the number of tokens grows a lot with each question. I’m attaching an image with the tokens consumed and the tokens I expected per prompt (calculated with Tokenizer).
[Image: ThreadCalculations]
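To make the growth pattern concrete, here is a minimal sketch of what the numbers would look like if each run re-sent a fixed overhead (for example, the assistant instructions) plus the whole thread history as its prompt. That behavior is an assumption on my part, not something I have confirmed, and the function name and token counts are hypothetical and do not come from my real runs.

```python
def simulate_thread_usage(overhead, question_tokens, answer_tokens, num_runs):
    """Model per-run usage under the ASSUMPTION that every run's prompt
    contains a fixed overhead plus all earlier messages in the thread.

    Returns a list of (prompt_tokens, completion_tokens), one pair per run.
    """
    history = 0  # tokens of all messages already stored in the thread
    usage = []
    for _ in range(num_runs):
        history += question_tokens      # the new user message joins the thread
        prompt = overhead + history     # assumed: whole history is re-sent
        usage.append((prompt, answer_tokens))
        history += answer_tokens        # the reply joins the thread too
    return usage


# Hypothetical numbers: 300 tokens of overhead, 10-token questions, 8-token answers.
for i, (prompt, completion) in enumerate(simulate_thread_usage(300, 10, 8, 3), start=1):
    print(f"run {i}: prompt={prompt} completion={completion} total={prompt + completion}")
```

Under this model the prompt side grows with every question even though each question and answer is tiny, which is at least consistent with the shape of what I am seeing.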
Are tokens cumulative within the same thread? If they are, should I only count run.usage at the end of my assistant process to calculate costs? If anyone knows why this is happening, I’d be grateful to hear it.
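For context, this is roughly how the cost could be tallied if every run is billed on its own, so that the total is the sum over all runs rather than the last run's value. That is exactly the assumption I am unsure about. The `Usage` class below is a hypothetical stand-in that only mirrors the `prompt_tokens`, `completion_tokens` and `total_tokens` fields of run.usage, and the prices are placeholders, not real rates.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical stand-in mirroring the fields reported in run.usage."""
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int


def total_usage(usages):
    """Sum usage over all runs (ASSUMES each run is billed independently)."""
    return Usage(
        prompt_tokens=sum(u.prompt_tokens for u in usages),
        completion_tokens=sum(u.completion_tokens for u in usages),
        total_tokens=sum(u.total_tokens for u in usages),
    )


def estimate_cost(usage, price_in_per_1k, price_out_per_1k):
    """Estimate cost from placeholder per-1,000-token input and output prices."""
    return (usage.prompt_tokens / 1000) * price_in_per_1k + \
           (usage.completion_tokens / 1000) * price_out_per_1k


# Hypothetical per-run values, collected once after each run completes.
runs = [Usage(310, 8, 318), Usage(328, 8, 336), Usage(346, 8, 354)]
combined = total_usage(runs)
print(combined)
print(estimate_cost(combined, price_in_per_1k=0.5, price_out_per_1k=1.5))
```

If usage turns out to be cumulative per thread instead, only the last run's value would be needed and this sum would overcount.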
Thank you so much in advance.