#Issues getting the token usage with streaming chats


echo condor
#

hey, I've been trying to calculate the total tokens used by prompts with the OpenAI API, and I can never get it to be accurate. I think some extra metadata gets added to the prompt when I send the messages, and I can't count those tokens with tiktoken. if there's any way to know what gets added, that would be neat, since I could finally track token usage accurately with the streaming API
I forgot to mention: I can count the tokens used in the response, and that matches the non-streaming results 1:1
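
A rough sketch of the response-side count that already works here, assuming the streamed content chunks are joined first and the full reply is counted once. `count_streamed_reply` and the `count_tokens` closure are hypothetical names; the closure stands in for a real tokenizer call and is not the `openai_utils` or `tiktoken_rs` API.

```rust
/// Joins streamed content chunks into the full reply, then counts it once.
/// `count_tokens` is a placeholder for a real tokenizer (assumption).
fn count_streamed_reply<F>(chunks: &[&str], count_tokens: F) -> usize
where
    F: Fn(&str) -> usize,
{
    // Count the joined text rather than each chunk on its own, so that
    // chunk boundaries cannot change the result.
    let full_reply: String = chunks.concat();
    count_tokens(&full_reply)
}
```

Counting the joined text keeps the number independent of how the stream happened to be split.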

#

for context, here are the results

#

here is the code:

use std::io::{stdout, Write};
use openai_macros::{ai_agent, message};
use openai_utils::{api_key, calculate_tokens, FunctionCall, Message};
use tiktoken_rs::{ChatCompletionRequestMessage, get_chat_completion_max_tokens};

macro_rules! print_chat {
    ($l:ident) => {
        while let Some(content) = $l.receive_content(0).await? {
            print!("{content}");
            stdout().flush()?;
        }
    };
}

#[tokio::main]
async fn main() -> anyhow::Result<()> {
    // Load OPENAI_API_KEY from .env and hand it to the client.
    dotenv::dotenv().unwrap();
    api_key(std::env::var("OPENAI_API_KEY").unwrap());

    let agent = ai_agent! {
        model: "gpt-3.5-turbo",
        temperature: 0.0,
        system_message: "give the same sentence back in 5 other languages",
        messages: message!(user, content: "hello my name is robertas")
    };

    // Streaming request: print each content chunk as it arrives.
    let mut receiver = agent.create_stream().unwrap();

    print_chat!(receiver);

    println!();
    println!();
    // Usage computed on the client side for the streamed chat.
    println!("stream: {:#?}", receiver.construct_chat().await?.usage);

    // Same request without streaming, for comparison.
    let res = agent.create().await?.usage;

    println!("normal: {:#?}", res);

    Ok(())
}
#

if you don't understand the language but know how to help, I can explain what this does

grand pond
#

each message has tags and a role, and those are counted too

echo condor
#

I actually looked it over and couldn't see how it summed up to 27. maybe you could explain how it would be represented?

#

from what I see, the closest way it could be represented is as plain text, without the {} and "" from the response?

#
println!("{}", calculate_tokens("<|im_start|>"));
println!("{}", calculate_tokens("<|im_end|>"));
println!("{}", calculate_tokens("assistant"));
println!("{}", calculate_tokens("user"));
println!("{}", calculate_tokens("system"));
println!("{}", calculate_tokens("function"));
grand pond
echo condor
#

wait, so if I'm reading this right, the whole message is turned into JSON and then encoded?

grand pond
#

each message takes a fixed number of tokens, depending on the ChatML version

echo condor
#

that of course doesn't match up with the ChatML

grand pond
#

OpenAI has stealth-changed it a few times, but it's easy to match up if you remember to check it against the usage page when you start using a new model

echo condor
#

ah, so I guess the easiest way is to take the non-streaming response's prompt usage minus the streaming one, for each message type?

#

and the tokens for the user, I would guess
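
One way that subtraction could be sketched, assuming the reported prompt usage is the locally counted text tokens plus a per-message overhead plus a fixed overhead. `Sample` and `solve_overhead` are hypothetical names; the idea is to take two non-streaming responses with different message counts and solve for the two unknowns.

```rust
/// One calibration sample taken from a non-streaming response.
struct Sample {
    reported_prompt_tokens: i64, // prompt tokens from the API's usage object
    counted_text_tokens: i64,    // role + content tokens counted locally
    message_count: i64,          // number of messages in the request
}

/// Assumes: reported = text + per_message * message_count + fixed.
/// Returns (per_message, fixed), or None if the samples cannot be solved.
fn solve_overhead(a: &Sample, b: &Sample) -> Option<(i64, i64)> {
    let count_diff = a.message_count - b.message_count;
    if count_diff == 0 {
        // Same message count in both samples: the two unknowns stay entangled.
        return None;
    }
    let extra_a = a.reported_prompt_tokens - a.counted_text_tokens;
    let extra_b = b.reported_prompt_tokens - b.counted_text_tokens;
    let extra_diff = extra_a - extra_b;
    if extra_diff % count_diff != 0 {
        // The linear model does not fit these samples with whole tokens.
        return None;
    }
    let per_message = extra_diff / count_diff;
    let fixed = extra_a - per_message * a.message_count;
    Some((per_message, fixed))
}
```

If the result is `None`, or changes between pairs of samples, the overhead is probably not a simple per-message constant for that model (function calls, for example, may need their own term).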

#

oh, and the function call... maybe I'll finish this tomorrow, because I'll have to redo the whole thing lol

#

I'd be happy to be able to reconstruct it into ChatML so it's easier to change later on, but whatever.

grand pond
#

just count all the text tokens normally, then add the message count multiplied by the proper per-message ChatML token count
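
A minimal sketch of that arithmetic, in Rust like the code above. `ChatMessage`, `estimate_prompt_tokens`, and the `count_tokens` closure are hypothetical names, not part of `openai_utils` or `tiktoken_rs`; the closure stands in for whatever tokenizer call is used. The overheads are parameters because they vary by model: the OpenAI cookbook has listed 3 tokens per message plus 3 for priming the reply for gpt-3.5-turbo-0613, but treat those as starting values to check against real usage.

```rust
/// One chat message: a role plus its text content.
struct ChatMessage<'a> {
    role: &'a str,
    content: &'a str,
}

/// Estimates prompt tokens as:
/// text tokens + per-message overhead * message count + reply priming.
/// The overhead values are model-dependent assumptions supplied by the caller.
fn estimate_prompt_tokens<F>(
    messages: &[ChatMessage<'_>],
    tokens_per_message: usize,
    reply_priming: usize,
    count_tokens: F,
) -> usize
where
    F: Fn(&str) -> usize,
{
    // Count the role and the content of every message as ordinary text.
    let text_tokens: usize = messages
        .iter()
        .map(|m| count_tokens(m.role) + count_tokens(m.content))
        .sum();

    // Add the fixed wrapper cost once per message, plus the reply priming.
    text_tokens + messages.len() * tokens_per_message + reply_priming
}
```

Keeping the overheads as arguments means only two numbers need changing if the format shifts again for a newer model.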

echo condor
#

ok, I'll make sure to check the user string's token usage as well later

#

I guess this is going to be another Friday afternoon of finding all this out and implementing it

#

well, I think the biggest thing I could hope for is for OpenAI to return a usage object as the last response within a stream

#

but that's probably going to happen after I get this 4-hour job done (in the worst case)

grand pond
#

streaming has been available for ages, but they never added usage to the responses even though thousands of users have asked for it 🤷‍♂️

echo condor
#

yeah, well, I might later make this into an API for others to use, but I bet there's already one that I just don't know about

#

btw thanks for the help!

grand pond
#

np

echo condor
#

ok, well, thanks a ton for the help, I got it to match up perfectly now!