#How to get the cost of a generation?

fast stream
#

How do I get the cost of a generation?

steep garnet
#

As far as I know, the data on the activity page is the same as the data returned by the generation endpoint. Can you point out the difference?

Basically, on the activity page we show the "native token count" (using the tokenizer associated with each model). tokens_prompt is the normalized count using the GPT tokenizer, used for analytics and ranking.

#

The "usage" is the cost, in credits
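
For anyone reading later, here is a minimal sketch of pulling those fields out of a generation record once it has been parsed from JSON. Only `usage` and `tokens_prompt` are named in this thread; the key `native_tokens_prompt` and the helper name `summarize_generation` are assumptions for illustration, so check them against a real response.

```python
from typing import Any, Mapping


def summarize_generation(record: Mapping[str, Any]) -> dict:
    """Pick out the cost and the two prompt token counts from a generation record."""
    return {
        # Cost of the generation, in credits (as stated in this thread).
        "cost_credits": record.get("usage"),
        # Normalized count (GPT tokenizer), used for analytics and ranking.
        "tokens_prompt": record.get("tokens_prompt"),
        # ASSUMPTION: key name for the count from the model's own tokenizer.
        "native_tokens_prompt": record.get("native_tokens_prompt"),
    }


# Hypothetical record; the values are invented for illustration only.
example = {"usage": 0.00042, "tokens_prompt": 120, "native_tokens_prompt": 131}
print(summarize_generation(example))
```

Missing keys come back as `None` rather than raising, which is convenient if the response shape differs from what is assumed here.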

fast stream
#

@steep garnet Thank you very much. I went too fast: I tested this late last night after a long day of coding and thought that usage corresponded to the generation speed... 🤦‍♂️

Is it possible to get the input usage and the output usage too?

Ok, now I understand the difference between native_tokens and tokens, thanks! (I had looked in the doc but couldn't find anything about it)

I have another question: what is the 5€/1K requests charge on the Perplexity models? Is it included in the price per 1K tokens shown on the docs page, or is it extra? Does the generation.usage value for these models include it, or is the API key charged the 5€/1K requests separately?

Thanks for your quick reply!

bitter canopy
#

The docs page doesn't include per-request pricing yet. It is only relevant for multimodal models (when you submit images, you get charged per image) and online models (Perplexity).

#

The /models page has everything

#

Sorry about that - docs page is getting redone soon

#

cc @tribal fable

fast stream
#

@bitter canopy Okay thanks, so for Perplexity models, does the usage returned by the /api/v1/generation endpoint include the full cost or only the token cost?
It would be nice if it either returned the total cost directly, or also included a new key with additional_usage, for example.

(Also, the /api/v1/generation endpoint response is wrapped in a data object, but the doc example doesn't show that. I don't know if this is intended, but I wanted to let you know.)

Thanks

bitter canopy
#

@fast stream the usage in the generation endpoint includes the full cost

#

including the request cost
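
Since usage already includes the per-request fee, the token-only part can be estimated by subtracting that fee. A rough sketch of the arithmetic, using the 5 per 1K requests figure quoted above; it assumes the per-request price and usage are expressed in the same unit, and the usage value and the helper name `split_usage` are hypothetical.

```python
from decimal import Decimal


def split_usage(usage: Decimal, price_per_1k_requests: Decimal) -> tuple[Decimal, Decimal]:
    """Split a total usage value into (per-request fee, remaining token cost)."""
    # A price quoted per 1K requests means one request costs price / 1000.
    request_fee = price_per_1k_requests / Decimal(1000)
    # ASSUMPTION: usage and the per-request price share the same unit.
    token_cost = usage - request_fee
    return request_fee, token_cost


# Hypothetical usage value for a single generation on an online model.
request_fee, token_cost = split_usage(Decimal("0.0062"), Decimal("5"))
print(request_fee, token_cost)
```

`Decimal` is used instead of `float` so that small amounts do not pick up binary rounding noise.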

#

good catch re: data, will fix that in the docs!
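
Until the docs are updated, a client can tolerate both shapes by unwrapping `data` when it is present. This is only a sketch: the `id` query parameter, the Bearer authorization header, and the function names are assumptions not confirmed in this thread, and `base_url` is left for the caller to supply.

```python
import json
import urllib.parse
import urllib.request
from collections.abc import Mapping
from typing import Any


def unwrap_generation(payload: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the inner record if the payload is wrapped in a `data` object."""
    inner = payload.get("data")
    return inner if isinstance(inner, Mapping) else payload


def fetch_generation(base_url: str, generation_id: str, api_key: str) -> Mapping[str, Any]:
    """Fetch one generation record and unwrap it."""
    # ASSUMPTION: the generation is selected with an `id` query parameter.
    query = urllib.parse.urlencode({"id": generation_id})
    request = urllib.request.Request(
        f"{base_url}/api/v1/generation?{query}",
        # ASSUMPTION: the API key is sent as a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return unwrap_generation(json.load(response))
```

With this in place, `fetch_generation(...)["usage"]` would give the full cost, including the request cost, whether or not the response is wrapped.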