#Which of the two ways to calculate tokens in Java should I choose?


hardy ermine
Which of the two ways to calculate tokens in Java should I choose?
I am calling the ChatGPT 3.5 API from Java and need to count the tokens in both the question and the answer. Which of the two Java libraries marked below should I use?

"Tokenizer libraries by language
For cl100k_base and p50k_base encodings:
Python: tiktoken
.NET / C#: SharpToken, TiktokenSharp
Java: jtokkit (option 1: https://github.com/knuddelsgmbh/jtokkit)
For r50k_base (gpt2) encodings, tokenizers are available in many languages.
Python: tiktoken (or alternatively GPT2TokenizerFast)
JavaScript: gpt-3-encoder
.NET / C#: GPT Tokenizer
Java: gpt2-tokenizer-java (option 2: https://github.com/hyunwoongko/gpt2-tokenizer-java)
PHP: GPT-3-Encoder-PHP
(OpenAI makes no endorsements or guarantees of third-party libraries.)"

GitHub: knuddelsgmbh/jtokkit
JTokkit is a Java tokenizer library designed for use with OpenAI models.

GitHub: hyunwoongko/gpt2-tokenizer-java
Java implementation of GPT2 tokenizer.
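For context on the choice: gpt-3.5-turbo uses the cl100k_base encoding, so going by the quoted list, jtokkit is the Java entry that matches it, while gpt2-tokenizer-java covers the older r50k_base (gpt2) encoding. Below is a minimal sketch of how the counting could be wired up. The class name `ChatTokenCounter`, the `Message` type, and the per-message overhead constants are assumptions for illustration (the constants follow the OpenAI cookbook guidance for gpt-3.5-turbo-0613 and may differ for other model versions). The tokenizer is passed in as a plain function so the sketch runs without any dependency; the jtokkit calls are shown only in a comment.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

final class ChatTokenCounter {
    // Assumed overheads (OpenAI cookbook guidance for gpt-3.5-turbo-0613);
    // check them against the model version you actually call.
    static final int TOKENS_PER_MESSAGE = 3;
    static final int REPLY_PRIMING_TOKENS = 3;

    /** One chat message: a role ("system", "user", "assistant") and its text. */
    static final class Message {
        final String role;
        final String content;

        Message(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    /**
     * Estimates the prompt tokens for a list of chat messages.
     * The tokenizer is any function returning the token count of a string.
     */
    static int countPromptTokens(List<Message> messages, ToIntFunction<String> tokenizer) {
        int total = REPLY_PRIMING_TOKENS;
        for (Message m : messages) {
            total += TOKENS_PER_MESSAGE
                    + tokenizer.applyAsInt(m.role)
                    + tokenizer.applyAsInt(m.content);
        }
        return total;
    }

    public static void main(String[] args) {
        // Stand-in tokenizer (whitespace word count) so this runs as-is.
        // It is NOT a real BPE tokenizer. With jtokkit on the classpath,
        // one would pass its counter instead, roughly:
        //   Encoding enc = Encodings.newDefaultEncodingRegistry()
        //                           .getEncoding(EncodingType.CL100K_BASE);
        //   countPromptTokens(messages, enc::countTokens);
        ToIntFunction<String> wordCount =
                s -> s.trim().isEmpty() ? 0 : s.trim().split("\\s+").length;

        List<Message> messages = Arrays.asList(
                new Message("system", "You are a helpful assistant."),
                new Message("user", "How many tokens is this?"));

        System.out.println(countPromptTokens(messages, wordCount));
    }
}
```

The answer side needs no overhead arithmetic: passing the returned answer text to the same tokenizer function gives its count. The `usage` field in the API response reports the billed counts and is a useful cross-check for any local estimate.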

trim nymph