All Languages Are NOT Created (Tokenized) Equal
Topbots
JUNE 15, 2023
Large language models such as ChatGPT process and generate text by first splitting it into smaller units called tokens. The number of tokens needed to convey the same message varies widely by language: measured against English, the ratio is 9 times for Armenian and over 10 times for Burmese. In other words, to express the same sentiment, some languages require roughly 10 times as many tokens as English.
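To make the ratio concrete, here is a minimal sketch of how such a comparison could be computed. It is not the article's method: the function names `byte_tokens` and `token_ratio` are hypothetical, and the default tokenizer simply counts UTF-8 bytes as a stand-in, which is an assumption and not how ChatGPT tokenizes text. Any real tokenizer that maps a string to a list of tokens could be passed in instead.

```python
from typing import Callable, List


def byte_tokens(text: str) -> List[int]:
    """Stand-in tokenizer (assumption): one token per UTF-8 byte.

    This is NOT ChatGPT's tokenizer; it only illustrates why scripts
    outside the Latin range can cost more units per character.
    """
    return list(text.encode("utf-8"))


def token_ratio(
    text: str,
    english: str,
    tokenize: Callable[[str], List[int]] = byte_tokens,
) -> float:
    """Return how many times more tokens `text` needs than `english`."""
    base = len(tokenize(english))
    if base == 0:
        raise ValueError("english reference text must not be empty")
    return len(tokenize(text)) / base


if __name__ == "__main__":
    # "բարև" is an Armenian greeting; each of its 4 letters takes 2 bytes in UTF-8.
    print(token_ratio("բարև", "hello"))
```

With the byte-level stand-in, the ratio for this single word is far smaller than the figures quoted above, which come from a real model tokenizer applied to full messages. Swapping in a real tokenizer through the `tokenize` parameter is the intended way to reproduce a comparison of that kind.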