Token
The word “token” traces back to the Old English noun tācen, meaning “sign, symbol, or evidence.” It is Germanic in origin and ultimately stems from the Proto-Indo-European root *deyḱ-, which means “to show,” “to point out,” or “to teach.”
In the context of AI, a token is the basic unit of data that an AI model processes and generates. Instead of reading whole words, AI models like ChatGPT break text down into smaller fragments, such as short whole words, parts of words, or punctuation marks. Each fragment is then mapped to a numeric ID, and the model operates on the resulting sequence of numbers rather than on the raw text.
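As a rough illustration of the split-then-number idea, here is a minimal sketch in Python. The vocabulary and the function name toy_tokenize are hypothetical and invented for this example. Real models learn vocabularies with tens of thousands of entries and use more elaborate splitting rules, so this is not how any particular model tokenizes text.

```python
# Hypothetical toy vocabulary: text fragment -> integer ID.
# Real vocabularies are learned from data and are far larger.
TOY_VOCAB = {
    "token": 0,
    "ization": 1,
    " is": 2,
    " fun": 3,
    "!": 4,
}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Split text into known fragments using greedy longest-match.

    Returns a pair (fragments, ids). Raises ValueError if some part of
    the text is not covered by the vocabulary.
    """
    fragments, ids = [], []
    pos = 0
    while pos < len(text):
        # Try the longest possible fragment first, then shrink.
        for end in range(len(text), pos, -1):
            piece = text[pos:end]
            if piece in vocab:
                fragments.append(piece)
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no vocabulary entry matches at position {pos}")
    return fragments, ids


fragments, ids = toy_tokenize("tokenization is fun!")
print(fragments)  # ['token', 'ization', ' is', ' fun', '!']
print(ids)        # [0, 1, 2, 3, 4]
```

Note that the single word "tokenization" becomes two tokens here, while the leading space is carried inside " is" and " fun". Under the assumptions of this toy vocabulary, that is why token counts and word counts do not line up one to one.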
How Tokens Translate to Words
While exact ratios vary between models, a general rule of thumb for English text is:
- 1 token ≈ 4 characters
- 1 token ≈ 0.75 words (or 3/4 of a word)
- 100 tokens ≈ 75 words
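The rule of thumb above can be turned into a quick estimator. The sketch below is in Python, and the constant and function names are made up for this example. It only applies the approximate English-text ratios, so the results are ballpark figures, and an exact count requires the tokenizer of the specific model.

```python
import math

# Approximate ratios for English text, taken from the rule of thumb.
CHARS_PER_TOKEN = 4      # 1 token is roughly 4 characters
WORDS_PER_TOKEN = 0.75   # 1 token is roughly 0.75 words


def estimate_tokens_from_chars(text):
    """Estimate the token count from the number of characters."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(text):
    """Estimate the token count from the number of whitespace-separated words."""
    return math.ceil(len(text.split()) / WORDS_PER_TOKEN)


def estimate_words_from_tokens(n_tokens):
    """Estimate how many words a given number of tokens corresponds to."""
    return round(n_tokens * WORDS_PER_TOKEN)


print(estimate_words_from_tokens(100))              # 75
print(estimate_tokens_from_words("one two three"))  # 4
```

The two text-based estimates will usually disagree a little for the same input, since one counts characters and the other counts words. Rounding up is a deliberate choice here: when budgeting against a token limit, a slight overestimate is the safer error.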