tiktoken
Fast BPE tokenizer from OpenAI, written in Rust. Used to count tokens and encode/decode text for OpenAI models. 3-6x faster than comparable Python tokenizers. Does NOT call any API — purely local computation. Requires a Rust compiler at build time on platforms without pre-built wheels. Package name and import name are both 'tiktoken'.
Warnings
- breaking get_encoding() takes an encoding name, NOT a model name. tiktoken.get_encoding('gpt-4o') raises KeyError. tiktoken.get_encoding('cl100k_base') is correct. This is the single most common error for new users.
- breaking gpt-4o and all o-series models (o1, o3, o4-mini) use o200k_base, NOT cl100k_base. Code that hardcodes cl100k_base for all OpenAI models silently produces wrong token counts for newer models — undercounting or overcounting by up to 10-15% depending on content.
- gotcha On first use, tiktoken downloads encoding vocab files (~1MB each) from OpenAI's CDN. In firewalled/offline environments this silently hangs or raises a connection error, not an ImportError.
- gotcha Token count from enc.encode(text) counts tokens in raw text only. Chat completions add 3 overhead tokens per message and 3 tokens for the assistant reply primer. Omitting this overhead causes off-by-N errors in context window management.
- gotcha Building from source requires Rust. On platforms or Python versions without a pre-built wheel, pip install tiktoken triggers a Rust compile. If Rust is not installed, the install fails with a cryptic error about 'cargo' not found.
- gotcha Special tokens like <|endoftext|> are not encoded by default. enc.encode('<|endoftext|>') raises ValueError. Must explicitly allow them.
Install
- pip install tiktoken
- pip install tiktoken --no-binary tiktoken
Imports
- get_encoding: import tiktoken; enc = tiktoken.get_encoding('o200k_base')
- encoding_for_model: import tiktoken; enc = tiktoken.encoding_for_model('gpt-4o')
Quickstart
import tiktoken
# Get encoding by model name (recommended)
enc = tiktoken.encoding_for_model('gpt-4o') # returns o200k_base
# Or get encoding directly by name
enc = tiktoken.get_encoding('o200k_base')
# Encode text → list of token integers
tokens = enc.encode('Hello, world!')
print(tokens) # list of token ids (exact values depend on the encoding)
print(len(tokens)) # 4
# Decode tokens → string
text = enc.decode(tokens)
print(text) # 'Hello, world!'
# Count tokens for a chat message (accounts for message overhead)
def count_chat_tokens(messages: list[dict], model: str = 'gpt-4o') -> int:
    enc = tiktoken.encoding_for_model(model)
    tokens_per_message = 3  # every message has <|im_start|>, role, <|im_end|>
    tokens_per_name = 1
    total = 0
    for msg in messages:
        total += tokens_per_message
        for key, value in msg.items():
            total += len(enc.encode(value))
            if key == 'name':
                total += tokens_per_name
    total += 3  # reply is primed with <|im_start|>assistant
    return total
messages = [{'role': 'user', 'content': 'How many tokens is this?'}]
print(count_chat_tokens(messages)) # ~13: 3 overhead + role + content tokens + 3 reply primer