tiktoken

0.12.0 · python · verified Tue May 12 · auth: no · install: verified · quickstart: verified

Fast BPE tokenizer from OpenAI, written in Rust. Used to count tokens and encode/decode text for OpenAI models. 3-6x faster than comparable Python tokenizers. Does NOT call any API — purely local computation. Requires a Rust compiler at build time on platforms without pre-built wheels. Package name and import name are both 'tiktoken'.

pip install tiktoken
error ModuleNotFoundError: No module named 'tiktoken'
cause The 'tiktoken' package is not installed in the Python environment running the script.
fix pip install tiktoken
error Failed building wheel for tiktoken
cause When no pre-built wheel exists for your platform and Python version, tiktoken builds from source and needs a Rust compiler; the Rust toolchain or system build tools are missing.
fix Install the Rust toolchain (e.g., via rustup) and system build tools (build-essential on Debian/Ubuntu, Xcode Command Line Tools on macOS) before running pip install tiktoken.
error KeyError: This model is not supported
cause The model name passed to tiktoken.encoding_for_model() is misspelled or belongs to a newer model that your installed tiktoken version does not yet know about.
fix Verify the model name against OpenAI's documentation (e.g., 'gpt-4', 'gpt-3.5-turbo') and upgrade with pip install --upgrade tiktoken.
breaking get_encoding() takes an encoding name, NOT a model name. tiktoken.get_encoding('gpt-4o') raises KeyError. tiktoken.get_encoding('cl100k_base') is correct. This is the single most common error for new users.
fix Use tiktoken.encoding_for_model('gpt-4o') to look up by model name, or pass the correct encoding name to get_encoding().
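A minimal sketch of the two lookup paths (both functions are tiktoken's public API; the commented-out line is the failing call):

import tiktoken

enc = tiktoken.encoding_for_model('gpt-4o')  # model name -> Encoding (resolves to o200k_base)
enc = tiktoken.get_encoding('o200k_base')    # encoding name -> Encoding
# tiktoken.get_encoding('gpt-4o')            # KeyError: 'gpt-4o' is a model name, not an encoding name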
breaking gpt-4o and all o-series models (o1, o3, o4-mini) use o200k_base, NOT cl100k_base. Code that hardcodes cl100k_base for all OpenAI models silently produces wrong token counts for newer models — undercounting or overcounting by up to 10-15% depending on content.
fix Always use tiktoken.encoding_for_model(model_name) to get the correct encoding. Never hardcode cl100k_base as a universal default.
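A quick way to inspect the mapping, assuming these model names are present in your installed version's model table:

import tiktoken

for model in ('gpt-4', 'gpt-3.5-turbo', 'gpt-4o'):
    enc = tiktoken.encoding_for_model(model)
    print(model, '->', enc.name)  # gpt-4 / gpt-3.5-turbo -> cl100k_base, gpt-4o -> o200k_base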
gotcha On first use, tiktoken downloads encoding vocab files (~1MB each) from OpenAI's CDN. In firewalled or offline environments that first lookup hangs or raises a connection error, not an ImportError, so the failure is easy to misdiagnose.
fix Pre-warm the cache in a networked environment: tiktoken.get_encoding('o200k_base') and tiktoken.get_encoding('cl100k_base'). Set TIKTOKEN_CACHE_DIR to a writable path. Vocab files are then reused from disk.
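A sketch of the pre-warm step; '/opt/tiktoken-cache' is a placeholder path, and TIKTOKEN_CACHE_DIR must be set before the first encoding lookup:

import os
os.environ['TIKTOKEN_CACHE_DIR'] = '/opt/tiktoken-cache'  # placeholder; any writable path works

import tiktoken

# First lookup downloads each vocab file into the cache dir; later runs read from disk.
for name in ('o200k_base', 'cl100k_base'):
    tiktoken.get_encoding(name)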
gotcha Token count from enc.encode(text) counts tokens in raw text only. Chat completions add 3 overhead tokens per message and 3 tokens for the assistant reply primer. Omitting this overhead causes off-by-N errors in context window management.
fix Use the OpenAI Cookbook's num_tokens_from_messages() pattern, which adds the per-message overhead; the count_chat_tokens() helper in the quickstart below implements it. Don't use raw encode() length for chat token budgeting.
gotcha Building from source requires Rust. On platforms or Python versions without a pre-built wheel, pip install tiktoken triggers a Rust compile. If Rust is not installed, the install fails with a cryptic error about 'cargo' not found.
fix Install Rust via rustup (https://rustup.rs) before pip install if on an unsupported platform. Or use a Docker image with tiktoken pre-installed.
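If a source build is unavoidable, the standard rustup installer (from https://rustup.rs) sets up the toolchain on Linux and macOS:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
pip install tiktoken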
gotcha Special tokens like <|endoftext|> are not encoded by default. enc.encode('<|endoftext|>') raises ValueError. Must explicitly allow them.
fix enc.encode('<|endoftext|>', allowed_special={'<|endoftext|>'}) or enc.encode(text, allowed_special='all') to allow all special tokens.
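A minimal sketch of the three behaviors; allowed_special and disallowed_special are parameters of Encoding.encode():

import tiktoken

enc = tiktoken.get_encoding('o200k_base')

# enc.encode('<|endoftext|>')  # raises ValueError: special tokens are disallowed by default
as_special = enc.encode('<|endoftext|>', allowed_special={'<|endoftext|>'})  # one special-token ID
allow_all = enc.encode('<|endoftext|>', allowed_special='all')               # permit every special token
as_plain = enc.encode('<|endoftext|>', disallowed_special=())                # encode as ordinary text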
pip install tiktoken --no-binary tiktoken

python  os / libc      status  wheel  install  import  disk
3.9     alpine (musl)  -       -      5.53s    -       26.7M
3.9     alpine (musl)  -       -      -        -       -
3.9     slim (glibc)   -       -      4.89s    -       28M
3.9     slim (glibc)   -       -      -        -       -
3.10    alpine (musl)  -       -      5.01s    -       27.4M
3.10    alpine (musl)  -       -      -        -       -
3.10    slim (glibc)   -       -      4.96s    -       28M
3.10    slim (glibc)   -       -      -        -       -
3.11    alpine (musl)  -       -      5.21s    -       29.9M
3.11    alpine (musl)  -       -      -        -       -
3.11    slim (glibc)   -       -      7.80s    -       31M
3.11    slim (glibc)   -       -      -        -       -
3.12    alpine (musl)  -       -      4.83s    -       21.6M
3.12    alpine (musl)  -       -      -        -       -
3.12    slim (glibc)   -       -      5.25s    -       22M
3.12    slim (glibc)   -       -      -        -       -
3.13    alpine (musl)  -       -      5.50s    -       21.2M
3.13    alpine (musl)  -       -      -        -       -
3.13    slim (glibc)   -       -      5.34s    -       22M
3.13    slim (glibc)   -       -      -        -       -

encoding_for_model() is safer than hardcoding encoding names: it handles the model→encoding mapping automatically and stays correct for new models as long as your installed tiktoken is recent enough to know them. Token counts for chat completions must add per-message overhead (3 tokens per message, plus 3 for the reply primer) to get accurate context-budget and billing estimates.

import tiktoken

# Get encoding by model name (recommended)
enc = tiktoken.encoding_for_model('gpt-4o')  # returns o200k_base

# Or get encoding directly by name
enc = tiktoken.get_encoding('o200k_base')

# Encode text → list of token integers
tokens = enc.encode('Hello, world!')
print(tokens)        # four token IDs; exact values depend on the encoding
print(len(tokens))   # 4

# Decode tokens → string
text = enc.decode(tokens)
print(text)  # 'Hello, world!'

# Count tokens for a chat message (accounts for message overhead)
def count_chat_tokens(messages: list[dict], model: str = 'gpt-4o') -> int:
    enc = tiktoken.encoding_for_model(model)
    tokens_per_message = 3  # every message has <|im_start|>, role, <|im_end|>
    tokens_per_name = 1
    total = 0
    for msg in messages:
        total += tokens_per_message
        for key, value in msg.items():
            total += len(enc.encode(value))
            if key == 'name':
                total += tokens_per_name
    total += 3  # reply is primed with <|im_start|>assistant
    return total

messages = [{'role': 'user', 'content': 'How many tokens is this?'}]
print(count_chat_tokens(messages))  # ~13: 3 message overhead + role + content tokens + 3 reply primer