tiktoken

0.12.0 · python · verified Tue May 12 · auth: no · install: verified · quickstart: verified

Fast BPE tokenizer from OpenAI, written in Rust. Used to count tokens and encode/decode text for OpenAI models. 3-6x faster than comparable Python tokenizers. Does NOT call any API — purely local computation. Requires a Rust compiler at build time on platforms without pre-built wheels. Package name and import name are both 'tiktoken'.

pip install tiktoken
error ModuleNotFoundError: No module named 'tiktoken'
cause The 'tiktoken' package is not installed in the Python environment running the script.
fix pip install tiktoken
error Failed building wheel for tiktoken
cause When no pre-built wheel exists for your platform and Python version, tiktoken builds from source and needs a Rust compiler; the Rust toolchain or system build tools are missing.
fix Install the Rust toolchain (e.g., via rustup) and system build tools (build-essential on Debian/Ubuntu, Xcode Command Line Tools on macOS) before running pip install tiktoken.
error KeyError: This model is not supported
cause The model name passed to tiktoken.encoding_for_model() is misspelled or belongs to a newer model that your installed tiktoken version does not yet know about.
fix Verify the model name against OpenAI's documentation (e.g., 'gpt-4', 'gpt-3.5-turbo') and upgrade with pip install --upgrade tiktoken.
breaking get_encoding() takes an encoding name, NOT a model name. tiktoken.get_encoding('gpt-4o') raises KeyError. tiktoken.get_encoding('cl100k_base') is correct. This is the single most common error for new users.
fix Use tiktoken.encoding_for_model('gpt-4o') to look up by model name, or pass the correct encoding name to get_encoding().
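A minimal sketch of the two lookup paths (both functions are tiktoken's public API; the commented-out line is the failing call):

import tiktoken

enc = tiktoken.encoding_for_model('gpt-4o')  # model name -> Encoding (resolves to o200k_base)
enc = tiktoken.get_encoding('o200k_base')    # encoding name -> Encoding
# tiktoken.get_encoding('gpt-4o')            # KeyError: 'gpt-4o' is a model name, not an encoding name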
breaking gpt-4o and all o-series models (o1, o3, o4-mini) use o200k_base, NOT cl100k_base. Code that hardcodes cl100k_base for all OpenAI models silently produces wrong token counts for newer models — undercounting or overcounting by up to 10-15% depending on content.
fix Always use tiktoken.encoding_for_model(model_name) to get the correct encoding. Never hardcode cl100k_base as a universal default.
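A quick way to inspect the mapping, assuming these model names are present in your installed version's model table:

import tiktoken

for model in ('gpt-4', 'gpt-3.5-turbo', 'gpt-4o'):
    enc = tiktoken.encoding_for_model(model)
    print(model, '->', enc.name)  # gpt-4 / gpt-3.5-turbo -> cl100k_base, gpt-4o -> o200k_base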
gotcha On first use, tiktoken downloads encoding vocab files (~1MB each) from OpenAI's CDN. In firewalled or offline environments that first lookup hangs or raises a connection error, not an ImportError, so the failure is easy to misdiagnose.
fix Pre-warm the cache in a networked environment: tiktoken.get_encoding('o200k_base') and tiktoken.get_encoding('cl100k_base'). Set TIKTOKEN_CACHE_DIR to a writable path. Vocab files are then reused from disk.
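A sketch of the pre-warm step; '/opt/tiktoken-cache' is a placeholder path, and TIKTOKEN_CACHE_DIR must be set before the first encoding lookup:

import os
os.environ['TIKTOKEN_CACHE_DIR'] = '/opt/tiktoken-cache'  # placeholder; any writable path works

import tiktoken

# First lookup downloads each vocab file into the cache dir; later runs read from disk.
for name in ('o200k_base', 'cl100k_base'):
    tiktoken.get_encoding(name)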
gotcha Token count from enc.encode(text) counts tokens in raw text only. Chat completions add 3 overhead tokens per message and 3 tokens for the assistant reply primer. Omitting this overhead causes off-by-N errors in context window management.
fix Use the OpenAI Cookbook's num_tokens_from_messages() pattern, which adds the per-message overhead; the count_chat_tokens() helper in the quickstart below implements it. Don't use raw encode() length for chat token budgeting.
gotcha Building from source requires Rust. On platforms or Python versions without a pre-built wheel, pip install tiktoken triggers a Rust compile. If Rust is not installed, the install fails with a cryptic error about 'cargo' not found.
fix Install Rust via rustup (https://rustup.rs) before pip install if on an unsupported platform. Or use a Docker image with tiktoken pre-installed.
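If a source build is unavoidable, the standard rustup installer (from https://rustup.rs) sets up the toolchain on Linux and macOS:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
pip install tiktoken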
gotcha Special tokens like <|endoftext|> are not encoded by default. enc.encode('<|endoftext|>') raises ValueError. Must explicitly allow them.
fix enc.encode('<|endoftext|>', allowed_special={'<|endoftext|>'}) or enc.encode(text, allowed_special='all') to allow all special tokens.
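A minimal sketch of the three behaviors; allowed_special and disallowed_special are parameters of Encoding.encode():

import tiktoken

enc = tiktoken.get_encoding('o200k_base')

# enc.encode('<|endoftext|>')  # raises ValueError: special tokens are disallowed by default
as_special = enc.encode('<|endoftext|>', allowed_special={'<|endoftext|>'})  # one special-token ID
allow_all = enc.encode('<|endoftext|>', allowed_special='all')               # permit every special token
as_plain = enc.encode('<|endoftext|>', disallowed_special=())                # encode as ordinary text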
pip install tiktoken --no-binary tiktoken

python  os / libc      status  wheel  install  import  disk
3.9     alpine (musl)  -       -      5.53s    -       26.7M
3.9     alpine (musl)  -       -      -        -       -
3.9     slim (glibc)   -       -      4.89s    -       28M
3.9     slim (glibc)   -       -      -        -       -
3.10    alpine (musl)  -       -      5.01s    -       27.4M
3.10    alpine (musl)  -       -      -        -       -
3.10    slim (glibc)   -       -      4.96s    -       28M
3.10    slim (glibc)   -       -      -        -       -
3.11    alpine (musl)  -       -      5.21s    -       29.9M
3.11    alpine (musl)  -       -      -        -       -
3.11    slim (glibc)   -       -      7.80s    -       31M
3.11    slim (glibc)   -       -      -        -       -
3.12    alpine (musl)  -       -      4.83s    -       21.6M
3.12    alpine (musl)  -       -      -        -       -
3.12    slim (glibc)   -       -      5.25s    -       22M
3.12    slim (glibc)   -       -      -        -       -
3.13    alpine (musl)  -       -      5.50s    -       21.2M
3.13    alpine (musl)  -       -      -        -       -
3.13    slim (glibc)   -       -      5.34s    -       22M
3.13    slim (glibc)   -       -      -        -       -

encoding_for_model() is safer than hardcoding encoding names: it handles the model→encoding mapping automatically and stays correct for new models as long as your installed tiktoken is recent enough to know them. Token counts for chat completions must add per-message overhead (3 tokens per message, plus 3 for the reply primer) to get accurate context-budget and billing estimates.

import tiktoken

# Get encoding by model name (recommended)
enc = tiktoken.encoding_for_model('gpt-4o')  # returns o200k_base

# Or get encoding directly by name
enc = tiktoken.get_encoding('o200k_base')

# Encode text → list of token integers
tokens = enc.encode('Hello, world!')
print(tokens)        # four token IDs; exact values depend on the encoding
print(len(tokens))   # 4

# Decode tokens → string
text = enc.decode(tokens)
print(text)  # 'Hello, world!'

# Count tokens for a chat message (accounts for message overhead)
def count_chat_tokens(messages: list[dict], model: str = 'gpt-4o') -> int:
    enc = tiktoken.encoding_for_model(model)
    tokens_per_message = 3  # every message has <|im_start|>, role, <|im_end|>
    tokens_per_name = 1
    total = 0
    for msg in messages:
        total += tokens_per_message
        for key, value in msg.items():
            total += len(enc.encode(value))
            if key == 'name':
                total += tokens_per_name
    total += 3  # reply is primed with <|im_start|>assistant
    return total

messages = [{'role': 'user', 'content': 'How many tokens is this?'}]
print(count_chat_tokens(messages))  # ~13: 3 message overhead + role + content tokens + 3 reply primer