{"id":75,"library":"tiktoken","title":"tiktoken","description":"Fast BPE tokenizer from OpenAI, written in Rust. Used to count tokens and encode/decode text for OpenAI models. 3-6x faster than comparable Python tokenizers. Does NOT call any API: tokenization runs locally (encoding vocab files are downloaded once and cached). Requires a Rust compiler at build time on platforms without pre-built wheels. Package name and import name are both 'tiktoken'.","status":"active","version":"0.12.0","language":"python","source_language":"en","source_url":"https://github.com/openai/tiktoken","tags":["tiktoken","openai","tokenizer","bpe","token-counting","gpt-4o","cl100k","o200k","context-window"],"install":[{"cmd":"pip install tiktoken","lang":"bash","label":"Standard install"},{"cmd":"pip install tiktoken --no-binary tiktoken","lang":"bash","label":"Build from source (requires Rust toolchain)"}],"dependencies":[{"reason":"Required. Used for the BPE pre-tokenization patterns.","package":"regex","optional":false},{"reason":"Required. Used to download encoding vocab files on first use (fetches from OpenAI CDN). For offline use, point TIKTOKEN_CACHE_DIR at a pre-populated cache so that no download is needed.","package":"requests","optional":false},{"reason":"Optional. Used as an alternative backend for fetching encoding files.","package":"blobfile","optional":true}],"imports":[{"note":"get_encoding() takes an encoding name (e.g. 'o200k_base'), NOT a model name. Passing a model name raises ValueError (unknown encoding).","wrong":"import tiktoken; enc = tiktoken.get_encoding('gpt-4o')","symbol":"get_encoding","correct":"import tiktoken; enc = tiktoken.get_encoding('o200k_base')"},{"note":"encoding_for_model() takes a model name and returns the correct encoding. 
Preferred over hardcoding encoding names.","symbol":"encoding_for_model","correct":"import tiktoken; enc = tiktoken.encoding_for_model('gpt-4o')"}],"quickstart":{"code":"import tiktoken\n\n# Get encoding by model name (recommended)\nenc = tiktoken.encoding_for_model('gpt-4o')  # returns o200k_base\n\n# Or get encoding directly by name\nenc = tiktoken.get_encoding('o200k_base')\n\n# Encode text → list of token integers\ntokens = enc.encode('Hello, world!')\nprint(tokens)        # 4 token IDs; exact values depend on the encoding\nprint(len(tokens))   # 4\n\n# Decode tokens → string\ntext = enc.decode(tokens)\nprint(text)  # Hello, world!\n\n# Count tokens for a list of chat messages (accounts for message overhead)\ndef count_chat_tokens(messages: list[dict], model: str = 'gpt-4o') -> int:\n    enc = tiktoken.encoding_for_model(model)\n    tokens_per_message = 3  # every message has <|im_start|>, role, <|im_end|>\n    tokens_per_name = 1\n    total = 0\n    for msg in messages:\n        total += tokens_per_message\n        for key, value in msg.items():\n            total += len(enc.encode(value))  # assumes every value is a str\n            if key == 'name':\n                total += tokens_per_name\n    total += 3  # reply is primed with <|im_start|>assistant\n    return total\n\nmessages = [{'role': 'user', 'content': 'How many tokens is this?'}]\nprint(count_chat_tokens(messages))  # ~12","lang":"python","description":"encoding_for_model() is safer than hardcoding encoding names: it handles the model→encoding mapping automatically, and upgrading tiktoken picks up mappings for newly released models. Token counts for chat completions must add per-message overhead (3 tokens per message, plus 3 for the reply primer) to give accurate billing estimates."},"warnings":[{"fix":"Use tiktoken.encoding_for_model('gpt-4o') to look up by model name, or pass the correct encoding name to get_encoding().","message":"get_encoding() takes an encoding name, NOT a model name. tiktoken.get_encoding('gpt-4o') raises ValueError. tiktoken.get_encoding('cl100k_base') is correct. 
This is the single most common error for new users.","severity":"breaking","affected_versions":"all"},{"fix":"Always use tiktoken.encoding_for_model(model_name) to get the correct encoding. Never hardcode cl100k_base as a universal default.","message":"gpt-4o and all o-series models (o1, o3, o4-mini) use o200k_base, NOT cl100k_base. Code that hardcodes cl100k_base for all OpenAI models silently produces wrong token counts for newer models, off by up to 10-15% in either direction depending on content.","severity":"breaking","affected_versions":"all"},{"fix":"Set TIKTOKEN_CACHE_DIR to a writable path, then pre-warm the cache in a networked environment by calling tiktoken.get_encoding('o200k_base') and tiktoken.get_encoding('cl100k_base'). Vocab files are then reused from disk.","message":"On first use, tiktoken downloads encoding vocab files (on the order of a few MB each) from OpenAI's CDN. In firewalled/offline environments the first get_encoding() or encoding_for_model() call can hang or raise a connection error; the failure does not appear as an ImportError.","severity":"gotcha","affected_versions":"all"},{"fix":"Use the OpenAI Cookbook's num_tokens_from_messages() pattern, which adds per-message overhead. Don't use raw encode() length for chat token budgeting.","message":"Token count from enc.encode(text) counts tokens in raw text only. Chat completions add 3 overhead tokens per message and 3 tokens for the assistant reply primer. Omitting this overhead causes off-by-N errors in context window management.","severity":"gotcha","affected_versions":"all"},{"fix":"Install Rust via rustup (https://rustup.rs) before pip install if on an unsupported platform. Or use a Docker image with tiktoken pre-installed.","message":"Building from source requires Rust. On platforms or Python versions without a pre-built wheel, pip install tiktoken triggers a Rust compile. 
If Rust is not installed, the install fails with a cryptic error about 'cargo' not found.","severity":"gotcha","affected_versions":"all"},{"fix":"enc.encode('<|endoftext|>', allowed_special={'<|endoftext|>'}) or enc.encode(text, allowed_special='all') to allow all special tokens. To encode the same text as ordinary characters instead, pass disallowed_special=().","message":"Special tokens like <|endoftext|> are not encoded by default. enc.encode('<|endoftext|>') raises ValueError. They must be explicitly allowed.","severity":"gotcha","affected_versions":"all"}],"env_vars":null,"last_verified":"2026-05-12T07:27:33.895Z","next_check":"2026-05-28T00:00:00.000Z","problems":[{"fix":"pip install tiktoken","cause":"The 'tiktoken' package is not installed in the Python environment where the script is being executed.","error":"ModuleNotFoundError: No module named 'tiktoken'"},{"fix":"Ensure you have the Rust toolchain installed (e.g., via 'rustup' for most platforms) and the essential build tools (like 'build-essential' on Linux, Xcode Command Line Tools on macOS) before running `pip install tiktoken`.","cause":"tiktoken requires a Rust compiler to build from source if a pre-built wheel isn't available for your specific platform and Python version; the Rust toolchain or necessary system build tools are missing.","error":"Failed building wheel for tiktoken"},{"fix":"Verify the model name against OpenAI's documentation (e.g., 'gpt-4', 'gpt-3.5-turbo') and ensure your `tiktoken` library is up-to-date by running `pip install --upgrade tiktoken`. If the model is still unmapped, call `tiktoken.get_encoding()` with an explicit encoding name.","cause":"The model name provided to `tiktoken.encoding_for_model()` is either incorrect, misspelled, or corresponds to a newer model that is not yet supported by your installed tiktoken version.","error":"KeyError: 'Could not automatically map <model_name> to a tokeniser. Please use `tiktoken.get_encoding` to explicitly get the tokeniser you expect.'"}],"ecosystem":"pypi","meta_description":null,"install_score":85,"install_tag":"verified","quickstart_score":80,"quickstart_tag":"verified","pypi_latest":null,"cli_name":null,"install_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"installs cleanly on critical 
runtimes, fast import, recently tested","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.01,"mem_mb":51.7,"disk_size":"27.4M"},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.96,"mem_mb":51.7,"disk_size":"28M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.21,"mem_mb":52.8,"disk_size":"29.9M"},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":7.8,"mem_mb":52.8,"disk_size":"31M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.83,"mem_mb":52.5,"disk_size":"21.6M"},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.25,"mem_mb":52.5,"disk_size":"22M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.5,"mem_mb":52.8,"disk_size":"21.2M"},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.34,"mem_mb":52.8,"disk_size":"22M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine 
(musl)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":5.53,"mem_mb":51.5,"disk_size":"26.7M"},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":0,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":4.89,"mem_mb":51.5,"disk_size":"28M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null}]},"quickstart_checks":{"last_tested":"2026-05-12","tag":"verified","tag_description":"quickstart runs on critical runtimes, recently tested","results":[{"runtime":"python:3.10-alpine","exit_code":0},{"runtime":"python:3.10-slim","exit_code":0},{"runtime":"python:3.11-alpine","exit_code":0},{"runtime":"python:3.11-slim","exit_code":0},{"runtime":"python:3.12-alpine","exit_code":0},{"runtime":"python:3.12-slim","exit_code":0},{"runtime":"python:3.13-alpine","exit_code":0},{"runtime":"python:3.13-slim","exit_code":0},{"runtime":"python:3.9-alpine","exit_code":0},{"runtime":"python:3.9-slim","exit_code":0}]}}