tiktoken

0.12.0 · active · verified Sat Feb 28

Fast BPE tokenizer from OpenAI, written in Rust. Used to count tokens and encode/decode text for OpenAI models. 3-6x faster than comparable Python tokenizers. Does NOT call any API — purely local computation. Requires a Rust compiler at build time on platforms without pre-built wheels. Package name and import name are both 'tiktoken'.
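The byte-pair-encoding core that tiktoken implements in Rust can be sketched in pure Python. This is an illustrative sketch only: `bpe_merge` and the toy `ranks` vocabulary below are made up for the example and are not part of tiktoken's API.

```python
def bpe_merge(parts: list[bytes], ranks: dict[bytes, int]) -> list[bytes]:
    """Repeatedly merge the adjacent pair with the lowest (best) rank."""
    while len(parts) > 1:
        best = None  # (index, rank) of the best mergeable adjacent pair
        for i in range(len(parts) - 1):
            rank = ranks.get(parts[i] + parts[i + 1])
            if rank is not None and (best is None or rank < best[1]):
                best = (i, rank)
        if best is None:
            break  # no adjacent pair is in the vocabulary; stop merging
        i = best[0]
        parts = parts[:i] + [parts[i] + parts[i + 1]] + parts[i + 2:]
    return parts

# Toy ranks: lower rank = merged earlier, as in a real BPE vocabulary
ranks = {b'he': 0, b'll': 1, b'hell': 2, b'hello': 3}
print(bpe_merge([b'h', b'e', b'l', b'l', b'o'], ranks))  # [b'hello']
```

A real encoding applies this loop to byte chunks produced by a regex pre-tokenizer, with a vocabulary of ~100k-200k merges; that merge table is what tiktoken downloads and caches on first use.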

Warnings

First use of an encoding downloads its BPE ranks file over the network and caches it locally; set TIKTOKEN_CACHE_DIR to control the cache location, and pre-populate the cache for offline environments.

Install

pip install tiktoken

Imports

import tiktoken

Quickstart

encoding_for_model() is safer than hardcoding encoding names: it handles the model-to-encoding mapping automatically and stays correct as OpenAI adds new models. Token counts for chat completions must include per-message overhead (3 tokens per message on current models, plus 3 tokens to prime the assistant's reply) to give accurate billing estimates.

import tiktoken

# Get encoding by model name (recommended)
enc = tiktoken.encoding_for_model('gpt-4o')  # returns o200k_base

# Or get encoding directly by name
enc = tiktoken.get_encoding('o200k_base')

# Encode text → list of token integers
tokens = enc.encode('Hello, world!')
print(tokens)        # token IDs (exact values depend on the encoding)
print(len(tokens))   # 4

# Decode tokens → string
text = enc.decode(tokens)
print(text)  # 'Hello, world!'

# Count tokens for a chat message (accounts for message overhead)
def count_chat_tokens(messages: list[dict], model: str = 'gpt-4o') -> int:
    enc = tiktoken.encoding_for_model(model)
    tokens_per_message = 3  # every message has <|im_start|>, role, <|im_end|>
    tokens_per_name = 1
    total = 0
    for msg in messages:
        total += tokens_per_message
        for key, value in msg.items():
            total += len(enc.encode(value))
            if key == 'name':
                total += tokens_per_name
    total += 3  # reply is primed with <|im_start|>assistant
    return total

messages = [{'role': 'user', 'content': 'How many tokens is this?'}]
print(count_chat_tokens(messages))  # ~13 for gpt-4o
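A common follow-on task is trimming text to fit a token budget. Below is a minimal sketch; `truncate_to_tokens` and the toy `WordEncoding` class are invented for this example (the toy class stands in for a real tiktoken Encoding so the snippet runs without downloading any BPE files), but any object with encode()/decode(), including a tiktoken Encoding, works the same way.

```python
class WordEncoding:
    """Toy stand-in 'encoding': each whitespace-separated word is one token."""
    def encode(self, text: str) -> list[str]:
        return text.split()

    def decode(self, tokens: list[str]) -> str:
        return ' '.join(tokens)

def truncate_to_tokens(enc, text: str, max_tokens: int) -> str:
    """Return text cut down to at most max_tokens tokens under enc."""
    tokens = enc.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return enc.decode(tokens[:max_tokens])

enc = WordEncoding()
print(truncate_to_tokens(enc, 'one two three four five', 3))  # one two three
```

Note that with a real tiktoken Encoding, decoding a truncated list of token IDs can split a multi-byte character at the cut point; decode() handles this by replacing the fragment with U+FFFD.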
