RustBPE Tokenizer
RustBPE is a Python library that provides a fast Byte Pair Encoding (BPE) tokenizer implemented in Rust, with Python bindings. It is designed primarily for training GPT-style BPE tokenizers and offers features like parallel processing, GPT-4 style regex pre-tokenization, and direct export to the tiktoken format for efficient inference. It is currently at version 0.1.0, an initial release, so the API may still evolve quickly.
Common errors
- ModuleNotFoundError: No module named 'rustbpe'
  cause: The `rustbpe` package is not installed in the current Python environment.
  fix: Run `pip install rustbpe` to install the package.
- TypeError: train_from_iterator() missing 1 required positional argument: 'vocab_size'
  cause: The `train_from_iterator` method requires `vocab_size` to be explicitly provided; it determines the final size of the tokenizer's vocabulary.
  fix: Pass `vocab_size` as a keyword argument, e.g., `tokenizer.train_from_iterator(data_iterator, vocab_size=32768)`.
- error: can't find Rust compiler
  cause: This occurs when installing `rustbpe` from source (e.g., when no pre-compiled wheel is available for your system or Python version) without a Rust toolchain installed.
  fix: Install Rust by following the instructions at `rustup.rs` (e.g., `curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh`), then retry `pip install rustbpe`. Ensure `maturin` is also installed if building directly from the git repository.
Warnings
- gotcha RustBPE is optimized for training BPE tokenizers and exporting them to the tiktoken format for inference. It does offer encoding/decoding, but its primary value is the training side, which distinguishes it from libraries focused solely on inference.
- gotcha The library is in its initial `0.1.0` release. While tested, the API may be subject to changes and refinements in subsequent minor versions as the project matures.
- gotcha RustBPE defaults to GPT-4 style regex pre-tokenization. If you need to match tokenization behavior of older GPT models (e.g., GPT-2/3) or other tokenizer types, this default pattern might yield different token splits.
- gotcha The author notes that while the Python reference code is expert-level and equality tests pass, the underlying Rust implementation had significant AI assistance and might not be optimally structured by a Rust expert.
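To make the pre-tokenization gotcha concrete, here is a simplified, ASCII-only illustration of one key difference between GPT-2-style and GPT-4-style split patterns. These are not the actual patterns (the real ones use Unicode classes like `\p{L}`/`\p{N}` and require the third-party `regex` module); they only demonstrate that GPT-4-style splitting caps digit runs at three characters, while GPT-2-style splitting keeps them whole.

```python
import re

# Simplified ASCII stand-ins for the real pre-tokenization patterns.
# The only difference shown: GPT-2 style keeps a full digit run together,
# GPT-4 style splits numbers into chunks of at most 3 digits.
GPT2_STYLE = r" ?[a-zA-Z]+| ?[0-9]+| ?[^\sa-zA-Z0-9]+|\s+"
GPT4_STYLE = r" ?[a-zA-Z]+|[0-9]{1,3}| ?[^\sa-zA-Z0-9]+|\s+"

text = "price is 123456 dollars"

gpt2_chunks = re.findall(GPT2_STYLE, text)
gpt4_chunks = re.findall(GPT4_STYLE, text)

print(gpt2_chunks)  # ['price', ' is', ' 123456', ' dollars']
print(gpt4_chunks)  # ['price', ' is', ' ', '123', '456', ' dollars']
```

Because the pre-tokenizer runs before any BPE merges, different split patterns produce different merge candidates and therefore different final token ids, even with identical training data.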
Install
pip install rustbpe
Imports
- Tokenizer
  import rustbpe
  tokenizer = rustbpe.Tokenizer()
Quickstart
import rustbpe
import os
# Create a tokenizer instance
tokenizer = rustbpe.Tokenizer()
# Prepare some sample training data
training_texts = [
"hello world",
"this is a test sentence",
"rustbpe is fast and efficient"
]
# Train the tokenizer
# vocab_size sets the final vocabulary size. Byte-level BPE starts from the
# 256 raw byte tokens, so only values above 256 cause any merges to be learned.
tokenizer.train_from_iterator(training_texts, vocab_size=300)  # small vocab for the example
# Encode text
text_to_encode = "hello rustbpe, how are you today?"
ids = tokenizer.encode(text_to_encode)
print(f"Encoded IDs: {ids}")
# Decode IDs back to text
decoded_text = tokenizer.decode(ids)
print(f"Decoded Text: {decoded_text}")
# Batch encode multiple texts (uses parallelization)
batch_texts = ["text one", "text two", "text three"]
all_ids = tokenizer.batch_encode(batch_texts)
print(f"Batch Encoded IDs: {all_ids}")
# Optional: Export to tiktoken format (requires tiktoken to be installed)
# if os.environ.get('ENABLE_TIKTOKEN_EXPORT', 'false').lower() == 'true':
# import tiktoken
# tiktoken_tokenizer = tokenizer.export_to_tiktoken()
# print("Tokenizer exported to tiktoken format.")
print(f"Vocabulary size: {tokenizer.vocab_size}")
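Conceptually, what `train_from_iterator` performs is classic BPE training: repeatedly find the most frequent adjacent token pair and merge it into a new token. Below is a minimal pure-Python sketch of that loop, for intuition only; it is not RustBPE's actual implementation and omits the regex pre-tokenization and parallelism entirely.

```python
from collections import Counter

def train_bpe(texts, vocab_size):
    """Toy byte-level BPE trainer: repeatedly merge the most frequent
    adjacent token pair until vocab_size is reached."""
    # Start from the 256 byte tokens; each text becomes a list of byte ids.
    sequences = [list(t.encode("utf-8")) for t in texts]
    merges = {}  # (left_id, right_id) -> new token id
    next_id = 256
    while next_id < vocab_size:
        # Count all adjacent pairs across all sequences.
        pairs = Counter()
        for seq in sequences:
            for pair in zip(seq, seq[1:]):
                pairs[pair] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        if pairs[best] < 2:
            break  # no pair repeats; further merges are pointless
        merges[best] = next_id
        # Replace every occurrence of the best pair with the new token.
        new_sequences = []
        for seq in sequences:
            merged, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == best:
                    merged.append(next_id)
                    i += 2
                else:
                    merged.append(seq[i])
                    i += 1
            new_sequences.append(merged)
        sequences = new_sequences
        next_id += 1
    return merges

merges = train_bpe(["hello world", "hello there"], vocab_size=260)
print(len(merges))  # → 4 (the shared "hello " prefix yields repeated pairs)
```

The first merge here is the pair `('h', 'e')` (byte ids 104 and 101), since "he" appears three times across the two texts. A real trainer like RustBPE does this same pair-counting and merging, but on pre-tokenized chunks and with far more efficient data structures.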