RustBPE Tokenizer
RustBPE is a Python library that provides a fast Byte Pair Encoding (BPE) tokenizer implemented in Rust, with Python bindings. It is designed primarily for training GPT-style BPE tokenizers and offers parallel processing, GPT-4-style regex pre-tokenization, and direct export to the tiktoken format for efficient inference. The current version, 0.1.0, is the initial release, which suggests active and potentially rapid development.
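To make the description above concrete, here is a minimal pure-Python sketch of byte-level BPE training with regex pre-tokenization. It illustrates the general technique only and does not use or reproduce the rustbpe API. The names `SPLIT_PATTERN`, `train_bpe`, `merge_pair`, `encode`, and `decode` are hypothetical, and the split pattern is a simplified ASCII stand-in for the GPT-4 pattern, which relies on Unicode property classes from the third-party `regex` module.

```python
import re
from collections import Counter

# Assumption: a simplified stand-in for a GPT-style split pattern.
# It keeps an optional leading space attached to words, numbers, and symbol runs.
SPLIT_PATTERN = re.compile(r" ?[A-Za-z]+| ?[0-9]+| ?[^\sA-Za-z0-9]+|\s+")


def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of `pair` in `ids` with `new_id`."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return tuple(out)


def train_bpe(text, vocab_size):
    """Learn merges as a list of ((left_id, right_id), new_id), in training order."""
    assert vocab_size >= 256, "the first 256 ids are reserved for raw bytes"
    # Pre-tokenize, then count each distinct chunk as a tuple of byte values.
    chunks = Counter(tuple(c.encode("utf-8")) for c in SPLIT_PATTERN.findall(text))
    merges = []
    for new_id in range(256, vocab_size):
        pair_counts = Counter()
        for ids, freq in chunks.items():
            for pair in zip(ids, ids[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every chunk is a single token; nothing left to merge
        # Most frequent pair wins; ties go to the smallest pair for determinism.
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merges.append((best, new_id))
        merged = Counter()
        for ids, freq in chunks.items():
            merged[merge_pair(ids, best, new_id)] += freq
        chunks = merged
    return merges


def encode(text, merges):
    """Encode text by replaying the learned merges, in order, inside each chunk."""
    out = []
    for chunk in SPLIT_PATTERN.findall(text):
        ids = tuple(chunk.encode("utf-8"))
        for pair, new_id in merges:
            ids = merge_pair(ids, pair, new_id)
        out.extend(ids)
    return out


def decode(ids, merges):
    """Map token ids back to bytes, then to text."""
    table = {i: bytes([i]) for i in range(256)}
    for (left, right), new_id in merges:
        table[new_id] = table[left] + table[right]
    return b"".join(table[i] for i in ids).decode("utf-8", errors="replace")


merges = train_bpe("aaaa aaaa", vocab_size=258)
ids = encode("hello aaaa", merges)
```

Because merges never cross chunk boundaries, the pre-tokenization pattern determines which byte sequences can ever become a single token. A production implementation would typically parallelize the pair counting and update the counts incrementally instead of recomputing them on every merge.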
Resources
package pypi.org/project/rustbpe/
API endpoints
full doc /v1/registry/rustbpe
install /v1/registry/rustbpe/install
compatibility /v1/registry/rustbpe/compatibility