S3Tokenizer

library 0.3.0 · python
verified May 24, 2026

S3Tokenizer is a Python library that provides a reverse-engineered PyTorch implementation of the Supervised Semantic Speech Tokenizer (S3Tokenizer) originally proposed in CosyVoice. It supports high-throughput batch inference and online speech code extraction. The current version is 0.3.0. Releases are frequent and typically add support for newer CosyVoice versions or improve audio processing.
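Batch inference over utterances of different lengths generally requires padding them to a common length and keeping the original lengths so the padded frames can be ignored later. The sketch below illustrates that idea in plain Python only. It is not the library's API: the helper name `pad_batch` and the toy feature values are hypothetical, chosen purely for illustration.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    features: Sequence[Sequence[float]], pad_value: float = 0.0
) -> Tuple[List[List[float]], List[int]]:
    """Right-pad variable-length sequences to the longest one in the batch.

    Hypothetical helper for illustration; not part of S3Tokenizer.
    Returns the padded sequences and the original (unpadded) lengths.
    """
    lengths = [len(seq) for seq in features]
    max_len = max(lengths, default=0)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in features]
    return padded, lengths


# Three toy "utterances" of different lengths (stand-ins for per-frame features).
batch = [[0.1, 0.2, 0.3], [0.4], [0.5, 0.6]]
padded, lengths = pad_batch(batch)
```

Keeping `lengths` alongside `padded` lets a downstream model mask out the padded positions, so shorter utterances in a batch are not affected by the padding.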
