S3Tokenizer
S3Tokenizer is a Python library providing a reverse-engineered PyTorch implementation of the Supervised Semantic Speech Tokenizer (S3Tokenizer) originally proposed in CosyVoice. It enables high-throughput batch inference and online speech-code extraction. The current version is 0.3.0, and the project maintains a rapid release cadence, frequently adding support for newer CosyVoice versions and improved audio processing.
Common errors
- speech_tokenizer_v3_25hz tokens produce very different reconstruction vs CosyVoice tokens (same shape/length, very different codes)
  - cause: Reported inconsistency or fidelity issues with the `speech_tokenizer_v3_25hz` model's output compared to the original CosyVoice tokens, despite matching shape and length.
  - fix: Monitor the project's GitHub issues (#49) for updates or official fixes. Consider using earlier, more stable model versions (e.g., `speech_tokenizer_v1`, `speech_tokenizer_v2_25hz`) if `v3_25hz` exhibits critical reconstruction discrepancies for your application.
- RuntimeError: No CUDA GPUs are available
  - cause: Attempting to load the model onto a CUDA device (`.cuda()` or `.to("cuda")`) when no compatible NVIDIA GPU or CUDA installation is detected on the system.
  - fix: Ensure PyTorch with CUDA support is correctly installed (`pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118`, or the index URL matching your CUDA version). If no GPU is available, load the model on CPU: `tokenizer = s3tokenizer.load_model("...").to("cpu")`.
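The CPU-fallback fix can be sketched with plain PyTorch; the `load_model` call is shown in a comment since it fetches model weights, and everything beyond `torch.cuda.is_available()` here is illustrative:

```python
import torch

# Select a device defensively: fall back to CPU when no CUDA GPU is
# present, which avoids "RuntimeError: No CUDA GPUs are available".
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using device: {device}")

# With s3tokenizer installed, the model would then be moved to that device:
#   import s3tokenizer
#   tokenizer = s3tokenizer.load_model("speech_tokenizer_v1").to(device)
```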
Warnings
- gotcha: Automatic long-audio processing, introduced in v0.2.0 and refined in v0.2.5, transparently handles audio longer than 30 seconds by segmenting it with a sliding window (30-second window, 4-second overlap). No explicit user action is required, but advanced users should be aware of this internal behavior for specific use cases or debugging.
- gotcha: When upgrading for CosyVoice3 support, ensure you use the correct model identifier, such as `speech_tokenizer_v3_25hz`. The new models are supported, but an open issue reports reconstruction-quality differences from the original CosyVoice tokens for `v3_25hz` models.
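The sliding-window behavior above can be illustrated with a small segmentation sketch. The function name and exact boundary handling below are assumptions for illustration, not s3tokenizer API; only the 30-second window and 4-second overlap come from the documented behavior:

```python
def sliding_window_segments(num_samples, sample_rate=16000,
                            window_s=30.0, overlap_s=4.0):
    """Split an audio length into (start, end) sample ranges using a
    30 s window with 4 s overlap (illustrative, not library internals)."""
    window = int(window_s * sample_rate)
    hop = int((window_s - overlap_s) * sample_rate)  # 26 s stride
    segments = []
    start = 0
    while start < num_samples:
        end = min(start + window, num_samples)
        segments.append((start, end))
        if end == num_samples:
            break
        start += hop
    return segments

# 70 s of 16 kHz audio -> three overlapping 30 s (or shorter final) segments
segs = sliding_window_segments(70 * 16000)
```

Consecutive segments share 4 seconds of audio, which lets the library stitch codes across boundaries without discontinuities.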
Install
- pip install s3tokenizer
Imports
- load_model
  import s3tokenizer
  tokenizer = s3tokenizer.load_model("speech_tokenizer_v1")
- load_audio
  import s3tokenizer
  audio = s3tokenizer.load_audio("path/to/audio.wav")
Quickstart
import os

import torch
import s3tokenizer

# Create a dummy .wav file if one doesn't exist so the example is runnable.
# In a real use case, replace this with your own .wav file path, e.g. the
# sample from the repo assets:
# https://github.com/xingchensong/S3Tokenizer/blob/main/s3tokenizer/assets/BAC009S0764W0121.wav
dummy_wav_path = "dummy_audio.wav"
if not os.path.exists(dummy_wav_path):
    try:
        import torchaudio
        sample_rate = 16000
        duration_seconds = 5
        waveform = torch.randn(1, sample_rate * duration_seconds)
        torchaudio.save(dummy_wav_path, waveform, sample_rate)
        print(f"Created dummy audio file: {dummy_wav_path}")
    except ImportError:
        print("torchaudio not found. Cannot create dummy audio; please provide a real .wav file.")
        raise SystemExit(1)

# Load the tokenizer model, preferring CUDA if available, otherwise CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = s3tokenizer.load_model("speech_tokenizer_v1").to(device)
print(f"Tokenizer model loaded on device: {device}")

# Load the audio (a 1-D tensor of 16 kHz samples), compute its log-mel
# spectrogram, and pad it into a batch of one.
audio = s3tokenizer.load_audio(dummy_wav_path)
mels, mels_lens = s3tokenizer.padding([s3tokenizer.log_mel_spectrogram(audio)])

# Quantize the mel features to get speech codes.
speech_codes, speech_codes_lens = tokenizer.quantize(mels.to(device), mels_lens.to(device))
print(f"Shape of extracted speech codes: {speech_codes.shape}")
print(f"Length of speech codes: {speech_codes_lens.item()}")