{"id":28445,"library":"voxcpm","title":"VoxCPM","description":"VoxCPM is a tokenizer-free text-to-speech (TTS) model for context-aware speech generation and voice cloning. Version 2.0.2 requires Python >=3.10. It uses a causal transformer trained on continuous speech representations, so it produces expressive and cloned voice outputs without discrete tokens. The library is under active development by OpenBMB.","status":"active","version":"2.0.2","language":"python","source_language":"en","source_url":"https://github.com/OpenBMB/VoxCPM.git","tags":["text-to-speech","voice-cloning","deep-learning","tokenizer-free"],"install":[{"cmd":"pip install voxcpm","lang":"bash","label":"Install from PyPI"},{"cmd":"pip install git+https://github.com/OpenBMB/VoxCPM.git","lang":"bash","label":"Install from GitHub"}],"dependencies":[{"reason":"Required for model inference","package":"torch","optional":false},{"reason":"Audio file I/O","package":"soundfile","optional":false},{"reason":"Audio processing","package":"librosa","optional":false},{"reason":"For model architecture (optional if using custom pipeline)","package":"transformers","optional":true}],"imports":[{"note":"`import voxcpm` binds only the package name, so the class would have to be referenced as `voxcpm.VoxCPM`; import the class directly from the top-level package instead.","wrong":"import voxcpm","symbol":"VoxCPM","correct":"from voxcpm import VoxCPM"}],"quickstart":{"code":"from voxcpm import VoxCPM\nimport soundfile as sf\n\n# Pass model_path explicitly; relying on the implicit default is deprecated.\nmodel = VoxCPM(model_path=\"default\")\n# voice_clone takes a file path to a reference WAV file.\nwaveform, sr = model.synthesize(\"Hello, this is a test of voice cloning.\", voice_clone=\"path/to/ref_audio.wav\")\nsf.write(\"output.wav\", waveform, sr)","lang":"python","description":"Load the VoxCPM model, generate speech with optional voice cloning from a reference audio file, and save the output."},"warnings":[{"fix":"Pass a file path string to `voice_clone`.","message":"The `voice_clone` parameter expects a file path to a WAV file. Passing a numpy array or audio buffer will raise a TypeError.","severity":"gotcha","affected_versions":">=2.0.0"},{"fix":"Reduce batch size or use smaller model variants if available.","message":"The model requires significant GPU memory. On a 16 GB GPU, batch inference may cause out-of-memory (OOM) errors.","severity":"gotcha","affected_versions":"all"},{"fix":"Specify `model_path='default'` or a custom path to future-proof your code.","message":"Initializing `voxcpm.VoxCPM` without an explicit `model_path` argument downloads the default model; this implicit behavior is deprecated in favor of explicit model selection.","severity":"deprecated","affected_versions":">=2.0.0"}],"env_vars":null,"last_verified":"2026-05-09T00:00:00.000Z","next_check":"2026-08-07T00:00:00.000Z","problems":[{"fix":"Reinstall the package and ensure a stable internet connection. Then clear the cache with `rm -rf ~/.cache/voxcpm` and retry.","cause":"The underlying model download failed or was interrupted, leaving the model object as None.","error":"TypeError: 'NoneType' object is not callable"},{"fix":"Reduce batch size, use a smaller model (if available), or run on CPU by setting `device='cpu'`.","cause":"Insufficient GPU memory for the model or batch.","error":"RuntimeError: CUDA out of memory. Tried to allocate ... MiB"},{"fix":"Verify the file path and ensure it points to an existing, valid WAV file.","cause":"The voice cloning reference file path is incorrect or the file does not exist.","error":"FileNotFoundError: [Errno 2] No such file or directory: 'path/to/ref_audio.wav'"}],"ecosystem":"pypi","meta_description":null,"install_score":null,"install_tag":null,"quickstart_score":null,"quickstart_tag":null}