{"id":8908,"library":"coqui-tts","title":"Coqui Text-to-Speech (TTS)","description":"Coqui TTS is a deep learning library for advanced Text-to-Speech synthesis, supporting a wide range of models and languages. It enables tasks like voice cloning, multi-speaker TTS, and emotional speech generation. The current version is 0.27.5, and the library maintains an active development pace with frequent patch and minor releases.","status":"active","version":"0.27.5","language":"en","source_language":"en","source_url":"https://github.com/idiap/coqui-ai-TTS","tags":["text-to-speech","tts","audio","deep-learning","ai","voice-cloning","speech-synthesis"],"install":[{"cmd":"pip install coqui-tts\npip install torch==2.3.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cpu","lang":"bash","label":"Basic CPU installation (post v0.27.4)"},{"cmd":"pip install coqui-tts\npip install torch==2.3.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu121","lang":"bash","label":"CUDA 12.1 installation (post v0.27.4)"},{"cmd":"pip install 'coqui-tts[all]'","lang":"bash","label":"Install all optional dependencies (pre v0.27.4, or if you manage PyTorch manually)"}],"dependencies":[{"reason":"Core deep learning framework. Not installed by default since v0.27.4.","package":"torch","optional":false},{"reason":"Audio processing library for PyTorch. Not installed by default since v0.27.4.","package":"torchaudio","optional":false},{"reason":"Used for various models, requires careful version management for compatibility.","package":"transformers","optional":false}],"imports":[{"note":"The primary API is exposed through TTS.api, not lower-level internal modules.","wrong":"from coqui_tts.tts import TTS","symbol":"TTS","correct":"from TTS.api import TTS"}],"quickstart":{"code":"from TTS.api import TTS\nimport os\n\n# This will download and load a default VITS model into memory.\n# For GPU, set gpu=True. 
Ensure an appropriate PyTorch version is installed.\n# If you encounter issues, try a different model_name, e.g., 'tts_models/multilingual/multi-dataset/xtts_v2'\n# For XTTS, you'd also need a speaker_wav file.\n# For simpler, single-speaker models, speaker_wav is not needed.\n\ntts = TTS(model_name=\"tts_models/en/ljspeech/vits\", progress_bar=True, gpu=False)\n\n# Synthesize speech to a file.\n# Replace 'coqui_output.wav' with your desired output path.\n# This single-language model takes no `language` argument; pass one only for multilingual models such as XTTS.\noutput_file = \"coqui_output.wav\"\ntts.tts_to_file(\n    text=\"Hello from Coqui TTS, the ultimate text-to-speech library!\",\n    file_path=output_file,\n)\n\nprint(f\"Speech saved to {os.path.abspath(output_file)}\")\n","lang":"python","description":"This quickstart demonstrates how to initialize a Coqui TTS model and synthesize speech to an audio file. It uses a default VITS model for English. Remember to install PyTorch and Torchaudio manually if using coqui-tts versions 0.27.4 or newer. Set `gpu=True` for GPU acceleration if available and configured."},"warnings":[{"fix":"After `pip install coqui-tts`, manually install PyTorch and Torchaudio, e.g., `pip install torch==X.Y.Z torchaudio==X.Y.Z --index-url https://download.pytorch.org/whl/cuXXX` (for CUDA) or `.../whl/cpu` (for CPU). Refer to PyTorch's official installation instructions for specific versions.","message":"Starting from v0.27.4, `coqui-tts` no longer installs `torch`, `torchaudio`, and `torchcodec` by default. Users must install these PyTorch dependencies manually to match their system (CPU/CUDA) and desired PyTorch version.","severity":"breaking","affected_versions":">=0.27.4"},{"fix":"Switch to the new speaker caching mechanism (consult official documentation for updated cloning details). Avoid using `speaker_id` and prefer `speaker_wav` or other model-specific parameters for speaker selection.","message":"The old caching mechanism for Bark and Tortoise models has been removed. 
Additionally, the `speaker_id` argument in the `synthesize()` method is deprecated.","severity":"breaking","affected_versions":">=0.27.0"},{"fix":"Check the Coqui TTS release notes or GitHub issues for recommended `transformers` versions. For example, v0.26.2 restricted transformers to `<4.52`, while v0.27.5 fixed issues with `transformers>=5`. If you encounter issues, try pinning `transformers` to a known working version.","message":"Compatibility issues can arise with the `transformers` library, leading to incorrect output or inference errors. Specific `transformers` versions might be required for certain `coqui-tts` releases.","severity":"gotcha","affected_versions":"All versions, notably around 0.26.x and 0.27.x"},{"fix":"Ensure your Python environment is within the supported range. Create a new virtual environment with a compatible Python version if necessary (e.g., Python 3.10 through 3.14).","message":"Coqui TTS only supports Python versions in the range `>=3.10,<3.15`.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install PyTorch and Torchaudio explicitly for your system. 
E.g., `pip install torch==2.3.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cpu` (for CPU) or replace `cpu` with your CUDA version (e.g., `cu121`).","cause":"Since `coqui-tts` v0.27.4, PyTorch and related dependencies are no longer automatically installed, requiring manual installation.","error":"ModuleNotFoundError: No module named 'torch'"},{"fix":"Refactor your code to use the `speaker_wav` argument or other model-specific parameters for speaker selection, as `speaker_id` is no longer the recommended approach.","cause":"The `speaker_id` argument for `synthesize()` was deprecated in v0.27.0 and might be removed or cause errors in newer versions.","error":"TypeError: synthesize() got an unexpected keyword argument 'speaker_id'"},{"fix":"Check the Coqui TTS release notes for the recommended `transformers` version for your `coqui-tts` release. You might need to downgrade or upgrade `transformers`, e.g., `pip install transformers==4.51.0` or `pip install \"transformers>=5.0.0\"`.","cause":"Certain versions of the `transformers` library cause issues with Coqui TTS models, particularly XTTS.","error":"RuntimeError: XTTS inference failed due to transformers version incompatibility. Please ensure transformers is within the supported range."},{"fix":"Verify if the loaded model actually supports multiple speakers via `tts.speakers`. For models like XTTS, you typically provide a `speaker_wav` file directly instead of selecting from an indexed list of speakers. Consult the documentation for your specific model.","cause":"This error often occurs when trying to access `tts.speakers` for models that do not support explicit speaker IDs or have a different interface for speaker management (e.g., XTTS uses `speaker_wav`).","error":"AttributeError: 'TTS' object has no attribute 'speakers'"}]}