Coqui TTS Trainer
Coqui TTS Trainer is a general-purpose model trainer for PyTorch, designed to be flexible and extensible. It's part of the wider Coqui AI ecosystem, providing core training utilities for various deep learning models, including those for Text-to-Speech. The current version is 0.4.0, with releases occurring on an irregular, feature-driven cadence.
Common errors
- ModuleNotFoundError: No module named 'trainer'
  Cause: The `Trainer` class lives in the `trainer.trainer` submodule, not directly under `coqui_tts_trainer`.
  Fix: Change your import statement to `from trainer.trainer import Trainer`.
- TypeError: __init__() missing 1 required positional argument: 'config'
  Cause: The `Trainer` class requires several mandatory arguments at initialization, including a `config` object.
  Fix: Pass all required arguments, such as `config`, `model`, `optimizer`, `criterion`, `data_loader_train`, and `data_loader_eval`, to the `Trainer` constructor.
- RuntimeError: Expected all tensors to be on the same device, but found tensors on both cpu and cuda:0
  Cause: A common PyTorch error: the model and the data tensors are on different devices (CPU vs. GPU).
  Fix: Move the model to the target device (`model.to(device)`) and move every input tensor from the data loader to the same device (e.g., `data.to(device)`) before the forward pass.
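The device-mismatch fix above is plain PyTorch and independent of this library. A minimal sketch (model and tensor names are illustrative, not part of the Trainer API): pick one device, move the model once, and move each batch before the forward pass.

```python
import torch

# Choose a single device for the whole run; falls back to CPU without a GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(10, 2).to(device)  # move parameters once

x = torch.randn(4, 10)  # a batch from a DataLoader starts on CPU
x = x.to(device)        # move each batch to the model's device
out = model(x)          # model and input now share a device
```

If the error persists, check tensors created inside `forward` (e.g., masks built with `torch.zeros(...)`), which also need an explicit `device=` argument.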
Warnings
- breaking Version 0.4.0 introduced backwards-incompatible changes by removing various unused functions and arguments to streamline the code. While primarily affecting internal APIs, direct use of previously available utility functions may break.
- breaking Python 3.9 support was dropped starting from version 0.2.1. The library now requires Python >=3.10 and <3.15.
- breaking In version 0.3.0, Coqui's custom LR schedulers adopted the standard PyTorch scheduler interface. This change, along with fixes to model/scheduler state restoration, might break custom scheduler implementations or existing checkpoint loading logic.
- gotcha As of v0.2.0, `coqui-tts-trainer` switched to using a forked version of the `coqpit` library (from `idiap/coqui-ai-coqpit`). If you have other projects or an older `coqpit` installed, ensure compatibility or use a virtual environment.
- gotcha Starting from v0.3.3, `numpy` and `soundfile` were removed from the *core* dependencies. While still available via `[cpu]` and `[cuda]` extras, their absence from a minimal install might cause `ModuleNotFoundError` if you implicitly relied on them.
Install
- pip install coqui-tts-trainer
- pip install "coqui-tts-trainer[cpu]"
- pip install "coqui-tts-trainer[cuda]"
- pip install "coqui-tts-trainer[wandb,tensorboard]"

(Quote the requirement when installing extras; shells like zsh otherwise treat the square brackets as glob patterns.)
Imports
- Trainer
from trainer.trainer import Trainer
- TrainerConfig
from trainer.generic_model_config import TrainerConfig
Quickstart
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, Dataset

from trainer.trainer import Trainer
from trainer.generic_model_config import TrainerConfig

# 1. Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 2)

    def forward(self, x):
        return self.linear(x)

# 2. Define a dummy dataset
class DummyDataset(Dataset):
    def __len__(self):
        return 100

    def __getitem__(self, idx):
        return torch.randn(10), torch.randint(0, 2, ())

# 3. Create a TrainerConfig
config = TrainerConfig()
config.num_epochs = 2
config.output_path = "./trainer_output"
config.batch_size = 4

# 4. Instantiate model, optimizer, criterion, and data loaders
model = SimpleModel()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.1)
criterion = nn.CrossEntropyLoss()

train_dataset = DummyDataset()
eval_dataset = DummyDataset()
train_loader = DataLoader(train_dataset, batch_size=config.batch_size)
eval_loader = DataLoader(eval_dataset, batch_size=config.batch_size)

# 5. Initialize and run the Trainer
trainer = Trainer(
    config=config,
    model=model,
    optimizer=optimizer,
    scheduler=scheduler,
    criterion=criterion,
    data_loader_train=train_loader,
    data_loader_eval=eval_loader,
    grad_scaler=None,  # for mixed precision, can be torch.cuda.amp.GradScaler()
    output_path=config.output_path,
)
trainer.train()
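The `grad_scaler=None` argument above can be replaced with a real scaler for mixed-precision training. What the Trainer does with it internally is not shown here; the sketch below is the standard standalone PyTorch AMP loop (plain `torch.cuda.amp`, no Trainer involved), so you can see what a `GradScaler` is for before handing one over.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# GradScaler is a no-op when disabled, so this loop also runs on CPU.
use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")
model.to(device)
scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

x = torch.randn(4, 10, device=device)
y = torch.randint(0, 2, (4,), device=device)

optimizer.zero_grad()
with torch.autocast(device_type=device.type, enabled=use_cuda):
    loss = criterion(model(x), y)  # forward pass in reduced precision
scaler.scale(loss).backward()      # scale loss to avoid fp16 gradient underflow
scaler.step(optimizer)             # unscales gradients, then optimizer.step()
scaler.update()                    # adjust the scale factor for the next step
```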