Trainer (Coqui-AI)

0.0.36 · active · verified Fri Apr 17

Trainer by Coqui-AI is a general-purpose model trainer for PyTorch, designed to be flexible across deep learning tasks. It wraps common training patterns, including distributed training via Hugging Face Accelerate, making it suitable for both quick experiments and larger-scale projects. The library is under active development (v0.0.36), with frequent micro-releases that fix bugs and add features, so expect the API to shift between versions.

Common errors

Warnings

Install

pip install trainer

Imports

from trainer import Trainer

Quickstart

This quickstart sets up a minimal PyTorch model, optimizer, criterion, and data loaders, then initializes and runs the `Trainer` class for a basic training loop. It uses dummy data and a simple linear model to illustrate the core workflow. The `config` dictionary steers the trainer's behavior, including the output path, number of epochs, and logging cadence. Note that the constructor arguments and config format have changed across 0.0.x releases — recent versions expect a Coqpit-based `TrainerConfig` plus a `TrainerArgs` and start training via `trainer.fit()` — so check the API of the version you have installed.

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset
from trainer import Trainer
import os

# 1. Dummy Dataset
class DummyDataset(Dataset):
    def __init__(self, num_samples=100, input_dim=10, output_dim=1):
        self.X = torch.randn(num_samples, input_dim)
        self.y = torch.randn(num_samples, output_dim)
    def __len__(self):
        return len(self.X)
    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]
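
As a plain-PyTorch sanity check, independent of `Trainer` itself, the dataset above yields correctly shaped batches once wrapped in a `DataLoader`:

```python
import torch
from torch.utils.data import DataLoader, Dataset

class DummyDataset(Dataset):
    """Same dummy dataset as above: random inputs and random targets."""
    def __init__(self, num_samples=100, input_dim=10, output_dim=1):
        self.X = torch.randn(num_samples, input_dim)
        self.y = torch.randn(num_samples, output_dim)
    def __len__(self):
        return len(self.X)
    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

# One batch of 4 samples: inputs are (4, 10), targets are (4, 1).
loader = DataLoader(DummyDataset(), batch_size=4, shuffle=True)
xb, yb = next(iter(loader))
print(xb.shape, yb.shape)  # torch.Size([4, 10]) torch.Size([4, 1])
```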

# 2. Dummy Model
class DummyModel(nn.Module):
    def __init__(self, input_dim=10, output_dim=1):
        super().__init__()
        self.linear = nn.Linear(input_dim, output_dim)
    def forward(self, x):
        return self.linear(x)

# 3. Setup components
input_dim = 10
output_dim = 1
model = DummyModel(input_dim, output_dim)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()
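
For context, the inner loop that the trainer manages on your behalf is the standard PyTorch optimization step — roughly the following sketch, using stand-ins for the components defined above:

```python
import torch
from torch import nn, optim

# Stand-ins for the model/optimizer/criterion set up above.
model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.MSELoss()

x = torch.randn(4, 10)
y = torch.randn(4, 1)

# One optimization step: clear gradients, forward, loss, backward, update.
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))  # a finite, non-negative MSE value
```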

train_dataset = DummyDataset(num_samples=100, input_dim=input_dim, output_dim=output_dim)
eval_dataset = DummyDataset(num_samples=20, input_dim=input_dim, output_dim=output_dim)

dataloader_train = DataLoader(train_dataset, batch_size=4, shuffle=True)
dataloader_eval = DataLoader(eval_dataset, batch_size=4, shuffle=False)

# 4. Minimal Config (usually from argparse)
config = {
    "output_path": "./trainer_quickstart_output",  # where checkpoints/logs are written
    "epochs": 2,        # total number of training epochs
    "start_by_epochs": True,
    "print_step": 1,    # log every N steps
    "save_step": 1,     # checkpoint every N steps
    "eval_step": 1      # evaluate every N steps
}
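
As the comment notes, these values would usually come from the command line. A minimal argparse sketch (the flag names mirror this quickstart's keys and are not mandated by the library):

```python
import argparse

parser = argparse.ArgumentParser(description="Quickstart training options")
parser.add_argument("--output_path", default="./trainer_quickstart_output")
parser.add_argument("--epochs", type=int, default=2)
parser.add_argument("--print_step", type=int, default=1)
parser.add_argument("--save_step", type=int, default=1)
parser.add_argument("--eval_step", type=int, default=1)

# Parse defaults here; in a real script pass sys.argv via parse_args().
args = parser.parse_args([])
config = vars(args)  # same dict shape as the literal above
print(config["epochs"])  # 2
```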

# Ensure output path exists for trainer to save checkpoints/logs
os.makedirs(config["output_path"], exist_ok=True)

# 5. Initialize and run Trainer
trainer_instance = Trainer(
    config=config,
    model=model,
    optimizer=optimizer,
    criterion=criterion,
    dataloader_train=dataloader_train,
    dataloader_eval=dataloader_eval,
)

print(f"Starting training for {config['epochs']} epochs...")
trainer_instance.train_loop()
print("Training finished.")

# Output files will be created in ./trainer_quickstart_output
# In a real application, you might add cleanup or more complex logging.
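
The exact checkpoint filenames and contents written to the output path vary by Trainer version, but the underlying mechanism is plain `torch.save`/`torch.load` of state dicts. A sketch of saving and restoring a checkpoint manually (the filename here is illustrative, not one the library produces):

```python
import os
import torch
from torch import nn, optim

model = nn.Linear(10, 1)
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Save model + optimizer state, as training checkpoints typically do.
ckpt_path = "./trainer_quickstart_output/manual_checkpoint.pth"
os.makedirs(os.path.dirname(ckpt_path), exist_ok=True)
torch.save({"model": model.state_dict(),
            "optimizer": optimizer.state_dict()}, ckpt_path)

# Restore the weights into a fresh model instance.
restored = nn.Linear(10, 1)
restored.load_state_dict(torch.load(ckpt_path)["model"])
print(torch.equal(restored.weight, model.weight))  # True
```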
