{"id":9366,"library":"torchtnt","title":"Torchtnt","description":"Torchtnt is a lightweight library by PyTorch providing training tools and utilities. It is closely integrated with PyTorch and designed for rapid iteration with any model or training regimen. It offers powerful dataloading, logging, and visualization utilities. As of version 0.2.4, it is actively maintained by PyTorch and released as needed. It's currently in a pre-alpha development stage, indicating potential API instability.","status":"active","version":"0.2.4","language":"en","source_language":"en","source_url":"https://github.com/pytorch/tnt","tags":["pytorch","training","utilities","mlops","deep-learning"],"install":[{"cmd":"pip install torchtnt","lang":"bash","label":"PyPI"},{"cmd":"conda install -c conda-forge torchtnt","lang":"bash","label":"Conda"}],"dependencies":[{"reason":"Core PyTorch functionality is required for Torchtnt's operation.","package":"torch","optional":false}],"imports":[{"symbol":"AutoUnit","correct":"from torchtnt.framework.auto_unit import AutoUnit"},{"symbol":"TrainUnit","correct":"from torchtnt.framework.unit import TrainUnit"},{"symbol":"fit","correct":"from torchtnt.framework.fit import fit"},{"symbol":"train","correct":"from torchtnt.framework.train import train"},{"symbol":"TensorBoardLogger","correct":"from torchtnt.utils.loggers import TensorBoardLogger"},{"symbol":"init_from_env","correct":"from torchtnt.utils import init_from_env"}],"quickstart":{"code":"import torch\nimport torch.nn as nn\nfrom typing import Tuple\nfrom torch.utils.data import Dataset, DataLoader\nfrom torchtnt.framework.state import State\nfrom torchtnt.framework.train import train\nfrom torchtnt.framework.unit import TrainUnit\nfrom torchtnt.utils import init_from_env, seed\nfrom torchtnt.utils.loggers import TensorBoardLogger\nimport logging\nimport os\n\nlogging.basicConfig(level=logging.INFO)\nseed(0)  # seed torch/random/numpy for reproducibility\n\n# 1. Define your model\nclass SimpleModel(nn.Module):\n    def __init__(self, input_dim, output_dim):\n        super().__init__()\n        self.linear = nn.Linear(input_dim, output_dim)\n\n    def forward(self, x):\n        return self.linear(x)\n\n# 2. Define your training unit\n# TrainUnit is used (rather than AutoUnit) because this unit implements\n# train_step and steps the optimizer manually; AutoUnit owns the\n# forward/backward/step itself and expects compute_loss instead.\nBatch = Tuple[torch.Tensor, torch.Tensor]\n\nclass MyTrainingUnit(TrainUnit[Batch]):\n    def __init__(self, model: nn.Module, optimizer: torch.optim.Optimizer, logger: TensorBoardLogger, device: torch.device):\n        super().__init__()\n        self.model = model\n        self.optimizer = optimizer\n        self.loss_fn = nn.MSELoss()\n        self.logger = logger\n        self.device = device\n\n    def train_step(self, state: State, data: Batch) -> None:\n        inputs, targets = data\n        inputs, targets = inputs.to(self.device), targets.to(self.device)\n        outputs = self.model(inputs)\n        loss = self.loss_fn(outputs, targets)\n\n        self.optimizer.zero_grad()\n        loss.backward()\n        self.optimizer.step()\n\n        self.logger.log(\"train_loss\", loss.item(), self.train_progress.num_steps_completed)\n\n# 3. Prepare data\nclass RandomDataset(Dataset):\n    def __init__(self, num_samples, input_dim, output_dim):\n        self.data = torch.randn(num_samples, input_dim)\n        self.labels = torch.randn(num_samples, output_dim)\n\n    def __len__(self):\n        return len(self.data)\n\n    def __getitem__(self, idx):\n        return self.data[idx], self.labels[idx]\n\ninput_dim = 10\noutput_dim = 1\nnum_samples = 1000\nbatch_size = 32\nnum_epochs = 2\n\ndataset = RandomDataset(num_samples, input_dim, output_dim)\ndataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)\n\n# 4. Initialize device, model, optimizer, and logger\ndevice = init_from_env()  # picks an accelerator if available, else CPU\nmodel = SimpleModel(input_dim, output_dim).to(device)\noptimizer = torch.optim.SGD(model.parameters(), lr=0.01)\n\nlog_dir = os.path.join(os.environ.get('TORCHTNT_LOG_DIR', './runs'), 'my_experiment')\nos.makedirs(log_dir, exist_ok=True)\nlogger = TensorBoardLogger(log_dir)\n\n# 5. Create the training unit and run training\ntraining_unit = MyTrainingUnit(model, optimizer, logger, device)\n\nprint(f\"Starting training for {num_epochs} epochs...\")\ntrain(training_unit, dataloader, max_epochs=num_epochs)\nlogger.close()\nprint(f\"Training complete! Check logs in '{log_dir}'.\")","lang":"python","description":"This quickstart demonstrates a basic training loop using Torchtnt's `TrainUnit` and the `train` entrypoint (`fit` additionally requires an eval dataloader and an eval step, so `train` is the right entrypoint for train-only loops). It defines a simple linear model, implements a custom training unit that runs the forward and backward passes and logs the loss via `TensorBoardLogger`, prepares a dummy dataset, and then executes training."},"warnings":[{"fix":"Users should expect API instability and closely monitor release notes for breaking changes. Pinning to exact patch versions is recommended for production environments. Regularly consult the official GitHub repository for the latest API usage.","message":"Torchtnt is currently in '2 - Pre-Alpha' development status according to its PyPI classifiers. This signifies that the API is highly experimental and subject to frequent and significant changes, potentially without strict backward compatibility guarantees between minor or even patch versions.","severity":"breaking","affected_versions":"All versions up to 0.2.4"},{"fix":"Always ensure PyTorch is installed first, following the official PyTorch installation instructions for your specific system and desired compute platform (e.g., CUDA version). Visit https://pytorch.org/get-started/locally/ for the correct command.","message":"Torchtnt is built on PyTorch, and a proper PyTorch installation is a prerequisite. Issues can arise if PyTorch is not installed correctly or if there are version incompatibilities (especially with CUDA-enabled builds).","severity":"gotcha","affected_versions":"All versions"},{"fix":"For 'RuntimeError: Expected all tensors to be on the same device', ensure all tensors and modules are explicitly moved to the same device (e.g., `model.to(device)`, `tensor.to(device)`). 
For 'RuntimeError: size mismatch' or 'Incorrect input shape', carefully print the `.shape` of all involved tensors and use `.view()`, `.reshape()`, or `.permute().contiguous()` to align dimensions.","message":"As Torchtnt operates with PyTorch tensors and modules, common PyTorch runtime errors such as device mismatches, shape mismatches, and datatype errors directly apply. These are frequent sources of frustration for PyTorch developers.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Install the library using `pip install torchtnt` or `conda install -c conda-forge torchtnt`. If using a virtual environment, ensure it is activated.","cause":"The 'torchtnt' package is not installed in the active Python environment, or the environment is not correctly activated.","error":"ModuleNotFoundError: No module named 'torchtnt'"},{"fix":"Identify all tensors and modules participating in the operation and explicitly move them to the same device using `.to(device)`, where `device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')`.","cause":"An operation was attempted between PyTorch tensors or models residing on different compute devices (e.g., one on CPU and another on GPU).","error":"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu"},{"fix":"Print the `.shape` attribute of all tensors involved in the problematic operation. Reshape tensors using methods like `.view()`, `.reshape()`, or `.permute().contiguous()` to ensure their dimensions are compatible with the operation or layer.
","cause":"The dimensions of input tensors do not align with the expected input dimensions of a layer or an operation, commonly seen in matrix multiplication or feeding data to linear layers.","error":"RuntimeError: size mismatch, m1: [X, Y], m2: [A, B] (or similar shape errors)"}]}