PyTorch
Deep learning framework with GPU-accelerated tensor operations. Current version is 2.10.0 (Jan 2026). The install command varies by platform and CUDA version: plain pip install torch can give a CPU-only build (for example on Windows). The torch.load weights_only default changed to True in 2.6, which broke many existing checkpoint-loading calls. TorchScript is deprecated as of 2.10.
Warnings
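- breaking torch.load() weights_only default changed from False to True in PyTorch 2.6. Calls that omit weights_only= now raise pickle.UnpicklingError when the checkpoint contains objects outside the safe allowlist, such as custom classes, numpy arrays, or optimizer/scheduler state that stores such objects. Plain state_dicts of tensors still load. Pass weights_only=False only for files you trust. Broke many existing projects.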
- breaking Plain pip install torch can install a CPU-only build, depending on platform (for example, the default wheel on Windows). Builds for a specific CUDA or ROCm version require a custom --index-url. LLM-generated install instructions almost never include this. After a CPU-only install, torch.cuda.is_available() returns False.
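- breaking torchvision and torchaudio must come from the release paired with the installed torch version. The version numbers themselves differ (torchvision uses 0.x numbering), so "match" means the compatible release, not the same number. Installing the latest torch alongside a mismatched torchvision or torchaudio causes ImportError or silent incorrect behavior.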
- deprecated TorchScript (torch.jit.script, torch.jit.trace) deprecated in PyTorch 2.10. The PyTorch team recommends migrating to torch.export for model deployment.
- gotcha Forgetting model.eval() during inference causes BatchNorm and Dropout layers to behave as if training — different results each run and incorrect predictions.
- gotcha optimizer.zero_grad() must be called before loss.backward() each step. Forgetting it accumulates gradients across batches — silent training bug.
- gotcha Tensors on different devices cannot be combined. CPU tensor + CUDA tensor raises RuntimeError. Common when target labels stay on CPU while model outputs are on CUDA.
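The three gotchas above can be reproduced in a few lines. This is a minimal sketch, not taken from the original entry: the tiny network, the tensor w, and the variable names are illustrative assumptions. The device-mismatch part only runs when a CUDA device is present.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# model.eval(): Dropout is random in train mode and a no-op in eval mode.
net = nn.Sequential(nn.Linear(4, 16), nn.Dropout(p=0.5), nn.Linear(16, 1))
x = torch.randn(8, 4)

net.train()
with torch.no_grad():
    train_runs_differ = not torch.equal(net(x), net(x))

net.eval()
with torch.no_grad():
    eval_runs_match = torch.equal(net(x), net(x))

# zero_grad(): backward() adds to .grad instead of overwriting it.
w = torch.ones(3, requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

(w * 2).sum().backward()
first_grad = w.grad.clone()        # [2., 2., 2.]
(w * 2).sum().backward()           # no zero_grad(): gradient is added to the old one
accumulated_grad = w.grad.clone()  # [4., 4., 4.]

opt.zero_grad()                    # clear before the next backward pass
(w * 2).sum().backward()
fresh_grad = w.grad.clone()        # [2., 2., 2.] again

# Device mismatch: reproducible only on a machine with CUDA.
if torch.cuda.is_available():
    try:
        torch.ones(2) + torch.ones(2, device="cuda")
    except RuntimeError as err:
        print("device mismatch:", err)

print(train_runs_differ, eval_runs_match, accumulated_grad.tolist())
```

In real training code the fix for each case is the one already stated in the list: call model.eval() before inference, call optimizer.zero_grad() every step, and move inputs and targets to the model's device.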
Install
- pip install torch
- pip install torch --index-url https://download.pytorch.org/whl/cu128
- pip install torch --index-url https://download.pytorch.org/whl/cu124
- pip install torch --index-url https://download.pytorch.org/whl/rocm6.2
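After installing, it can help to confirm which build actually landed in the environment. The helper below is a hypothetical sketch (describe_torch_build is not part of PyTorch); it only reads torch.__version__, torch.version.cuda, and torch.cuda.is_available(), and it returns early if torch is not importable.

```python
import importlib.util


def describe_torch_build():
    """Return a small dict describing the installed torch build, if any."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}

    import torch

    return {
        "installed": True,
        "version": str(torch.__version__),
        # None for builds compiled without CUDA (CPU-only and ROCm builds)
        "cuda_build": torch.version.cuda,
        # False on CPU-only builds, or when no usable GPU/driver is found
        "cuda_available": torch.cuda.is_available(),
    }


info = describe_torch_build()
print(info)
```

If cuda_build is None on a machine with an NVIDIA GPU, the likely cause is a CPU-only wheel, and reinstalling with one of the --index-url commands above is the usual remedy.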
Imports
- torch.load
  # For state_dict checkpoints (tensors in plain containers):
  model.load_state_dict(torch.load('model.pt', weights_only=True))
  # For checkpoints with arbitrary pickled objects (custom classes, numpy arrays):
  checkpoint = torch.load('checkpoint.pt', weights_only=False)  # only for trusted files
- device
  device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
  model = MyModel().to(device)
  tensor = tensor.to(device)
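To see the weights_only behavior without touching the filesystem, the sketch below round-trips checkpoints through an in-memory buffer. The helper save_to_buffer and the use of fractions.Fraction as a stand-in for "an object outside the allowlist" are assumptions made for illustration; they do not come from the original entry.

```python
import io
import pickle
from fractions import Fraction

import torch
import torch.nn as nn


def save_to_buffer(obj):
    """Serialize obj with torch.save into an in-memory buffer."""
    buffer = io.BytesIO()
    torch.save(obj, buffer)
    buffer.seek(0)
    return buffer


model = nn.Linear(4, 2)

# 1) A plain state_dict holds only tensors, so weights_only=True loads it.
state = torch.load(save_to_buffer(model.state_dict()),
                   map_location="cpu", weights_only=True)
restored = nn.Linear(4, 2)
restored.load_state_dict(state)

# 2) Fraction stands in for any object the restricted unpickler does not allow.
mixed = {"weights": model.state_dict(), "ratio": Fraction(1, 3)}
try:
    torch.load(save_to_buffer(mixed), weights_only=True)
    rejected = False
except pickle.UnpicklingError:
    rejected = True

# 3) Trusted data only: weights_only=False runs the full unpickler.
trusted = torch.load(save_to_buffer(mixed), weights_only=False)

print(rejected, trusted["ratio"])
```

Passing weights_only explicitly in every call, as done here, keeps behavior the same across PyTorch versions on either side of the 2.6 default change.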
Quickstart
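import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Device setup
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Simple model
model = nn.Sequential(
    nn.Linear(10, 64),
    nn.ReLU(),
    nn.Linear(64, 1)
).to(device)

# Placeholder random data so the example runs end to end; swap in your own dataset
dataloader = DataLoader(
    TensorDataset(torch.randn(256, 10), torch.randn(256, 1)),
    batch_size=32, shuffle=True)
test_x = torch.randn(8, 10)

# Training step
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
model.train()
for x, y in dataloader:
    x, y = x.to(device), y.to(device)  # inputs and targets on the model's device
    optimizer.zero_grad()              # clear gradients from the previous batch
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()

# Inference
model.eval()  # switch BatchNorm/Dropout to inference behavior
with torch.no_grad():
    predictions = model(test_x.to(device))

# Save / load
torch.save(model.state_dict(), 'model.pt')
model.load_state_dict(torch.load('model.pt', weights_only=True))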