{"id":8564,"library":"pytorch","title":"PyTorch","description":"PyTorch is an open-source machine learning framework that accelerates the path from research prototyping to production deployment. It provides two high-level features: Tensor computation (like NumPy) with strong GPU acceleration, and a deep neural network library built on a tape-based autograd system. PyTorch is distributed on PyPI as the `torch` package (current version 2.2.2), typically installed together with the companion `torchvision` and `torchaudio` libraries. Note that the PyPI name is `torch`, not `pytorch`: running `pip install pytorch` fails with an error directing you to install `torch` instead. PyTorch releases stable versions several times a year, with patch releases in between.","status":"active","version":"2.2.2","language":"en","source_language":"en","source_url":"https://github.com/pytorch/pytorch","tags":["machine-learning","deep-learning","neural-networks","gpu-acceleration","tensor-computing","ai"],"install":[{"cmd":"pip install torch torchvision torchaudio","lang":"bash","label":"Install the default PyTorch build for your platform"}],"dependencies":[{"reason":"Core PyTorch library.","package":"torch","optional":false},{"reason":"Computer vision library for PyTorch.","package":"torchvision","optional":false},{"reason":"Audio processing library for PyTorch.","package":"torchaudio","optional":false},{"reason":"Often used for data preprocessing and interoperability, implicitly relied upon.","package":"numpy","optional":true}],"imports":[{"symbol":"torch","correct":"import torch"},{"symbol":"nn","correct":"import torch.nn as nn"},{"symbol":"optim","correct":"import torch.optim as optim"},{"symbol":"DataLoader","correct":"from torch.utils.data import DataLoader, TensorDataset"}],"quickstart":{"code":"import torch\nimport torch.nn as nn\nimport torch.optim as optim\nfrom torch.utils.data import TensorDataset, DataLoader\n\n# 1. 
Prepare Data\nx_data = torch.randn(100, 1)\ny_data = 2 * x_data + 1 + torch.randn(100, 1) * 0.1 # y = 2x + 1 + noise\n\n# Create a Dataset and DataLoader\ndataset = TensorDataset(x_data, y_data)\ndataloader = DataLoader(dataset, batch_size=10, shuffle=True)\n\n# 2. Define Model\nclass LinearRegression(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.linear = nn.Linear(1, 1) # One input feature, one output feature\n\n    def forward(self, x):\n        return self.linear(x)\n\nmodel = LinearRegression()\n\n# 3. Define Loss and Optimizer\ncriterion = nn.MSELoss() # Mean Squared Error Loss\noptimizer = optim.SGD(model.parameters(), lr=0.01) # Stochastic Gradient Descent\n\n# 4. Train the Model\nnum_epochs = 100\nfor epoch in range(num_epochs):\n    for batch_x, batch_y in dataloader:\n        # Forward pass\n        outputs = model(batch_x)\n        loss = criterion(outputs, batch_y)\n\n        # Backward and optimize\n        optimizer.zero_grad()\n        loss.backward()\n        optimizer.step()\n\n    if (epoch+1) % 10 == 0:\n        print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')\n\n# 5. Make Predictions (disable gradient tracking for inference)\nwith torch.no_grad():\n    predicted_value = model(torch.tensor([[5.0]]))\nprint(f\"\\nPredicted value for x=5.0: {predicted_value.item():.4f}\")\nprint(f\"Learned parameters: Weight={model.linear.weight.item():.4f}, Bias={model.linear.bias.item():.4f}\")","lang":"python","description":"This quickstart demonstrates a simple linear regression model in PyTorch. It covers defining a dataset and DataLoader, creating a neural network module, setting up a loss function and optimizer, and running a basic training loop. It uses randomly generated data for a simple y = 2x + 1 relationship, and wraps the final prediction in `torch.no_grad()` since gradients are not needed for inference."},"warnings":[{"fix":"Use `torch.Tensor` directly. All tensors automatically track history if `requires_grad=True`.","message":"The `torch.autograd.Variable` class was deprecated and is now effectively an alias for `torch.Tensor`. 
Direct tensor operations now support autograd automatically.","severity":"breaking","affected_versions":"<0.4.0 (breaking), 0.4.0-1.x (deprecated), 2.x+ (removed/alias)"},{"fix":"For inference or operations where gradients are not needed, use the `with torch.no_grad():` context manager.","message":"The `volatile=True` argument for tensors was deprecated and removed. It was used to signal that computations in a graph should not track gradients (e.g., during inference).","severity":"deprecated","affected_versions":"<0.4.0 (used), 0.4.0-1.x (deprecated), 2.x+ (removed)"},{"fix":"Use `tensor.item()` to extract a Python scalar from a single-element tensor; for multi-element tensors, use `tensor.tolist()` or index into the tensor first.","message":"Extracting a scalar value from a single-element tensor using `float(tensor)` or `int(tensor)` will raise a runtime error if the tensor has more than one element.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Consult the official PyTorch website (pytorch.org/get-started/locally/) to obtain the correct `pip` command for your specific operating system, CUDA version, and Python environment.","message":"The PyPI package for PyTorch is `torch`, and a plain `pip install torch` installs the default wheel for your platform, which may not match your GPU setup. For a specific CUDA build, use `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cuXXx`, where `cuXXx` specifies your CUDA version.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Define a `device` variable (e.g., `device = 'cuda' if torch.cuda.is_available() else 'cpu'`) and consistently use `tensor.to(device)` or `model.to(device)` for all relevant components.","message":"Be careful when moving models or tensors between CPU and GPU devices. 
Using `.cuda()` on tensors/models will move them to the default GPU, but `.to(device)` is more flexible.","severity":"gotcha","affected_versions":"All versions"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Reduce batch size, decrease model complexity, use `torch.no_grad()` for inference, or call `torch.cuda.empty_cache()` (though often not sufficient alone if the core issue is size).","cause":"The GPU ran out of memory, usually due to a batch size that is too large, a model with too many parameters, or not releasing unused tensors.","error":"RuntimeError: CUDA out of memory."},{"fix":"Ensure all tensors involved in an operation are on the same device. Use `tensor.to(device)` to move tensors, where `device` is typically 'cpu' or 'cuda'.","cause":"Operations attempting to combine tensors located on different devices (e.g., one on CPU, another on GPU).","error":"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!"},{"fix":"Inspect the `.shape` of all involved tensors. Use `tensor.view()`, `tensor.permute()`, `tensor.squeeze()`, `tensor.unsqueeze()`, or ensure model input/output dimensions are correct.","cause":"Shape mismatch between tensors, often in loss calculations, concatenation, or model input/output.","error":"RuntimeError: The size of tensor a (X) must match the size of tensor b (Y) at non-singleton dimension Z"},{"fix":"Convert tensor data types explicitly using `tensor.to(torch.float32)` or `tensor.long()`, `tensor.double()`, etc. Ensure `DataLoader` outputs the correct types.","cause":"A tensor operation or module expects a specific data type (e.g., `torch.float32`) but receives a different one (e.g., `torch.int64`). Common with loss functions or model inputs.","error":"RuntimeError: expected scalar type Float but found Long"}]}