{"id":7638,"library":"pytorch-revgrad","title":"PyTorch Gradient Reversal Layer","description":"pytorch-revgrad is a minimalist PyTorch package that provides a gradient reversal layer (GRL) as both a module and a function. This layer is commonly used in domain adaptation techniques, such as Domain-Adversarial Neural Networks (DANN), to encourage feature extractors to learn domain-invariant representations by reversing the gradient signal for a subsequent domain classifier. The current version, `0.2.0`, was released in January 2021, and the library maintains a low release cadence, indicating stability for its core functionality.","status":"maintenance","version":"0.2.0","language":"en","source_language":"en","source_url":"https://github.com/janfreyberg/pytorch-revgrad","tags":["pytorch","gradient-reversal","domain-adaptation","machine-learning","adversarial-learning"],"install":[{"cmd":"pip install pytorch-revgrad","lang":"bash","label":"PyPI"}],"dependencies":[{"reason":"This library is a PyTorch module/function and requires PyTorch for all operations.","package":"torch","optional":false}],"imports":[{"symbol":"RevGrad","correct":"from pytorch_revgrad import RevGrad"}],"quickstart":{"code":"import torch\nfrom torch import nn\nfrom pytorch_revgrad import RevGrad\n\n# Define a simple feature extractor\nclass FeatureExtractor(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.fc1 = nn.Linear(10, 5)\n\n    def forward(self, x):\n        return torch.relu(self.fc1(x))\n\n# Define a domain classifier with a RevGrad layer\nclass DomainClassifier(nn.Module):\n    def __init__(self):\n        super().__init__()\n        self.revgrad = RevGrad()\n        self.fc1 = nn.Linear(5, 5)\n        self.fc2 = nn.Linear(5, 1)\n\n    def forward(self, x):\n        x = self.revgrad(x)\n        x = torch.relu(self.fc1(x))\n        return torch.sigmoid(self.fc2(x))\n\n# Example usage\nfeature_extractor = FeatureExtractor()\ndomain_classifier = 
DomainClassifier()\n\ninput_data = torch.randn(64, 10, requires_grad=True)\n\n# Forward pass\nfeatures = feature_extractor(input_data)\ndomain_output = domain_classifier(features)\n\nprint(f\"Input shape: {input_data.shape}\")\nprint(f\"Features shape: {features.shape}\")\nprint(f\"Domain output shape: {domain_output.shape}\")\n\n# Simulate a loss and backward pass (conceptual)\n# In a real scenario, you'd define a combined loss for source and target,\n# and optimize both feature_extractor and domain_classifier.\n# For demonstration, we'll just show a dummy backward pass.\n\ndummy_loss = domain_output.mean()\ndummy_loss.backward()\n\n# Check that gradients reached the input and both sub-networks' parameters.\n# Gradients flowing back through RevGrad into the feature extractor are negated.\nprint(f\"Gradient for input data exists: {input_data.grad is not None}\")\nprint(f\"Gradient for feature_extractor.fc1.weight exists: {feature_extractor.fc1.weight.grad is not None}\")\nprint(f\"Gradient for domain_classifier.fc1.weight exists: {domain_classifier.fc1.weight.grad is not None}\")","lang":"python","description":"This quickstart demonstrates how to integrate `RevGrad` into a simple PyTorch architecture for domain adaptation. `RevGrad` sits at the front of the `DomainClassifier`, so gradients from the domain loss are reversed before they reach the `FeatureExtractor`."},"warnings":[{"fix":"Place the `RevGrad` layer at the input of a sub-network (e.g., a domain classifier) that is trained normally on its own loss; the reversal then affects only the gradients flowing further back into shared layers (e.g., a feature extractor), which are pushed to unlearn the domain signal.","message":"Placing the `RevGrad` layer directly before a loss function can lead to exploding gradients and `NaN` losses. 
The layer reverses gradients, so with no trainable layers between the reversal and the loss, every upstream parameter receives a negated gradient: optimization effectively ascends that loss, and the branch quickly diverges.","severity":"gotcha","affected_versions":"All"},{"fix":"Treat this as a tooling limitation: exclude the `backward` methods of custom autograd functions from coverage reports, or rely on functional correctness tests rather than line-by-line coverage for these paths.","message":"When testing custom `torch.autograd.Function` implementations, like `RevGrad`, `coverage.py` might not report coverage for the `backward` method. This is because PyTorch's autograd engine calls the backward pass through C++ internals, which `coverage.py`'s Python tracing cannot detect.","severity":"gotcha","affected_versions":"All"},{"fix":"Carefully manage graph retention and avoid in-place operations on tensors that require gradients unless the surrounding code is explicitly designed for them. If `loss.backward()` must be called multiple times on the same graph, pass `retain_graph=True` on the intermediate calls, or rebuild the graph for each backward pass.","message":"As with other custom PyTorch `autograd.Function` implementations, improper handling of computational graphs (e.g., calling `.backward()` multiple times without `retain_graph=True` when needed, or modifying tensors in-place that are part of the graph) can lead to `RuntimeError`s.","severity":"gotcha","affected_versions":"All"}],"env_vars":null,"last_verified":"2026-04-16T00:00:00.000Z","next_check":"2026-07-15T00:00:00.000Z","problems":[{"fix":"Ensure all relevant tensors and the model are moved to the same device (e.g., `model.to(device)`, `input_data = input_data.to(device)`; note that `Tensor.to` returns a new tensor rather than moving it in place) before computation. 
This applies to `RevGrad` inputs as well.","cause":"A common PyTorch error indicating that tensors involved in an operation are on different devices (e.g., model on GPU, input data on CPU).","error":"RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0!"},{"fix":"Review the placement of the `RevGrad` layer. It should typically sit after a shared feature extractor and before a domain-specific classifier, so the feature extractor learns from both standard and reversed gradients without immediate instability. Adjust learning rates or add gradient clipping if necessary.","cause":"Often due to the `RevGrad` layer being positioned such that it causes exploding gradients for the preceding layers, leading to numerical instability.","error":"Loss becomes NaN during training after a few iterations."},{"fix":"Verify that `requires_grad=True` is set for all tensors whose gradients are needed (e.g., model parameters, or inputs if testing gradient flow). Ensure that operations are not inadvertently enclosed in `torch.no_grad()` if gradients are required for those computations.","cause":"This typically occurs when `.grad` is accessed on a tensor without `requires_grad=True`, when the computational graph has been detached, or when operations were accidentally run inside a `torch.no_grad()` context.","error":"AttributeError: 'NoneType' object has no attribute 'grad_fn' (or similar errors related to .grad being None)"}]}