Captum

0.8.0 · active · verified Tue Apr 14

Captum is an open-source model interpretability and understanding library for PyTorch. It provides a comprehensive suite of attribution algorithms to explain predictions of deep learning models, assess the importance of layers and neurons, and evaluate model robustness and concept influence. The library, currently at version 0.8.0, maintains an active development cycle with regular feature additions and improvements.

Warnings

Install
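Captum is published on PyPI and conda-forge; a typical install, assuming PyTorch is already present in the environment, looks like:

```shell
# From PyPI
pip install captum

# Or from conda-forge
conda install captum -c conda-forge
```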

Imports

Quickstart

This quickstart demonstrates how to apply Integrated Gradients, a popular feature attribution method, to a simple PyTorch `ToyModel`. It covers defining the model, preparing inputs and baselines, instantiating an attribution algorithm, and computing feature attributions.

import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# 1. Define a simple PyTorch model
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(3, 3)
        self.relu = nn.ReLU()
        self.lin2 = nn.Linear(3, 2)

    def forward(self, input):
        return self.lin2(self.relu(self.lin1(input)))

model = ToyModel()
model.eval() # Set model to evaluation mode

# 2. Define input and baseline tensors
input_tensor = torch.rand(2, 3, requires_grad=True)
baseline_tensor = torch.zeros(2, 3)

# 3. Instantiate an attribution algorithm (e.g., Integrated Gradients)
ig = IntegratedGradients(model)

# 4. Compute attributions
# target specifies the output index to explain (e.g., target=0 for the first output class)
attributions, delta = ig.attribute(
    input_tensor,
    baselines=baseline_tensor,
    target=0,
    return_convergence_delta=True,
)

print('Input Tensor:', input_tensor)
print('IG Attributions:', attributions)
print('Convergence Delta:', delta)
