Grad-CAM (PyTorch)

1.5.5 verified Sat May 09 auth: no python

Grad-CAM (PyPI package `grad-cam`, import name `pytorch_grad_cam`) is a PyTorch library for generating Class Activation Maps (CAMs) for image classification, segmentation, object detection, and other vision models. The current version, 1.5.5, supports many CAM methods (GradCAM, GradCAM++, HiResCAM, etc.) and runs on Python >=3.8. Releases are published periodically by the project's author.

```
pip install grad-cam
```

Common errors

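error TypeError: GradCAM.__init__() got an unexpected keyword argument 'use_cuda'

cause Version 1.5.0 removed the `use_cuda` parameter. The CAM object now runs on whatever device the model's parameters are on.

fix

Drop use_cuda=True and move the model to the GPU before constructing the CAM object, e.g. model.to('cuda'). Check the constructor signature of your installed version before passing a `device` keyword.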
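A minimal sketch of this migration, assuming the CAM object follows the device of the model's parameters (our understanding of 1.5.x, worth confirming against your installed version). `build_cam` is a hypothetical helper, not part of the library; the CAM class is passed in as an argument so the same helper covers `GradCAM`, `HiResCAM`, and the other classes.

```python
def build_cam(cam_cls, model, target_layers, device="cpu"):
    """Move the model to `device`, then construct the CAM object.

    Hypothetical helper: no `use_cuda` or other device keyword is passed
    to the constructor; the device is chosen by placing the model on it.
    """
    # nn.Module.to() and nn.Module.eval() both return the module itself.
    model = model.to(device).eval()
    return cam_cls(model=model, target_layers=target_layers)
```

With the real library this could be called as `build_cam(GradCAM, model, [model.layer4[-1]], device='cuda')`.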

error RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 224 but got size 224.

cause Input tensor spatial dimensions don't match the input size the model expects (e.g., a model built for 224x224 inputs). In a real traceback the two reported sizes differ.

fix

Resize the input image to the size the model expects (e.g., 224x224) before preprocessing, or adapt the model to the input size.

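error AttributeError: module 'pytorch_grad_cam' has no attribute 'GradCAM'

cause Python found a module named `pytorch_grad_cam`, but it is not the installed library, or the install is broken. A common cause is a local file or folder named `pytorch_grad_cam` shadowing the package. Note that the package is installed as `grad-cam` but imported as `pytorch_grad_cam`.

fix

Rename or remove any local file or folder named pytorch_grad_cam, reinstall with pip install grad-cam if needed, then use the correct import: from pytorch_grad_cam import GradCAM.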
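To see which file Python actually resolves for `pytorch_grad_cam`, a small diagnostic along these lines may help. It uses only the standard library; `locate_module` is a hypothetical helper name introduced for illustration.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        # No module with this name is importable at all.
        return None
    return spec.origin


# A path under site-packages suggests the installed library; a path inside
# your own project suggests a local file or folder is shadowing it.
print(locate_module("pytorch_grad_cam"))
```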

error ValueError: target_layers must be a list of nn.Module layers.

cause Passing a single layer instead of a list in versions >=1.5.0.

fix

Wrap the target layer in a list: target_layers=[model.layer4[-1]].
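When calling code may receive either a single layer or a list, a small normalizing helper can guard against this error. This is a sketch only: `as_layer_list` is a hypothetical name, not a library function.

```python
def as_layer_list(target_layers):
    """Return target_layers as a list, wrapping a single layer if needed."""
    if isinstance(target_layers, (list, tuple)):
        return list(target_layers)
    # Anything else (including a container module such as nn.Sequential)
    # is treated as one layer and wrapped.
    return [target_layers]
```

It would be used as `GradCAM(model=model, target_layers=as_layer_list(layer))`.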

Warnings

breaking In version 1.5.x, the `target_layers` argument must be a list of layer objects, not a single layer. Passing a single layer will raise an error.

fix Always wrap target layers in a list: e.g., `target_layers=[model.layer4[-1]]`.

deprecated The `use_cuda` parameter was removed in version 1.5.0; passing `use_cuda=True` raises the TypeError listed under Common errors. Older releases accepted it.

fix Remove `use_cuda=True` and place the model on the target device instead, e.g. `model.to('cuda')`.

gotcha Import path is `pytorch_grad_cam`, not `grad_cam` or `gradcam`. The PyPI package name is `grad-cam`.

fix Use `from pytorch_grad_cam import ...`.

gotcha The `input_tensor` parameter expects a tensor with a batch dimension. If you pass a single image tensor without one, you'll get unexpected shape errors.

fix Ensure input tensor has shape (1, C, H, W). Use `input_tensor.unsqueeze(0)` if needed.
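The batch-dimension check can be sketched as a small guard. `ensure_batched` is a hypothetical helper; it relies only on the `ndim` attribute and the `unsqueeze` method, both of which `torch.Tensor` provides.

```python
def ensure_batched(input_tensor):
    """Return input_tensor with shape (N, C, H, W), adding N=1 if missing."""
    if input_tensor.ndim == 3:
        # (C, H, W) -> (1, C, H, W)
        return input_tensor.unsqueeze(0)
    if input_tensor.ndim != 4:
        raise ValueError(
            f"expected a 3D or 4D image tensor, got {input_tensor.ndim} dimensions"
        )
    return input_tensor
```

Calling `cam(input_tensor=ensure_batched(x))` then accepts both a single image and a batch.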

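gotcha Call `model.eval()` before CAM generation. In training mode, batch norm and dropout layers change the activations and gradients, so the resulting maps are unreliable.

fix Always call `model.eval()` first. Do not wrap the CAM call itself in `torch.no_grad()`, because gradient-based methods such as GradCAM need autograd; `torch.no_grad()` is fine for ordinary inference outside CAM generation.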

Imports

GradCAM
wrong
```
from grad_cam import GradCAM
```
correct
```
from pytorch_grad_cam import GradCAM
```
The package name is 'grad-cam' but the import module is 'pytorch_grad_cam'.

GradCAMPlusPlus

```
from pytorch_grad_cam import GradCAMPlusPlus
```

HiResCAM
```
from pytorch_grad_cam import HiResCAM
```
ScoreCAM
```
from pytorch_grad_cam import ScoreCAM
```
LayerCAM
```
from pytorch_grad_cam import LayerCAM
```

utils

wrong
```
from pytorch_grad_cam import show_cam_on_image
```
correct
```
from pytorch_grad_cam.utils.image import show_cam_on_image, preprocess_image
```
Utility functions are in the 'pytorch_grad_cam.utils.image' submodule.

Quickstart

Demonstrates loading a pretrained ResNet50, creating a GradCAM object with the final convolutional layer, and generating a CAM visualization on an input image.

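```
import cv2
import numpy as np
import torchvision
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.image import show_cam_on_image, preprocess_image

def get_cam(model, image_path, target_layer):
    # Load the image; cv2.imread returns None instead of raising on a bad path
    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"Could not read image: {image_path}")
    # OpenCV loads BGR; convert to RGB and resize to the model's input size
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_resized = cv2.resize(image, (224, 224))
    # Float RGB image in [0, 1], required by show_cam_on_image for the overlay
    image_normalized = image_resized.astype(np.float32) / 255.0
    # Convert to a normalized tensor with a batch dimension: (1, 3, 224, 224)
    input_tensor = preprocess_image(image_resized, mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

    # Create CAM object
    cam = GradCAM(model=model, target_layers=[target_layer])

    # Generate CAM mask; with no targets given, the highest-scoring class is used
    grayscale_cam = cam(input_tensor=input_tensor)[0, :]
    visualization = show_cam_on_image(image_normalized, grayscale_cam, use_rgb=True)
    return visualization

# Example usage (torchvision >= 0.13; older releases use pretrained=True instead of weights):
weights = torchvision.models.ResNet50_Weights.IMAGENET1K_V1
model = torchvision.models.resnet50(weights=weights).eval()
target_layer = model.layer4[-1]
vis = get_cam(model, 'path/to/image.jpg', target_layer)
```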