{"id":5305,"library":"lpips","title":"LPIPS (Learned Perceptual Image Patch Similarity)","description":"LPIPS is a Python library that implements the Learned Perceptual Image Patch Similarity metric. The metric measures the similarity between two images in a way that aligns more closely with human perception than traditional metrics such as MSE or SSIM, by comparing deep features extracted from pre-trained convolutional neural networks (AlexNet, VGG, or SqueezeNet). The library is currently at version 0.1.4 and is maintained through its GitHub repository, with releases tied to significant updates. It is commonly used in image generation and restoration tasks to evaluate perceptual quality or as a perceptual loss function.","status":"active","version":"0.1.4","language":"en","source_language":"en","source_url":"https://github.com/richzhang/PerceptualSimilarity","tags":["image-similarity","perceptual-metric","deep-learning","pytorch","computer-vision"],
"install":[{"cmd":"pip install lpips","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Core deep learning framework for model execution.","package":"torch","optional":false},{"reason":"Provides pre-trained models and image transformations used by LPIPS.","package":"torchvision","optional":false}],"imports":[{"symbol":"LPIPS","correct":"import lpips\nloss_fn = lpips.LPIPS()"}],
"quickstart":{"code":"import torch\nimport lpips\n\n# Initialize the LPIPS model; 'alex' is the default backbone.\n# 'alex' is recommended for evaluation (best forward scores);\n# 'vgg' is closer to a traditional perceptual loss for optimization.\nloss_fn_alex = lpips.LPIPS(net='alex')\n\n# Create two dummy images (batch_size, channels, height, width).\n# IMPORTANT: inputs must be 3-channel RGB tensors normalized to [-1, 1].\nimg0 = torch.rand(1, 3, 64, 64) * 2 - 1  # random image 1, in [-1, 1]\nimg1 = torch.rand(1, 3, 64, 64) * 2 - 1  # random image 2, in [-1, 1]\n\n# Compute the LPIPS distance\nd = loss_fn_alex(img0, img1)\nprint(f\"LPIPS distance: {d.item():.4f}\")\n\n# VGG backbone, often preferred as a 'perceptual loss' during training\nloss_fn_vgg = lpips.LPIPS(net='vgg')\nd_vgg = loss_fn_vgg(img0, img1)\nprint(f\"LPIPS distance (VGG): {d_vgg.item():.4f}\")","lang":"python","description":"Imports the library, initializes the `LPIPS` model with different backbone networks ('alex' or 'vgg'), and computes the perceptual distance between two randomly generated PyTorch tensors. Note the requirement that inputs be 3-channel RGB and normalized to the range `[-1, 1]`."},
"warnings":[{"fix":"For `v0.1` and later, keep the default `lpips=True` or upgrade to the latest version. If replicating results from papers that used `v0.0`, construct the model with `version='0.0'` for compatibility.","message":"A bug in the initial `v0.0` release caused inputs not to be scaled, producing results that differ from the paper. This was fixed in `v0.1` and later versions, where scaling is applied by default.","severity":"breaking","affected_versions":"v0.0 (and later versions if `version='0.0'` is explicitly set)."},{"fix":"Always ensure input tensors have shape `(N, 3, H, W)` with pixel values in `[-1, 1]`. Convert `[0, 1]` images with `img * 2 - 1`, or pass `normalize=True` to the forward call to have the model rescale `[0, 1]` inputs internally; grayscale `(N, 1, H, W)` images can be tiled to `(N, 3, H, W)` if appropriate for your use case.","message":"Input images for LPIPS must be 3-channel RGB PyTorch tensors normalized to the range `[-1, 1]`. Incorrect normalization or channel dimensions (e.g., a `[0, 1]` range or grayscale input) will produce incorrect or misleading similarity scores.","severity":"gotcha","affected_versions":"All versions."},{"fix":"When using LPIPS as a loss function in a generative model or similar optimization task, initialize with `lpips.LPIPS(net='vgg')`. For pure evaluation, `net='alex'` is generally sufficient.","message":"The default network `net='alex'` is optimized for the best *forward* scores (evaluating similarity). For use as a perceptual loss in optimization/backpropagation, `net='vgg'` is often recommended because it is closer to traditional perceptual loss functions.","severity":"gotcha","affected_versions":"All versions."},{"fix":"Be aware of this limitation when using LPIPS in security-sensitive contexts or for evaluating adversarial robustness. If robustness is critical, consider alternative implementations such as E-LPIPS or R-LPIPS.","message":"LPIPS models, being based on deep neural networks, are susceptible to adversarial attacks: small, imperceptible perturbations can significantly alter the LPIPS score, causing perceptually similar images to be judged as very different by the metric. Variants such as E-LPIPS and R-LPIPS address this but are not part of the core `lpips` package.","severity":"gotcha","affected_versions":"All versions of the core `lpips` library."},{"fix":"Reduce the batch size, lower the image resolution, or use Automatic Mixed Precision (AMP) if supported by your PyTorch setup.","message":"Running LPIPS, especially with the 'vgg' backbone, can consume significant GPU memory, leading to out-of-memory errors with large batch sizes or high-resolution images.","severity":"gotcha","affected_versions":"All versions, particularly with the 'vgg' network and large inputs."}],
"env_vars":null,"last_verified":"2026-04-13T00:00:00.000Z","next_check":"2026-07-12T00:00:00.000Z"}