Albumentations (Legacy - MIT Licensed)
Albumentations is a Python library for image augmentation, widely adopted in deep learning and computer vision tasks for its speed, flexibility, and extensive collection of transformations. It offers a unified API to work with various data types including images, masks, bounding boxes, and keypoints. **However, the original MIT-licensed Albumentations project is no longer actively maintained. The last update was in June 2025, and no further bug fixes, features, or compatibility updates will be provided.** For active development and support, users are directed to its successor, AlbumentationsX, which maintains the same API but operates under a dual AGPL-3.0 / Commercial license. The current version of this legacy library is 2.0.8.
Warnings
- breaking The original MIT-licensed Albumentations library (this package) is no longer actively maintained. The last update was in June 2025, and no further bug fixes, features, or compatibility updates will be provided. This means it may eventually break with newer Python, PyTorch, or TensorFlow versions.
- gotcha Reproducibility of augmentation sequences requires explicitly setting the `seed` parameter in `A.Compose`. Global seeds (`numpy.random.seed()`, `random.seed()`) do not affect Albumentations' internal random state. Additionally, using the same seed with different `num_workers` settings in a PyTorch `DataLoader` will produce different augmentation sequences.
- gotcha All inputs to Albumentations transforms (images, masks, bounding boxes, keypoints) must be NumPy arrays. Passing Python lists directly is not supported and will result in errors.
- gotcha When loading images using OpenCV (`cv2.imread`), they are typically loaded in BGR color space. Albumentations primarily expects images in RGB format. Passing BGR images directly to RGB-sensitive transforms will produce incorrect colors.
- gotcha Individual transforms in Albumentations typically require grayscale images to have an explicit channel dimension (e.g., shape `(H, W, 1)` instead of `(H, W)`). While `A.Compose` often provides convenience by handling both formats, it's best practice to ensure the channel dimension is present, especially when applying transforms directly.
Install
-
pip install albumentations
Imports
- Compose
from albumentations import Compose
- A
import albumentations as A
Quickstart
import albumentations as A
import cv2
import numpy as np
# Create a dummy image (256x256, 3 channels, uint8)
image = np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8)
# Define an augmentation pipeline
transform = A.Compose([
    A.RandomCrop(width=128, height=128, p=1.0),
    A.HorizontalFlip(p=0.5),
    A.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
# Apply the transform
transformed_data = transform(image=image)
transformed_image = transformed_data["image"]
print(f"Original image shape: {image.shape}")
print(f"Transformed image shape: {transformed_image.shape}")
print(f"Transformed image dtype: {transformed_image.dtype}")