Kornia
Kornia is an open-source differentiable computer vision library built on PyTorch, providing a rich set of differentiable image processing and geometric vision algorithms. Its operators work directly on PyTorch tensors, so they can be placed inside training pipelines for tasks such as image transformation, augmentation, and filtering. At the time of writing the current version is 0.8.2; the project is actively maintained with regular updates and is shifting towards end-to-end vision models.
Warnings
- gotcha Kornia's keypoint and bounding box operations use float coordinates but often assume an underlying pixel-index convention, in which integer coordinates address pixel centers. Under a continuous convention, by contrast, the center of pixel (3, 4) lies at (3.5, 4.5). Mixing the two can introduce half-pixel or off-by-one errors when precise sub-pixel coordinates are expected, which may affect model accuracy or geometric transformations.
- gotcha While `pip install kornia` is the primary installation method, functionality such as `kornia.io.load_image` (for robust image I/O) relies internally on `kornia_rs`. If you encounter errors related to image loading, ensure `kornia_rs` is also installed (e.g., `pip install kornia kornia_rs`).
- gotcha When using `kornia.augmentation` for differentiable data augmentation, distinguish between `torch.nn.Parameter` (for parameters that should be optimized/differentiated) and `torch.Tensor` (for static parameters). Incorrect usage can lead to unintended optimization behavior in meta-learning or differentiable augmentation pipelines.
- breaking The latest Kornia releases may not be compatible with older versions of PyTorch. Specifically, Kornia 0.8.x requires PyTorch >=2.0.0.
Install
-
pip install kornia
Imports
- kornia
import kornia as K
- AugmentationSequential
from kornia.augmentation import AugmentationSequential
- rgb_to_grayscale
from kornia.color import rgb_to_grayscale
Quickstart
import torch
import kornia as K
import matplotlib.pyplot as plt  # only needed for the optional visualization below
# Simulate loading an image or create a dummy tensor
# In a real scenario, replace this with actual image loading (e.g., with Pillow/OpenCV)
# and conversion using K.image_to_tensor.
# Example with a dummy tensor:
image_tensor = torch.rand(1, 3, 256, 256, dtype=torch.float32)
# Convert image to grayscale
grayscale_tensor = K.color.rgb_to_grayscale(image_tensor)
# Apply a Gaussian blur
blurred_tensor = K.filters.gaussian_blur2d(grayscale_tensor, kernel_size=(7, 7), sigma=(1.5, 1.5))
print(f"Original image shape: {image_tensor.shape}")
print(f"Grayscale image shape: {grayscale_tensor.shape}")
print(f"Blurred image shape: {blurred_tensor.shape}")
# --- Optional: Visualization (requires matplotlib and numpy) ---
# Convert tensors to NumPy arrays for display.
# Detach from the autograd graph and move to the CPU before converting:
# grayscale_np = K.tensor_to_image(grayscale_tensor.detach().cpu())
# blurred_np = K.tensor_to_image(blurred_tensor.detach().cpu())
# fig, axs = plt.subplots(1, 2, figsize=(10, 5))
# axs[0].imshow(grayscale_np, cmap='gray')  # single-channel data needs an explicit gray colormap
# axs[0].set_title('Grayscale Image')
# axs[0].axis('off')
# axs[1].imshow(blurred_np, cmap='gray')
# axs[1].set_title('Blurred Image')
# axs[1].axis('off')
# plt.show()