DCTorch: Discrete Cosine Transforms for PyTorch
DCTorch is a Python library providing fast discrete cosine transform (DCT) and inverse discrete cosine transform (IDCT) implementations optimized for PyTorch tensors. It enables efficient frequency-domain analysis and manipulation within deep learning models and supports 2D and 3D transforms. The current version is 0.1.2, and it appears to be actively maintained with recent commits.
Common errors
- RuntimeError: Expected object of scalar type Float but got scalar type Long for argument 'input'
  cause The input tensor provided to a dctorch function has an integer data type (e.g., `torch.long`), but dctorch expects floating-point numbers.
  fix Convert the input tensor to a floating-point type: `x = x.to(torch.float32)` or `x = x.float()`.
- NotImplementedError: The operator 'aten::fft_rfft' is not currently implemented for the MPS device.
  cause dctorch internally relies on PyTorch's FFT operations, which may not be fully supported on all PyTorch devices (e.g., Apple's MPS backend for certain FFT functions).
  fix Run the operation on a supported device (CPU or CUDA). Move your tensor to a different device: `tensor = tensor.to('cpu')` or `tensor = tensor.to('cuda')`.
- AttributeError: module 'dctorch' has no attribute 'dct'
  cause The user is trying to call a generic `dct` function, but `dctorch` provides specific functions for 2D (`dct_2d`) and 3D (`dct_3d`) transforms.
  fix Use the correct function name: `dctorch.dct_2d()` for 2D transforms or `dctorch.dct_3d()` for 3D transforms.
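The first two fixes above can be combined into one small guard that runs before any transform call. The sketch below is only an illustration: `prepare_for_dct` is a hypothetical helper name, not part of dctorch, and the fallback from MPS to CPU assumes the FFT operator is unavailable on that backend in your PyTorch build.

```python
import torch


def prepare_for_dct(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper (not part of dctorch): make a tensor safe to transform."""
    # Tensors that are not floating-point (e.g. integer or bool) become float32;
    # float16/float32/float64 tensors keep their dtype.
    if not torch.is_floating_point(x):
        x = x.to(torch.float32)
    # Assumption: FFT ops may be missing on the MPS backend, so fall back to CPU.
    if x.device.type == "mps":
        x = x.to("cpu")
    return x


pixels = torch.arange(12).reshape(3, 4)  # torch.arange yields int64 (Long) here
ready = prepare_for_dct(pixels)
print(ready.dtype, ready.device)
```

Converting inside a helper keeps the dtype and device decisions in one place, so the call sites can pass `ready` to the transform without repeating the checks.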
Warnings
- gotcha Performance heavily depends on PyTorch's underlying CUDA/CPU implementation. Ensure PyTorch is correctly installed and configured for your hardware for optimal speed, especially with large tensors.
- gotcha DCTorch functions assume input tensors are floating-point types (e.g., `torch.float32`, `torch.float64`). Passing integer tensors will result in type errors or unexpected behavior.
- gotcha DCT and IDCT are often applied to square inputs in traditional signal processing, but `dct_2d` and `idct_2d` also work with rectangular inputs. To reconstruct the original signal, the tensor passed to `idct_2d` must have the same spatial dimensions (height, width) as the input of the corresponding `dct_2d` call.
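The rectangular-input point can be checked independently of dctorch with a reference transform. The following is a minimal sketch that builds an orthonormal DCT-II as explicit matrix products in plain PyTorch. The names `dct_matrix`, `ref_dct_2d`, and `ref_idct_2d` are hypothetical, and the orthonormal scaling is an assumption made for this sketch; it is not claimed to match dctorch's normalization.

```python
import math

import torch


def dct_matrix(n: int) -> torch.Tensor:
    """Orthonormal DCT-II matrix of size n x n (reference construction, not dctorch)."""
    k = torch.arange(n, dtype=torch.float64).unsqueeze(1)  # frequency index (rows)
    i = torch.arange(n, dtype=torch.float64).unsqueeze(0)  # sample index (columns)
    c = torch.cos(math.pi * (i + 0.5) * k / n) * math.sqrt(2.0 / n)
    c[0] /= math.sqrt(2.0)  # rescale the DC row so that c @ c.T is the identity
    return c


def ref_dct_2d(x: torch.Tensor) -> torch.Tensor:
    """Transform the last two dimensions (height, width); they may differ."""
    h, w = x.shape[-2:]
    return dct_matrix(h) @ x @ dct_matrix(w).T


def ref_idct_2d(y: torch.Tensor) -> torch.Tensor:
    """Invert ref_dct_2d; the matrices are rebuilt from the shape of y."""
    h, w = y.shape[-2:]
    return dct_matrix(h).T @ y @ dct_matrix(w)


x = torch.randn(2, 3, 4, 6, dtype=torch.float64)  # rectangular: height 4, width 6
y = ref_dct_2d(x)
x_recon = ref_idct_2d(y)
max_err = (x - x_recon).abs().max().item()
print(y.shape, max_err)
```

Because the inverse derives its matrices from the height and width of its argument, cropping or padding the coefficients between the two calls changes which transform is inverted, which is why the dimensions must match for a faithful round trip.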
Install
- pip install dctorch
Imports
- dct_2d
from dctorch import dct_2d
- idct_2d
from dctorch import idct_2d
- dct_3d
from dctorch import dct_3d
- idct_3d
from dctorch import idct_3d
Quickstart
import torch
from dctorch import dct_2d, idct_2d
# Determine device (CPU or CUDA if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
# Create a random tensor (Batch size, Channels, Height, Width)
x = torch.randn(1, 3, 224, 224, device=device)
print(f"Original tensor shape: {x.shape}")
# Perform 2D Discrete Cosine Transform
y = dct_2d(x)
print(f"DCT transformed tensor shape: {y.shape}")
# Perform Inverse 2D Discrete Cosine Transform
x_recon = idct_2d(y)
print(f"Reconstructed tensor shape: {x_recon.shape}")
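# Verify reconstruction accuracy using the largest per-element deviation.
# The L2 norm over all 150,528 elements grows with tensor size, so a fixed
# 1e-4 bound on it can fail in float32 even when the round trip is accurate.
max_abs_error = (x - x_recon).abs().max().item()
print(f"Max absolute reconstruction error: {max_abs_error:.2e}")
# Note: the error is typically very small, limited by floating-point precision
assert max_abs_error < 1e-4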