TorchCodec
TorchCodec is a Python library developed by the PyTorch team for efficient video and audio decoding and encoding, tightly integrated with PyTorch tensors. It aims to simplify video and audio processing in machine learning workflows and supports both CPU and GPU operation. The current version is 0.11.0, with new releases approximately every 1-2 months.
Warnings
- gotcha TorchCodec releases are tied to specific PyTorch versions. Pairing TorchCodec with an incompatible PyTorch version can lead to runtime errors or unexpected behavior.
- gotcha The 'beta' CUDA backend (available from v0.8.0 onwards) has a hard dependency on `libnvcuvid.so`. On systems where that library is not present, this might cause import issues.
- gotcha On Windows, TorchCodec might fail to find FFmpeg even when FFmpeg is installed.
- gotcha For GPU decoding on Windows, `pip install` might not always provide the necessary CUDA-enabled binaries directly. The `conda-forge` channel often provides a more stable setup for GPU support.
- gotcha The faster CUDA decoder backend, introduced as 'beta' in v0.8.0, must be enabled explicitly to get the improved performance.
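Two of the warnings above concern native dependencies that may be missing or undiscoverable. The following is a rough diagnostic sketch using only the Python standard library; the helper name `check_native_dependencies` is hypothetical and not part of TorchCodec. It assumes that an `ffmpeg` executable on `PATH` is a useful hint, although TorchCodec relies on the FFmpeg shared libraries rather than the executable, so a positive result here does not guarantee that TorchCodec will load them.

```python
import ctypes.util
import shutil


def check_native_dependencies():
    """Report whether FFmpeg and libnvcuvid look discoverable on this system.

    Hypothetical helper for illustration only; not part of TorchCodec.
    """
    return {
        # Full path of the ffmpeg executable found on PATH, or None.
        # This is only a hint: the shared libraries may live elsewhere.
        "ffmpeg_executable": shutil.which("ffmpeg"),
        # Name or path of the NVIDIA video decoding library as resolved by
        # the platform's library lookup, or None if it cannot be found.
        "libnvcuvid": ctypes.util.find_library("nvcuvid"),
    }


if __name__ == "__main__":
    for name, location in check_native_dependencies().items():
        print(f"{name}: {location or 'not found'}")
```

A `not found` result for `libnvcuvid` suggests that the 'beta' CUDA backend may be affected on that machine; CPU decoding does not depend on it.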
Install
- pip install torchcodec
- conda install torchcodec -c conda-forge
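Because TorchCodec and PyTorch versions need to match, it can help to print both installed versions right after installation. The sketch below uses only `importlib.metadata` from the standard library; `installed_version` is a hypothetical helper name. It reports versions but does not decide whether the pair is compatible, which still has to be checked against the compatibility information published by the project.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Print the versions of the two packages that must be kept in sync.
for package in ("torch", "torchcodec"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```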
Imports
- VideoDecoder
from torchcodec.decoders import VideoDecoder
- AudioDecoder
from torchcodec.decoders import AudioDecoder
- VideoEncoder
from torchcodec.encoders import VideoEncoder
- AudioEncoder
from torchcodec.encoders import AudioEncoder
- transforms
from torchcodec.transforms import Resize, RandomCrop
- set_cuda_backend
from torchcodec.decoders import set_cuda_backend
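As an illustration of how `VideoDecoder` and `set_cuda_backend` from the list above might be combined, here is a minimal sketch that opts into the 'beta' CUDA backend when a GPU is available and falls back to CPU decoding otherwise. It assumes that `set_cuda_backend("beta")` is used as a context manager around decoder construction and that `VideoDecoder` accepts a `device` argument; verify both against the documentation of the installed version. The function name `open_decoder` is hypothetical.

```python
def open_decoder(path, prefer_gpu=True):
    """Open a VideoDecoder, using the beta CUDA backend when possible.

    Hypothetical helper. Imports are deferred so that defining the function
    does not require torch or torchcodec to be importable.
    """
    import torch
    from torchcodec.decoders import VideoDecoder, set_cuda_backend

    if prefer_gpu and torch.cuda.is_available():
        # Assumption: the backend is selected with a context manager that
        # wraps the construction of the decoder.
        with set_cuda_backend("beta"):
            return VideoDecoder(path, device="cuda")

    # CPU decoding is the default when no device is given.
    return VideoDecoder(path)
```

Deferring the imports keeps the failure, if any, at call time, which makes it easier to report a missing `libnvcuvid.so` or FFmpeg installation with a clear message.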
Quickstart
import os

from torchcodec.decoders import VideoDecoder

# Replace 'path/to/your/video.mp4' with the actual path to your video file.
# For demonstration, we use an environment variable or a placeholder.
video_path = os.environ.get('TORCHCODEC_DEMO_VIDEO', 'path/to/your/video.mp4')

try:
    decoder = VideoDecoder(video_path)
    print(f"VideoDecoder initialized for: {video_path}")
    # Stream properties are exposed on decoder.metadata, not on the decoder itself.
    print(f"Video width: {decoder.metadata.width}")
    print(f"Video height: {decoder.metadata.height}")
    # Uncomment the following lines to decode the first 10 frames:
    # frames = decoder.get_frames_in_range(start=0, stop=10)
    # print(f"Decoded frames with shape: {frames.data.shape}")
except FileNotFoundError:
    print(f"Error: Video file not found at '{video_path}'. Please replace with a valid path.")
except Exception as e:
    print(f"An error occurred during decoder initialization or usage: {e}")

print("\nTo decode frames, ensure a valid video_path is provided and uncomment the 'frames = decoder.get_frames_in_range...' line.")
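A common next step after the quickstart is to sample a fixed number of frames spread across the whole video instead of decoding a contiguous chunk. The sketch below shows one way to compute such indices in plain Python; `evenly_spaced_indices` is a hypothetical helper, not a TorchCodec function. The commented usage assumes `decoder.metadata.num_frames` is available for the file and that `get_frames_at(indices=...)` is used for index-based access.

```python
def evenly_spaced_indices(num_frames, count):
    """Return up to `count` frame indices spread evenly over [0, num_frames)."""
    if num_frames <= 0 or count <= 0:
        return []
    # Never ask for more frames than the video contains.
    count = min(count, num_frames)
    return [i * num_frames // count for i in range(count)]


print(evenly_spaced_indices(100, 5))  # [0, 20, 40, 60, 80]

# Hypothetical usage with a decoder opened as in the quickstart:
# indices = evenly_spaced_indices(decoder.metadata.num_frames, 8)
# batch = decoder.get_frames_at(indices=indices)
# print(batch.data.shape)
```

Computing the indices separately keeps the sampling policy independent of the decoder, so it can be checked without a video file.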