VSA - Video Sparse Attention Kernel
VSA (Video Sparse Attention) is a CUDA kernel for efficient sparse attention in video diffusion models, part of the FastVideo library. The current version is 0.0.5 on PyPI and 0.1.7 on GitHub; development is active with frequent releases. Requires Python >=3.10 and the CUDA Toolkit.
pip install vsa

Common errors
error ModuleNotFoundError: No module named 'vsa'
cause VSA not installed or installed incorrectly.
fix Run: pip install vsa
error RuntimeError: CUDA error: no kernel image is available for execution on the device
cause The VSA kernel was compiled for a different CUDA architecture than the GPU supports.
fix Set the TORCH_CUDA_ARCH_LIST environment variable before installation, e.g. export TORCH_CUDA_ARCH_LIST="8.0" for Ampere.
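To choose the right value for TORCH_CUDA_ARCH_LIST, map the GPU's compute capability to an arch string. A minimal sketch (the helper name `arch_string` is illustrative, not part of VSA):

```python
# Convert a CUDA compute-capability tuple (major, minor) into the
# string format expected by TORCH_CUDA_ARCH_LIST, e.g. (8, 0) -> "8.0".
def arch_string(capability):
    major, minor = capability
    return f"{major}.{minor}"

# On a machine with torch and a GPU you could query the device directly:
#   import torch
#   print(arch_string(torch.cuda.get_device_capability(0)))
print(arch_string((8, 0)))  # Ampere A100 -> "8.0"
```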
error ImportError: libcuda.so: cannot open shared object file
cause CUDA driver library not found.
fix Install the NVIDIA drivers and ensure LD_LIBRARY_PATH includes the CUDA library path.
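To check whether the driver library is visible to the dynamic loader before debugging further, the standard library can search the loader paths (including LD_LIBRARY_PATH on Linux). A small diagnostic sketch:

```python
import ctypes.util

# find_library searches the system loader paths; None means libcuda
# is not discoverable, so the import of VSA would fail the same way.
path = ctypes.util.find_library("cuda")
if path is None:
    print("libcuda not found - install NVIDIA drivers or extend LD_LIBRARY_PATH")
else:
    print(f"libcuda found: {path}")
```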
Warnings
breaking PyPI version 0.0.5 is outdated and may have API incompatibilities with the latest GitHub releases.
fix Install directly from GitHub or wait for a new PyPI release: pip install git+https://github.com/hao-ai-lab/FastVideo.git#subdirectory=csrc/attn/video_sparse_attn
gotcha VSA requires a CUDA-compatible GPU and the CUDA Toolkit to be installed. Without them, import will fail.
fix Ensure nvcc is in PATH and torch is CUDA-enabled.
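A quick environment check covering both conditions above; the torch probe is wrapped in try/except since torch may not be installed yet:

```python
import shutil

# nvcc must be discoverable on PATH for the CUDA extension to build/load.
nvcc = shutil.which("nvcc")
print("nvcc:", nvcc if nvcc else "not found - add the CUDA Toolkit bin/ directory to PATH")

# Verify the installed torch build has CUDA support.
try:
    import torch
    print("torch CUDA available:", torch.cuda.is_available())
except ImportError:
    print("torch not installed")
```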
deprecated The 'v0' code paths were removed in release v0.1.2. If you rely on any v0 features, upgrade carefully.
fix Update your code to use the new API, or stay on v0.0.5.
Imports
wrong: from video_sparse_attn import ...
correct: import vsa
Quickstart
import torch
import vsa
# Create query, key, value tensors (batch, heads, seq_len, dim)
q = torch.randn(1, 8, 1024, 64, device='cuda')
k = torch.randn(1, 8, 1024, 64, device='cuda')
v = torch.randn(1, 8, 1024, 64, device='cuda')
# Sparse attention using VSA
output = vsa.video_sparse_attn(q, k, v)
print(output.shape)