Faiss for CUDA 12 (faiss-gpu-cu12)
faiss-gpu-cu12 is a Python library for efficient similarity search and clustering of dense vectors on NVIDIA GPUs with CUDA 12. Its pre-built wheels dynamically link against the CUDA Runtime and cuBLAS packages published on PyPI, so no local CUDA installation is needed. The package is an unofficial, community-maintained build; it targets CUDA 12.1 and relies on CUDA minor-version compatibility. The current version is 1.14.1.post1, and new releases follow updates to the underlying Faiss library and to CUDA.
Warnings
- breaking NVIDIA Driver and GPU Architecture Compatibility: This package requires a CUDA-compatible NVIDIA driver and a GPU with Compute Capability 7.0–8.9 (Volta to Ada Lovelace). Older GPUs or incompatible drivers will prevent Faiss from leveraging GPU acceleration.
- gotcha CUDA Version Conflicts with Other Libraries: When integrating `faiss-gpu-cu12` with other CUDA-dependent libraries like PyTorch or TensorFlow in the same environment, ensure they are all linked to the same CUDA 12.x version to avoid runtime conflicts and errors. Different CUDA minor versions may introduce incompatibilities.
- gotcha Unofficial Project Status: The `faiss-gpu-cu12` package is an unofficial, community-maintained project, not directly from Facebook AI Research. This implies potential limitations in comprehensive testing across all NVIDIA GPU architectures and varying levels of support compared to the official Faiss repository.
- breaking Missing sm_90 (Hopper/H100) Kernels for Version 1.13.2: Specifically, `faiss-gpu-cu12==1.13.2` is known to be missing `sm_90` CUDA kernels, causing runtime failures on NVIDIA H100/Hopper GPUs, despite its PyPI description claiming support for compute capability up to 9.0. It only contains `sm_70` and `sm_80` kernels.
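The compatibility constraints above can be turned into a small pre-flight check. The sketch below is illustrative only: the helper names (`capability_supported`, `installed_faiss_gpu_version`, `hopper_kernel_risk`) are hypothetical, the bounds are copied from the warnings above, and the compute capability is assumed to be supplied by the caller (for example from `torch.cuda.get_device_capability()` if PyTorch is installed).

```python
from importlib import metadata
from typing import Optional

# Range quoted in the warnings above (Volta through Ada Lovelace).
MIN_CC = (7, 0)
MAX_CC = (8, 9)


def capability_supported(major: int, minor: int) -> bool:
    """Return True if (major, minor) lies inside the documented range."""
    return MIN_CC <= (major, minor) <= MAX_CC


def installed_faiss_gpu_version() -> Optional[str]:
    """Return the installed faiss-gpu-cu12 version, or None if it is absent."""
    try:
        return metadata.version("faiss-gpu-cu12")
    except metadata.PackageNotFoundError:
        return None


def hopper_kernel_risk(version: Optional[str], major: int) -> bool:
    """Flag the combination described above: release 1.13.2 on a 9.x GPU."""
    return version == "1.13.2" and major == 9


if __name__ == "__main__":
    # (8, 6) is a placeholder; substitute the capability of your own GPU.
    print("capability in range:", capability_supported(8, 6))
    print("installed version:", installed_faiss_gpu_version())
```

Keeping the version lookup separate from the two pure checks means the checks can be exercised on a machine that has neither a GPU nor the package installed.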
Install
- pip install faiss-gpu-cu12
- pip install 'faiss-gpu-cu12[fix-cuda]'
Imports
- faiss
import faiss
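After installing, it can be useful to confirm that the module imports and that it can see a GPU. The following is a minimal sketch, not an official diagnostic: `describe_faiss` is a hypothetical helper, and it assumes `faiss.get_num_gpus` may be missing in builds without GPU support, so the lookup is done defensively.

```python
def describe_faiss():
    """Report whether faiss imports and how many GPUs it can see."""
    try:
        import faiss
    except ImportError:
        return {"installed": False, "version": None, "num_gpus": 0}

    # get_num_gpus may be absent in builds without GPU support,
    # and may raise if the driver cannot be initialised.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    try:
        num_gpus = int(get_num_gpus()) if callable(get_num_gpus) else 0
    except Exception:
        num_gpus = 0

    return {
        "installed": True,
        "version": getattr(faiss, "__version__", None),
        "num_gpus": num_gpus,
    }


info = describe_faiss()
print(info)
```

A `num_gpus` of 0 with `installed` set to True suggests the import works but no usable GPU or driver was detected.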
Quickstart
import faiss
import numpy as np
# 1. Define dataset parameters
d = 128 # dimension
nb = 100000 # database size
nq = 10 # number of queries
# 2. Generate random data
np.random.seed(1234)
xb = np.random.random((nb, d)).astype('float32')
xq = np.random.random((nq, d)).astype('float32')
# Ensure the data is C-contiguous as Faiss often expects it
xb = np.ascontiguousarray(xb)
xq = np.ascontiguousarray(xq)
# 3. Build a CPU index (e.g., L2 distance)
index_cpu = faiss.IndexFlatL2(d)
print(f"Is CPU index trained? {index_cpu.is_trained}")
# 4. Add vectors to the CPU index
index_cpu.add(xb)
print(f"Number of vectors in CPU index: {index_cpu.ntotal}")
# 5. Attempt to move the index to GPU
try:
    # Faiss GPU indices require StandardGpuResources
    res = faiss.StandardGpuResources()
    # 0 for the first GPU; change if you have multiple and want a different one
    index_gpu = faiss.index_cpu_to_gpu(res, 0, index_cpu)
    print(f"\nIndex successfully moved to GPU. Number of vectors: {index_gpu.ntotal}")
    # 6. Perform search on GPU
    k = 4  # We want to find 4 nearest neighbors
    D, I = index_gpu.search(xq, k)  # D for distances, I for indices
    print("\nGPU Search - Distances (D):")
    print(D)
    print("\nGPU Search - Indices (I):")
    print(I)
except Exception as e:
    print(f"\nCould not run GPU operations. This might happen if no compatible GPU or drivers are found, or if it's a CPU-only environment. Error: {e}")
    print("Falling back to CPU search for demonstration:")
    k = 4
    D_cpu, I_cpu = index_cpu.search(xq, k)
    print("\nCPU Search - Distances (D):")
    print(D_cpu)
    print("\nCPU Search - Indices (I):")
    print(I_cpu)
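For intuition about what the search returns: `IndexFlatL2` performs an exhaustive search and reports squared Euclidean distances. The NumPy sketch below reproduces that computation for small inputs. `brute_force_l2` is a hypothetical helper, not a Faiss API, and it is meant only as a reference for checking results on toy data; it materialises the full query-by-database distance matrix, so it does not scale to large datasets.

```python
import numpy as np


def brute_force_l2(xb, xq, k):
    """Return (distances, indices) of the k nearest database rows per query.

    Distances are squared L2, computed as ||q||^2 - 2 q.b + ||b||^2.
    """
    xb = np.asarray(xb, dtype="float32")
    xq = np.asarray(xq, dtype="float32")
    sq = (xq ** 2).sum(axis=1, keepdims=True) - 2.0 * (xq @ xb.T) + (xb ** 2).sum(axis=1)
    # Sort each row by distance and keep the k smallest entries.
    idx = np.argsort(sq, axis=1)[:, :k]
    dist = np.take_along_axis(sq, idx, axis=1)
    return dist, idx


# Toy example: three 2-D database points and one query at the origin.
toy_xb = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]], dtype="float32")
toy_xq = np.array([[0.0, 0.0]], dtype="float32")
toy_D, toy_I = brute_force_l2(toy_xb, toy_xq, k=2)
print(toy_D, toy_I)
```

When comparing against the `D` and `I` arrays from the quickstart, expect the indices to match on well-separated data, while distances may differ slightly in the low-order digits because of floating-point accumulation order.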