NVIDIA GPU Tools
nvgpu is a Python library for interacting with NVIDIA GPUs: it can list GPUs, retrieve detailed per-GPU information, and monitor their status. It acts as a user-friendly wrapper around the lower-level pynvml library. The current version is 0.10.0, and the project sees several releases per year.
Common errors
- `ModuleNotFoundError: No module named 'nvgpu'`
  - Cause: The `nvgpu` library is not installed in the current Python environment.
  - Fix: Install the package with pip: `pip install nvgpu`.
- `pynvml.nvml.NVMLError_DriverNotLoaded: Driver Not Loaded`
  - Cause: The NVIDIA driver is not loaded, or the NVIDIA Management Library (NVML) cannot be accessed. This typically means no NVIDIA GPU is detected, the drivers are missing or corrupted, or the `nvidia-smi` utility is not functional.
  - Fix: Ensure NVIDIA drivers are correctly installed and up to date for your GPU, and verify that `nvidia-smi` works from your terminal. Rebooting the system can sometimes resolve driver issues.
- `NVMLError: NVML Shared Library Not Found`
  - Cause: The NVML shared library (e.g., `libnvidia-ml.so` on Linux, `nvml.dll` on Windows) cannot be located by `pynvml`. This often happens in environments without a proper `LD_LIBRARY_PATH` configuration or with partially installed drivers.
  - Fix: Confirm NVIDIA drivers are fully installed, and ensure the directory containing the NVML shared library is on your system's library path. For Docker, ensure the container has access to the GPU devices and drivers (e.g., using `--gpus all` or a proper NVIDIA Container Toolkit setup).
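When these NVML errors are a possibility, it can help to probe NVML directly before calling nvgpu. A minimal sketch using `pynvml` (nvgpu's own backend); the helper name `nvml_available` is hypothetical, not part of either library:

```python
try:
    import pynvml  # nvgpu's backend; may be absent in a fresh environment
except ImportError:
    pynvml = None

def nvml_available():
    """Return True only if NVML can be initialized (driver loaded, library found)."""
    if pynvml is None:
        return False
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # Covers DriverNotLoaded, LibraryNotFound, etc.
        return False
    pynvml.nvmlShutdown()
    return True

print(nvml_available())
```

On a machine without a working driver this prints `False` instead of raising, which makes it suitable as a guard before any nvgpu call.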
Warnings
- Gotcha: nvgpu relies on the NVIDIA Management Library (NVML), which requires NVIDIA GPU drivers to be correctly installed and the `nvidia-smi` utility to be functional. If drivers are missing, corrupted, or not properly initialized, nvgpu functions will fail with NVML errors.
- Gotcha: nvgpu pins its `pynvml` dependency to a specific version (e.g., `pynvml==11.5.0` for nvgpu 0.10.0). Installing an incompatible or significantly different `pynvml` version in the same Python environment can lead to conflicts or unexpected behavior, even if `pip` resolves the primary dependency correctly.
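The version-pin gotcha above can be checked at runtime. A sketch using the standard library's `importlib.metadata`; the pinned version `11.5.0` is the example value cited above, so verify it against your installed nvgpu release:

```python
from importlib import metadata

def installed_version(package):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

# Assumed pin for nvgpu 0.10.0, per the warning above
pinned = "11.5.0"
found = installed_version("pynvml")
if found is not None and found != pinned:
    print(f"Warning: pynvml {found} installed, but nvgpu expects {pinned}")
```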
Install
- `pip install nvgpu`
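After installing, you can confirm the package is visible to the current interpreter without importing it (and thus without triggering NVML initialization). A small sketch; `is_installed` is a hypothetical helper:

```python
import importlib.util

def is_installed(name):
    """Check whether a module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None

if not is_installed("nvgpu"):
    print("nvgpu missing: run 'pip install nvgpu'")
```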
Imports
- nvgpu
  `import nvgpu`
- gpu_info
  `import nvgpu; gpus = nvgpu.gpu_info()` (returns a list of dicts, one per GPU)
- available_gpus
  `import nvgpu; free = nvgpu.available_gpus()` (returns indices of GPUs with low utilization)
Quickstart
import nvgpu
import os

# Basic pre-check; actual NVML errors will still occur if the driver setup is broken
if os.path.exists('/dev/nvidia0') or os.environ.get('CUDA_VISIBLE_DEVICES') is not None:
    try:
        # gpu_info() returns a list of dicts, one per GPU
        gpus = nvgpu.gpu_info()
        if gpus:
            print("Detected GPUs:")
            for i, gpu in enumerate(gpus):
                print(f"  GPU {i}:")
                for key, value in gpu.items():
                    print(f"    {key}: {value}")
        else:
            print("No NVIDIA GPUs detected or NVML could not be initialized.")
    except Exception as e:
        print(f"An error occurred: {e}")
        print("Please ensure NVIDIA drivers are installed and nvidia-smi works.")
else:
    print("No NVIDIA GPUs detected or environment not configured for GPUs. Skipping nvgpu operations.")
    print("Ensure NVIDIA drivers are installed and CUDA_VISIBLE_DEVICES is set if in a restricted environment.")
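Once the GPU records are retrieved, they can be post-processed without any GPU present. A sketch that summarizes memory usage from `gpu_info()`-style records; the record shape (keys `index`, `mem_used`, `mem_total`) is an assumption based on nvgpu's typical output, and the `sample` data below is fabricated for illustration:

```python
def summarize(gpus):
    """Format one memory-usage line per GPU record (assumed key names)."""
    lines = []
    for gpu in gpus:
        pct = 100.0 * gpu["mem_used"] / gpu["mem_total"]
        lines.append(f"GPU {gpu['index']}: {pct:.1f}% memory used")
    return lines

# Fabricated sample records mimicking nvgpu.gpu_info() output
sample = [
    {"index": "0", "mem_used": 2048, "mem_total": 8192},
    {"index": "1", "mem_used": 1024, "mem_total": 8192},
]
print(summarize(sample))
# → ['GPU 0: 25.0% memory used', 'GPU 1: 12.5% memory used']
```

Replacing `sample` with a real `nvgpu.gpu_info()` result gives a compact status line per device, which is handy for logging in training scripts.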