NVTX Python Bindings
NVTX (NVIDIA Tools Extension Library) is a cross-platform API for annotating source code to provide contextual information to developer tools like NVIDIA Nsight Systems. The `nvtx` Python library provides native Python wrappers for a subset of the NVTX C API, enabling Python developers to mark events and define code ranges for profiling and visualization of CPU and GPU activities. The current Python package version is 0.2.15, with active development tied to the broader NVTX v3.x.x core library releases.
Warnings
- gotcha In `nvtx` versions prior to 0.2.13 (or NVTX core v3.2.2/v3.3.0), decorator ranges might not have ended correctly if an exception was thrown within the decorated function, potentially leading to incomplete or misleading profiles.
- gotcha To disable NVTX annotations and reduce overhead, set the `NVTX_DISABLE` environment variable before launching your application. This is useful for performance-critical runs (for example, warmup passes) or whenever profiling is not desired.
- gotcha When using Python's `multiprocessing` module on Linux with NVTX instrumentation, the default 'fork' start method can lead to issues with Nsight Systems' process injection. It is recommended to explicitly use the 'spawn' start method for correct profiling.
- gotcha Using automatic function annotation (e.g., via command-line interface or `nvtx.Profile` class) can introduce significant overhead (more than 10x) due to annotating every function invocation. Use it cautiously and prefer manual annotation for critical paths.
- gotcha NVTX Domains are computationally expensive to create and should be used sparingly (e.g., one per library). For finer-grained grouping of annotations within a domain, use Categories, which are less expensive.
Install
- pip install nvtx
- conda install -c conda-forge nvtx
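Because one of the gotchas above depends on the package version, it can help to confirm which version was actually installed. A minimal sketch using only the standard library; the helper name `installed_nvtx_version` is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_nvtx_version():
    """Return the installed nvtx version string, or None if it is not installed."""
    try:
        return version("nvtx")
    except PackageNotFoundError:
        return None


print(installed_nvtx_version())
```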
Imports
- nvtx: `import nvtx`
- annotate: `@nvtx.annotate()`
- mark: `nvtx.mark(message="Event")`
Quickstart
    import os
    import time

    import nvtx


    # Define a function to be annotated
    @nvtx.annotate(color="blue")
    def my_function():
        # Environment variables are strings, so convert before passing to range()
        for i in range(int(os.environ.get("NVTX_ITERATIONS", 2))):
            with nvtx.annotate(f"my_loop_iteration_{i}", color="red"):
                time.sleep(0.1)


    if __name__ == "__main__":
        print("Running annotated code...")
        my_function()
        print("Code execution complete. Profile with NVIDIA Nsight Systems.")

    # To profile, run from your terminal:
    #   nsys profile -t nvtx python your_script_name.py
    # Then open the generated report file (.nsys-rep; .qdrep in older
    # Nsight Systems releases) in the Nsight Systems GUI.