{"id":668,"library":"nvidia-nvtx-cu12","title":"NVIDIA Tools Extension (NVTX) Python Binding","description":"NVTX (NVIDIA Tools Extension SDK) is a C-based API with Python wrappers for annotating application code with events, ranges, and resources. These annotations provide contextual information for NVIDIA developer tools like Nsight Systems and Nsight Compute, enabling visual profiling and performance analysis of CPU and GPU activities in Python applications. The `nvidia-nvtx-cu12` package provides bindings specifically for CUDA 12.x environments. It is actively maintained with frequent updates, often tied to CUDA toolkit releases.","status":"active","version":"12.9.79","language":"python","source_language":"en","source_url":"https://github.com/NVIDIA/NVTX","tags":["nvidia","cuda","profiling","nvtx","performance","gpu","developer-tools"],"install":[{"cmd":"pip install nvidia-nvtx-cu12","lang":"bash","label":"Install latest version"}],"dependencies":[{"reason":"Required for NVTX functionality and profiling with NVIDIA tools like Nsight Systems.","package":"NVIDIA CUDA Toolkit 12.x","optional":false}],"imports":[{"symbol":"nvtx","correct":"import nvtx"}],"quickstart":{"code":"import time\nimport nvtx\n\n@nvtx.annotate(\"my_outer_function\", color=\"blue\")\ndef my_function_to_profile():\n    time.sleep(0.05) # Simulate some work\n    with nvtx.annotate(\"inner_loop_work\", color=\"red\"):\n        for i in range(2):\n            time.sleep(0.02) # More work\n            nvtx.mark(f\"Iteration {i} complete\", color=\"green\")\n\nif __name__ == \"__main__\":\n    print(\"Running annotated code...\")\n    my_function_to_profile()\n    print(\"Code finished. 
To profile this, save it as, e.g., 'demo.py' and run:\\nnsys profile python demo.py\")\n    print(\"Then open the generated .nsys-rep file (.qdrep in older Nsight Systems releases) in NVIDIA Nsight Systems for visualization.\")","lang":"python","description":"This example demonstrates how to use `nvtx.annotate` as a decorator for functions and as a context manager for code blocks, and `nvtx.mark` for instantaneous events. The annotations produce no visible output on their own; they emit profiling data that NVIDIA Nsight Systems can capture and visualize."},"warnings":[{"fix":"Before creating any Pool objects or starting new processes, add: `import multiprocessing; multiprocessing.set_start_method(\"spawn\", force=True)`","message":"When using NVTX with Python's `multiprocessing` module on Linux, the default `fork` start method can interfere with Nsight Systems' ability to inject and collect NVTX traces reliably. It is recommended to explicitly set the start method to `spawn`.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Disable `seccomp` restrictions for the profiled application if possible, or use non-injection based profiling features within Nsight Systems.","message":"Nsight Systems trace features, including NVTX collection via process injection, may fail or cause instability in applications that use `seccomp` to restrict system calls. This can lead to process termination or hung applications.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Ensure all components of your application are compiled and linked against a consistent NVTX and CUDA Toolkit version. Recompile dependent libraries if necessary.","message":"Changes in the underlying NVTX C API between major CUDA Toolkit versions (e.g., CUDA 11.x to 12.x) can lead to compilation issues or runtime incompatibilities for other libraries that directly interface with NVTX's C API. 
While `nvidia-nvtx-cu12` is built for CUDA 12, users integrating multiple components should ensure NVTX version consistency.","severity":"breaking","affected_versions":"Potentially when migrating between CUDA Toolkit major versions (e.g., 11.x to 12.x)"},{"fix":"Use automatic annotation judiciously. For general profiling, prefer manual annotation with `@nvtx.annotate` or `with nvtx.annotate` on critical code sections.","message":"The `nvtx` library offers functionality for automatic annotation of all function calls. However, enabling this feature introduces significant performance overhead (potentially slowing down execution by more than 10x) and should be used cautiously for targeted debugging, not general profiling.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Minimize the number of distinct `nvtx.Domain` objects created. Leverage `category` arguments for detailed event classification within a single domain.","message":"Creating NVTX domains can be a relatively expensive operation. For optimal performance and clearer visualization, it is recommended to create a limited number of domains (e.g., one per major library or subsystem) and use categories for finer-grained grouping of events within those domains.","severity":"gotcha","affected_versions":"All versions"},{"fix":"Install the 'nvtx' Python package. For CUDA-accelerated NVTX, install `nvidia-nvtx-cuXX` (replacing XX with your CUDA major version, e.g., `pip install nvidia-nvtx-cu12`). 
The `nvtx` module itself comes from the separate `nvtx` package (`pip install nvtx`), which does not require a GPU to import.","message":"The `nvtx` Python module is not found, likely because the package has not been installed in the current environment.","severity":"breaking","affected_versions":"All versions"},{"fix":"If pip reports a placeholder project, configure the NVIDIA Python Package Index by installing `nvidia-pyindex`, then retry the installation:\n```\n$ pip install nvidia-pyindex\n$ pip install nvidia-nvtx-cu12\n```","message":"`nvidia-nvtx-cu12` is published on PyPI, so `pip install nvidia-nvtx-cu12` normally works without extra index configuration. A 'placeholder project' error indicates that pip resolved a stub package; this historically affected NVIDIA packages served only from the NVIDIA Python Package Index.","severity":"breaking","affected_versions":"Environments where pip resolves a placeholder instead of a published `nvidia-nvtx-cu12` wheel."}],"env_vars":null,"last_verified":"2026-05-12T17:38:10.148Z","next_check":"2026-06-26T00:00:00.000Z","problems":[{"fix":"Install the Python bindings with `pip install nvtx`; `nvidia-nvtx-cu12` can be installed alongside them with `pip install nvidia-nvtx-cu12`. If you are using a virtual environment, ensure it is activated.","cause":"The `nvtx` Python module is not installed or not accessible in your current Python environment. The `nvidia-nvtx-cu12` wheel is understood to ship the native NVTX libraries and headers rather than the importable `nvtx` module.","error":"ModuleNotFoundError: No module named 'nvtx'"},{"fix":"Verify your CUDA Toolkit installation and ensure the CUDA bin directory (e.g., `C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.4\\bin`) is added to your system's PATH environment variable. On Linux, ensure `LD_LIBRARY_PATH` includes CUDA library paths.","cause":"This error, common on Windows, occurs when Python cannot find a required dynamic link library (DLL) that `nvidia-nvtx-cu12` or its underlying CUDA dependencies rely on. 
This often points to an incorrect or incomplete CUDA Toolkit installation, or missing CUDA binary paths in the system's PATH environment variable.","error":"ImportError: DLL load failed: The specified module could not be found."},{"fix":"Ensure you have a CUDA-enabled build of PyTorch (or your relevant framework) installed, matching your CUDA toolkit version. For PyTorch, install using `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121` (adjust `cu121` for your CUDA version).","cause":"This error typically arises when a framework like PyTorch, which is attempting to use NVTX for profiling, detects that the underlying NVTX libraries or a CUDA-enabled build of the framework itself is not correctly configured or installed.","error":"RuntimeError: NVTX functions not installed. Are you sure you have a CUDA build?"},{"fix":"When installing frameworks like PyTorch, use their recommended installation command, which often specifies the exact `nvidia-*-cu12` dependencies. Alternatively, create a fresh virtual environment and install only the necessary packages, carefully checking their compatibility matrix (e.g., PyTorch's website) for matching CUDA versions.","cause":"This common pip message indicates a version mismatch or conflict between `nvidia-nvtx-cu12` (or other `nvidia-*-cu12` packages) and other installed packages, often a specific version of PyTorch or TensorFlow, which require precise versions of NVIDIA's CUDA-related Python wrappers.","error":"ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
This behaviour is the source of the following dependency conflicts."}],"ecosystem":"pypi","meta_description":null,"install_score":0,"install_tag":"stale","quickstart_score":0,"quickstart_tag":"stale","pypi_latest":"12.9.79","install_checks":{"last_tested":"2026-05-12","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-alpine","python_version":"3.10","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.6,"import_time_s":null,"mem_mb":null,"disk_size":"19M"},{"runtime":"python:3.10-slim","python_version":"3.10","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-alpine","python_version":"3.11","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.7,"import_time_s":null,"mem_mb":null,"disk_size":"21M"},{"runtime":"python:3.11-slim","python_version":"3.11","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-alpine","python_version":"3.12","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.4,"import_time_s":null,"mem_mb":null,"disk_size":"12M"},{"runtime":"python:3.12-slim","python_version":"3.12","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-alpine","python_version":"3.13","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 
","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.5,"import_time_s":null,"mem_mb":null,"disk_size":"12M"},{"runtime":"python:3.13-slim","python_version":"3.13","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":" $EXIT -eq 0 ","exit_code":1,"wheel_type":null,"failure_reason":"build_error","install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-alpine","python_version":"3.9","os_libc":"alpine (musl)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":" $EXIT -eq 0 ","exit_code":0,"wheel_type":"wheel","failure_reason":null,"install_time_s":1.9,"import_time_s":null,"mem_mb":null,"disk_size":"18M"},{"runtime":"python:3.9-slim","python_version":"3.9","os_libc":"slim (glibc)","variant":"default","exit_code":1,"wheel_type":null,"failure_reason":null,"install_time_s":null,"import_time_s":null,"mem_mb":null,"disk_size":null}]},"quickstart_checks":{"last_tested":"2026-04-24","tag":"stale","tag_description":"widespread failures or data too old to trust","results":[{"runtime":"python:3.10-alpine","exit_code":1},{"runtime":"python:3.10-slim","exit_code":1},{"runtime":"python:3.11-alpine","exit_code":1},{"runtime":"python:3.11-slim","exit_code":1},{"runtime":"python:3.12-alpine","exit_code":1},{"runtime":"python:3.12-slim","exit_code":1},{"runtime":"python:3.13-alpine","exit_code":1},{"runtime":"python:3.13-slim","exit_code":1},{"runtime":"python:3.9-alpine","exit_code":1},{"runtime":"python:3.9-slim","exit_code":1}]}}