Python bindings for NVSHMEM

0.3.0 · active · verified Sat Apr 11

NVSHMEM4Py is the official Python language binding for NVSHMEM, a high-performance parallel programming interface based on OpenSHMEM. It provides a Pythonic interface to NVSHMEM's functionality, so applications can use the Partitioned Global Address Space (PGAS) programming model for efficient multi-GPU and multi-node communication. Key features include integration with NumPy, CuPy, and PyTorch; symmetric memory management; one-sided communication operations (put/get and atomics); collectives; and synchronization primitives. The `nvshmem4py-cu13` package specifically targets CUDA 13.x. The project is actively released; the latest version, 0.3.0, came out in March 2026.
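
To make the PGAS idea concrete, here is a toy, single-process model of symmetric memory and one-sided put/get. It is only a sketch of the concept: `ToyPE`, `toy_put`, and `toy_get` are hypothetical names invented for illustration and are not part of the NVSHMEM4Py API, and no GPU or network communication is involved.

```python
# Toy, single-process model of the PGAS idea. This is NOT the NVSHMEM4Py API;
# every name below is hypothetical and exists only for illustration.

class ToyPE:
    """One simulated processing element with its own copy of a symmetric buffer."""

    def __init__(self, pe_id, size):
        self.pe_id = pe_id
        # "Symmetric" means the buffer has the same size and layout on every PE.
        self.symmetric = [0] * size


def toy_put(pes, target_pe, offset, values):
    """One-sided write: the caller updates the target's buffer; the target takes no action."""
    pes[target_pe].symmetric[offset:offset + len(values)] = values


def toy_get(pes, source_pe, offset, count):
    """One-sided read: copy elements out of another PE's symmetric buffer."""
    return list(pes[source_pe].symmetric[offset:offset + count])


pes = [ToyPE(pe_id, size=4) for pe_id in range(2)]
toy_put(pes, target_pe=1, offset=1, values=[7, 8])   # "PE 0" writes into PE 1's buffer
fetched = toy_get(pes, source_pe=1, offset=0, count=4)
print(fetched)  # [0, 7, 8, 0]
```

The point of the model is that the writer names the remote PE and an offset into a buffer that exists identically everywhere, and the remote side does not post a matching receive.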

Quickstart

This quickstart demonstrates basic initialization, querying the Processing Element (PE) ID and total number of PEs, and finalization of the NVSHMEM environment. NVSHMEM is a multi-process library, so applications typically need to be launched with an MPI runner (e.g., `mpirun`).

import nvshmem.core as nvshmem

def main():
    # Initialize NVSHMEM environment
    if not nvshmem.is_initialized():
        nvshmem.init()

    # Query current Processing Element (PE) ID and total number of PEs
    my_pe = nvshmem.my_pe()
    n_pes = nvshmem.n_pes()

    print(f"Hello from PE {my_pe} of {n_pes}!")

    # Finalize NVSHMEM environment
    nvshmem.finalize()

if __name__ == "__main__":
    # This example must be launched with an MPI runner, e.g.:
    # mpirun -np 2 python your_script_name.py
    main()
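
Once each process knows its PE ID and the total PE count, a common next step is to derive which peers it will talk to. The sketch below computes ring neighbors from those two integers. `ring_neighbors` is a hypothetical helper written in plain Python, not a function provided by NVSHMEM4Py, and the loop simulates four PEs in one process instead of querying a real runtime.

```python
def ring_neighbors(my_pe, n_pes):
    """Return (left, right) PE IDs for a ring over n_pes processing elements.

    Hypothetical helper for illustration; in a real program the arguments
    would be the values reported by the runtime for the current process.
    """
    if n_pes <= 0 or not 0 <= my_pe < n_pes:
        raise ValueError("expected 0 <= my_pe < n_pes")
    # Modular arithmetic wraps the first and last PE around to each other.
    return (my_pe - 1) % n_pes, (my_pe + 1) % n_pes


# Simulate what each of 4 PEs would compute.
for pe in range(4):
    left, right = ring_neighbors(pe, 4)
    print(f"PE {pe}: left={left}, right={right}")
```

With a single PE, both neighbors are the PE itself, so a program built on this helper still runs when launched with `-np 1`.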
