RMM - RAPIDS Memory Manager for CUDA 12

26.4.0 · active · verified Thu Apr 16

RMM (RAPIDS Memory Manager) is a C++ and Python library for efficient GPU memory management. It provides a highly optimized allocation and deallocation framework tailored for NVIDIA GPUs, often used within the RAPIDS ecosystem to improve performance of data science workloads. The current version is 26.4.0, and it generally follows a monthly release cadence.
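The performance win comes largely from pooling: a pool resource grabs one large block from an upstream allocator up front, then serves sub-allocations out of it, so most requests never hit the (expensive) driver-level allocate/free path. The idea can be sketched in pure Python; this is a deliberately simplified illustration, not RMM's actual implementation, and `BumpPool`/`fake_cuda_malloc` are hypothetical names:

```python
# Simplified illustration of pool-style allocation (hypothetical BumpPool;
# RMM's real pool is a C++ coalescing suballocator, not a bump allocator).
class BumpPool:
    def __init__(self, upstream_alloc, pool_size):
        # One large upfront allocation from the "upstream" allocator.
        self.base = upstream_alloc(pool_size)
        self.size = pool_size
        self.offset = 0

    def allocate(self, nbytes):
        # Serve requests by bumping an offset -- no upstream call needed.
        if self.offset + nbytes > self.size:
            raise MemoryError("pool exhausted")
        ptr = self.offset
        self.offset += nbytes
        return ptr

upstream_calls = []
def fake_cuda_malloc(nbytes):
    upstream_calls.append(nbytes)  # stand-in for an expensive cudaMalloc
    return 0

pool = BumpPool(fake_cuda_malloc, 1024)
offsets = [pool.allocate(256) for _ in range(3)]
print(offsets)              # [0, 256, 512]
print(len(upstream_calls))  # 1: only the initial pool allocation hit upstream
```

Three allocations cost a single upstream call; freeing and reusing blocks (which RMM's pool also does) follows the same principle of staying inside the pre-reserved region.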

Quickstart

This quickstart demonstrates how to initialize a `PoolMemoryResource` and set it as the current RMM device memory resource. It then allocates a `DeviceBuffer` using this configured resource.

import rmm
from rmm.mr import PoolMemoryResource, CudaMemoryResource, set_current_device_resource

# Create an upstream resource (e.g., CudaMemoryResource) for the pool
upstream = CudaMemoryResource()

# Create a PoolMemoryResource with an initial size and an optional maximum size
initial_pool_size = 128 * 1024 * 1024  # 128 MiB
maximum_pool_size = 1024 * 1024 * 1024  # 1 GiB

pool_mr = PoolMemoryResource(
    upstream,  # pass the upstream resource positionally; the keyword name varies across RMM versions
    initial_pool_size=initial_pool_size,
    maximum_pool_size=maximum_pool_size
)

# Set the default RMM memory resource for the current device
set_current_device_resource(pool_mr)

print(f"RMM current device resource: {rmm.mr.get_current_device_resource()}")

# Allocate a DeviceBuffer using the default RMM memory resource
# This buffer resides on the GPU
buffer_size = 64 * 1024 * 1024 # 64 MiB
device_buffer = rmm.DeviceBuffer(size=buffer_size)

print(f"Successfully allocated rmm.DeviceBuffer of {device_buffer.size / (1024*1024):.2f} MiB on GPU.")

# Memory is automatically freed when device_buffer goes out of scope or program exits.
