RMM - RAPIDS Memory Manager (CUDA 12)
RMM (RAPIDS Memory Manager) provides a C++ library and Python bindings for managing GPU device memory. It offers several memory resources, including pooling allocators, that improve allocation performance and reduce fragmentation in CUDA applications. The `librmm-cu12` package is built for CUDA 12.x environments. It follows the RAPIDS release cadence, with a new version roughly every two months (for example v25.02.00, then v25.04.00).
Common errors
- ModuleNotFoundError: No module named 'rmm._lib'
cause: Importing from the internal `rmm._lib` module, which was removed in RMM v25.02.00.
fix: Remove `from rmm import _lib` and any other direct `_lib` imports. Use public RMM APIs such as `rmm.DeviceBuffer` or `rmm.mr.CudaMemoryResource` instead.
- RuntimeError: RMM memory resource not initialized
cause: Allocating RMM-managed memory (e.g., with `DeviceBuffer` or `cupy`) without first setting a global RMM memory resource.
fix: Set a default RMM memory resource before performing any allocations, for example `rmm.mr.set_current_device_resource(rmm.mr.PoolMemoryResource(rmm.mr.CudaMemoryResource()))` or another resource of your choice. `PoolMemoryResource` takes its upstream resource as the first argument.
- CUDA_ERROR_OUT_OF_MEMORY: out of memory
cause: The GPU ran out of memory, potentially due to large allocations, fragmentation, or an insufficient memory pool size if using `PoolMemoryResource`.
fix: Reduce memory usage, optimize data structures, release buffers that are no longer needed, or increase the `initial_pool_size` and `maximum_pool_size` of your `PoolMemoryResource`. Consider `ManagedMemoryResource` for automatic memory oversubscription (if supported by your GPU and driver).
- RuntimeError: RMM cannot be used with CUDA 11.x. Please upgrade to CUDA 12.x or install an RMM version compatible with CUDA 11.x.
cause: Using `librmm-cu12` (or RMM v25.08.00+) with an older CUDA Toolkit version (e.g., 11.x).
fix: Upgrade your CUDA Toolkit to 12.x and ensure `cupy-cuda12x` and `cuda-python` are correctly installed. Alternatively, downgrade RMM to a version compatible with your CUDA 11.x setup (e.g., `librmm-cu11`).
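The CUDA version mismatch in the last entry can be detected before any allocation is attempted. The following is a minimal sketch, assuming the version arrives as the integer the CUDA runtime reports (encoded as 1000 * major + 10 * minor, for example from `cupy.cuda.runtime.runtimeGetVersion()`). The helper names `decode_cuda_version`, `check_cuda12`, and `check_local_runtime` are hypothetical and not part of RMM.

```python
def decode_cuda_version(encoded: int) -> tuple:
    """Split a CUDA version integer (1000 * major + 10 * minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def check_cuda12(encoded: int) -> None:
    """Raise early if the runtime is older than CUDA 12 (hypothetical helper)."""
    major, minor = decode_cuda_version(encoded)
    if major < 12:
        raise RuntimeError(
            f"CUDA {major}.{minor} detected; librmm-cu12 needs CUDA 12.0 or newer"
        )


def check_local_runtime() -> None:
    """Check the runtime on this machine. Needs CuPy and a CUDA driver."""
    import cupy as cp  # imported lazily so the helpers above work without a GPU

    check_cuda12(cp.cuda.runtime.runtimeGetVersion())
```

Calling `check_local_runtime()` at program start turns a late allocation failure into an immediate, readable error.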
Warnings
- breaking Starting with RMM v25.08.00, `librmm-cu12` (and other RMM packages) explicitly require CUDA 12.0 or newer. Previous RMM versions might have supported CUDA 11.x.
- breaking The internal `rmm._lib` module was removed in v25.02.00. Direct imports from this path will result in `ModuleNotFoundError`.
- breaking RMM v26.02.00 removed host-only memory resources and deprecated certain legacy memory resource interfaces, standardizing on the CCCL interface. Code using `HostMemoryResource` or older `memory_resource` patterns will break.
- deprecated Direct access to RMM's internal logger (`rmm.logger`) was deprecated in v24.12.00 and fully removed/refactored in v25.02.00/v25.04.00 to use `rapids-logger`.
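Imports of the removed `rmm._lib` module can be located ahead of an upgrade by scanning source files. This is a rough sketch using only the standard library `ast` module; the function name `find_removed_rmm_imports` is hypothetical, and the scan covers import statements only, not attribute access such as `rmm.logger`.

```python
import ast


def find_removed_rmm_imports(source: str) -> list:
    """Return sorted (line number, dotted name) pairs for imports under rmm._lib."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # import rmm._lib.something
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # from rmm import _lib / from rmm._lib.x import Y
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            if name == "rmm._lib" or name.startswith("rmm._lib."):
                hits.append((node.lineno, name))
    return sorted(hits)
```

Running it over each file's text (for example `find_removed_rmm_imports(path.read_text())`) gives the lines that need to move to public APIs.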
Install
- pip install librmm-cu12
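After installing, it can help to confirm which distributions are actually present in the environment. The sketch below uses the standard library `importlib.metadata`. The function names are hypothetical, and the second default name, `rmm-cu12`, is an assumption about the distribution that ships the Python `rmm` module; adjust the list to match your environment.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(names=("librmm-cu12", "rmm-cu12")):
    """Print one line per distribution (the default names are assumptions)."""
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```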
Imports
- DeviceBuffer
from rmm import DeviceBuffer
- CudaMemoryResource
from rmm.mr import CudaMemoryResource
- PoolMemoryResource
from rmm.mr import PoolMemoryResource
- set_current_device_resource
from rmm.mr import set_current_device_resource
- rmm
import rmm
- rmm._lib (removed in v25.02.00)
wrong: from rmm import _lib
correct: import rmm
Quickstart
import rmm
from rmm.mr import PoolMemoryResource, set_current_device_resource
from rmm import DeviceBuffer
import cupy as cp
# 1. Configure RMM with a memory resource (e.g., a memory pool)
pool_size_bytes = 2 * 1024**3 # 2 GiB
max_pool_size_bytes = 4 * 1024**3 # 4 GiB
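# Create a PoolMemoryResource on top of a plain CUDA resource.
# PoolMemoryResource takes its upstream resource as the first argument.
# Note: As of v26.02.00, host-only memory resources were removed.
# This example uses a device-backed pool.
mr = PoolMemoryResource(
    rmm.mr.CudaMemoryResource(),
    initial_pool_size=pool_size_bytes,
    maximum_pool_size=max_pool_size_bytes,
)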
set_current_device_resource(mr)
print(f"Current RMM memory resource set to: {rmm.mr.get_current_device_resource()}")
# 2. Allocate device memory directly with RMM
db = DeviceBuffer(size=1024) # size is in bytes; DeviceBuffer takes no dtype argument
print(f"Allocated DeviceBuffer of size {db.size} bytes: {db}")
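# 3. Use CuPy with RMM. CuPy does not pick up RMM on its own;
#    its allocator has to be pointed at RMM explicitly.
from rmm.allocators.cupy import rmm_cupy_allocator
cp.cuda.set_allocator(rmm_cupy_allocator)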
a = cp.arange(10**6, dtype=cp.float32)
b = a * 2
print(f"CuPy array created using RMM: {a.shape}, dtype={a.dtype}")
# Clean up (optional, as RMM resources are typically global and managed)
del db
del a, b
# Note: The pool releases its memory once nothing references it any more
# (RMM keeps its own reference to the current device resource) or when the program exits.
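The fixed 2 GiB and 4 GiB figures above can instead be derived from the memory that is actually free on the device. This is a minimal sketch under stated assumptions: `free_bytes` is supplied by the caller (for instance the first element of `cupy.cuda.runtime.memGetInfo()`), sizes are rounded down to a multiple of 256 bytes as a conservative alignment choice, and the names `pool_sizes` and `make_pool` are hypothetical.

```python
def pool_sizes(free_bytes, initial_percent=50, max_percent=90, alignment=256):
    """Return (initial, maximum) pool sizes in bytes, rounded down to `alignment`."""
    if not 0 < initial_percent <= max_percent <= 100:
        raise ValueError("expected 0 < initial_percent <= max_percent <= 100")

    def align_down(n):
        return (n // alignment) * alignment

    # Integer arithmetic avoids floating-point rounding on large byte counts.
    initial = align_down(free_bytes * initial_percent // 100)
    maximum = align_down(free_bytes * max_percent // 100)
    return initial, maximum


def make_pool(free_bytes):
    """Build and install a pool sized from `free_bytes` (needs RMM and a GPU)."""
    import rmm  # imported lazily so pool_sizes works without a GPU

    initial, maximum = pool_sizes(free_bytes)
    pool = rmm.mr.PoolMemoryResource(
        rmm.mr.CudaMemoryResource(),
        initial_pool_size=initial,
        maximum_pool_size=maximum,
    )
    rmm.mr.set_current_device_resource(pool)
    return pool
```

Leaving headroom below 100 percent keeps some device memory available for allocations made outside RMM, such as the CUDA context itself.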