RAPIDS cuML

26.4.0 · active · verified Thu Apr 16

RAPIDS cuML (CUDA-accelerated Machine Learning) is a suite of GPU-accelerated machine learning algorithms whose Python API closely mirrors scikit-learn's, so most estimators can move from CPU to GPU with few code changes. It is part of the broader RAPIDS ecosystem for data science and is optimized for CUDA 12. The current version is 26.4.0; cuML follows the RAPIDS project's date-based versioning and release schedule, which ships roughly every two months.
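Because the estimator interface mirrors scikit-learn's, one common pattern is to prefer cuML when it is importable and fall back to scikit-learn otherwise. The following is a minimal sketch of that idea, not an official cuML mechanism: `load_kmeans` and `demo` are hypothetical names introduced here, and the sketch assumes that both backends expose `KMeans` with `n_clusters`, `random_state`, `fit`, and `labels_`, and that both accept NumPy input.

```python
import importlib


def load_kmeans(backends=("cuml.cluster", "sklearn.cluster")):
    """Return (KMeans class, module name) from the first importable backend.

    `load_kmeans` is a hypothetical helper for illustration, not a cuML API.
    """
    for name in backends:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Not installed, or (for cuML) no usable CUDA setup at import time.
            continue
        return module.KMeans, name
    return None, None


def demo():
    KMeans, backend = load_kmeans()
    if KMeans is None:
        print("Neither cuML nor scikit-learn is installed.")
        return
    import numpy as np  # installed alongside either backend

    # Two well-separated groups of points, passed as a NumPy array.
    X = np.array(
        [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
         [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]],
        dtype=np.float32,
    )
    model = KMeans(n_clusters=2, random_state=0).fit(X)
    print(f"{backend}: labels = {model.labels_}")


if __name__ == "__main__":
    demo()
```

The calling code is identical for both backends; only the import differs. Which numeric label each cluster receives can differ between the two, so compare groupings rather than raw label values.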

Common errors

Warnings

Install

Imports

Quickstart

This example runs k-means clustering with cuML. It generates synthetic data on the CPU with scikit-learn, copies it into a cuDF DataFrame on the GPU, fits a KMeans model, and predicts a cluster label for each row. Besides `cuml`, it requires `cudf` and `scikit-learn`.

import cudf
from cuml.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate synthetic data on the CPU (a NumPy array of shape (1000, 10))
X, _ = make_blobs(n_samples=1000, n_features=10, centers=5, random_state=42)

# Copy the data to the GPU as a cuDF DataFrame
X_gdf = cudf.DataFrame(X)

# Initialize and fit a cuML KMeans model
kmeans = KMeans(n_clusters=5, random_state=42)
kmeans.fit(X_gdf)

# Predict cluster labels
labels = kmeans.predict(X_gdf)

print("Cluster labels (first 5):\n", labels.head())
print("Cluster centers (first 5 rows):\n", kmeans.cluster_centers_.head())
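
The `.head()` calls above work because cuML returns cuDF objects when it is given cuDF input, so the results stay on the GPU. To hand them to CPU-only tools, they must first be copied to host memory. The sketch below shows one way to do that: `to_host` is a hypothetical helper, not part of cuML or cuDF, and it relies only on `to_pandas()` for cuDF objects and `get()` for CuPy arrays.

```python
def to_host(obj):
    """Return a CPU-side copy of a GPU object, or the object unchanged.

    `to_host` is a hypothetical helper for illustration only.
    """
    # Identify the library by the top-level package of the object's type.
    root = type(obj).__module__.split(".")[0]
    if root == "cudf":
        return obj.to_pandas()  # cuDF DataFrame/Series -> pandas
    if root == "cupy":
        return obj.get()        # CuPy ndarray -> NumPy ndarray
    return obj                  # assumed to be on the host already
```

With the quickstart's variables, `to_host(labels)` would give a pandas Series and `to_host(kmeans.cluster_centers_)` a pandas DataFrame.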
