UMAP (Uniform Manifold Approximation and Projection)

0.5.12 · active · verified Thu Apr 09

UMAP (Uniform Manifold Approximation and Projection) is a general-purpose manifold learning and dimensionality reduction algorithm. It builds a weighted neighbor graph of the data in its original high-dimensional space, then searches for a low-dimensional projection whose fuzzy topological structure is as close as possible to that graph. The current version is 0.5.12; releases consist of frequent patch releases and minor updates.

Warnings

Install

Imports

Quickstart

This quickstart uses `umap-learn` (imported as `umap`) to embed a synthetic dataset in two dimensions. It covers generating data, initializing the `UMAP` reducer with common parameters, and running fit and transform in a single call.

import umap
from sklearn.datasets import make_blobs

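# 1. Generate sample data: 500 points in 10 dimensions, grouped into 4 clusters.
# n_features is set explicitly because make_blobs defaults to 2 features,
# which would leave nothing to reduce.
X, y = make_blobs(n_samples=500, n_features=10, centers=4, cluster_std=1.0, random_state=42)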

# 2. Initialize UMAP reducer
# n_neighbors: Balances local vs. global structure. Smaller values emphasize local detail; larger values preserve more global structure.
# min_dist: Minimum distance between points in the embedding. Smaller values give tighter, denser clusters.
# n_components: Dimensionality of the output embedding.
# random_state: Seed for reproducible results (fixing a seed makes UMAP run single-threaded).
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=42)

# 3. Fit and transform the data
embedding = reducer.fit_transform(X)

# The 'embedding' now contains the 2D projection of the original data
print(f"Original data shape: {X.shape}")
print(f"UMAP embedding shape: {embedding.shape}")
# print(embedding[:5]) # Display first 5 embedded points
