harmonypy

2.0.0 verified Fri May 01 auth: no python

A Python port of the Harmony algorithm for batch correction of single-cell RNA-seq data, featuring a C++ Armadillo backend for high performance. Current version 2.0.0 is a complete rewrite that matches the R harmony2 package with ~10x speed improvement over v0.1.0. Pre-built wheels are available for Linux (x86_64, aarch64) and macOS. Releases are intermittent, with major rewrites at v0.2.0 (PyTorch) and v2.0.0 (C++ Armadillo).

pip install harmonypy
error ModuleNotFoundError: No module named 'harmonypy'
cause The package is not installed or the Python environment is incorrect.
fix
Run pip install harmonypy==2.0.0 and ensure you are using Python >=3.9.
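To tell the two causes apart, it can help to ask the interpreter that raises the error which executable it is and whether it can see the distribution at all. The sketch below uses only the standard library; `installed_version` is a hypothetical helper name, not part of harmonypy.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# The interpreter actually running this code; pip must install into the same one.
print("interpreter:", sys.executable)
print("python:", ".".join(map(str, sys.version_info[:3])))
print("harmonypy:", installed_version("harmonypy") or "not installed")
```

If the interpreter path is not the one you expect, running `python -m pip install harmonypy==2.0.0` with that same interpreter installs into the matching environment.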
error AttributeError: module 'harmonypy' has no attribute 'run_harmony'
cause You are using an older version of harmonypy (v0.1.0 or v0.2.0) where the function was located under a different import path or not available.
fix
Upgrade to v2.0.0: pip install "harmonypy>=2" (quote the requirement so the shell does not treat > as an output redirect). Then use from harmonypy import run_harmony.
error harmonypy.harmony.Harmony object has no attribute 'Z_corr'
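cause The code and the installed version disagree on the API: either the result attribute has a different name in the version you are running (e.g., `Z_corrected` rather than `Z_corr`), or the object is not the result class the code expects.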
fix
In v2.0.0, the HarmonyResult object has a Z_corr attribute. Refer to the README for current API.
error ValueError: The number of cells in meta does not match the number of rows in Z
cause The `meta` array or DataFrame does not have the same length as the number of cells in the PCA matrix.
fix
Verify that meta has exactly the same number of rows as Z (cells).
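A small pre-flight check makes this mismatch visible before the call into harmonypy. This is a minimal sketch, assuming Z is laid out as (cells, PCs); `check_cells_match` is a hypothetical helper, not a harmonypy function.

```python
def check_cells_match(Z, meta):
    """Raise ValueError unless Z and meta describe the same number of cells.

    Assumes Z is laid out as (cells, PCs), so len(Z) is the cell count.
    len() works for NumPy arrays, pandas DataFrames, and plain sequences.
    """
    n_cells, n_meta = len(Z), len(meta)
    if n_cells != n_meta:
        raise ValueError(
            f"Z has {n_cells} rows (cells) but meta has {n_meta} rows; "
            "subset or reorder meta so it matches Z one-to-one"
        )
    return n_cells


Z = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]  # 3 cells, 2 PCs
meta = ["a", "a", "b"]                     # one batch label per cell
print(check_cells_match(Z, meta))
```

A common source of the mismatch is filtering cells after computing the PCA but before subsetting the metadata, so running the check right before the correction step catches it early.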
breaking v2.0.0 is a complete rewrite with a C++ Armadillo backend and nanobind. It does not support PyTorch (v0.2.0) or the pure Python/NumPy path (v0.1.0). All existing code using older versions must be updated.
fix Update imports: `from harmonypy import run_harmony`. The API is similar but may have minor differences; refer to the README.
deprecated v0.2.0 (PyTorch backend) is no longer maintained. Pre-built wheels are only available for v2.0.0 and later.
fix Upgrade to v2.0.0: `pip install "harmonypy>=2"` (the quotes keep the shell from treating `>` as a redirect).
gotcha random_state parameter: In v2.0.0, it is passed to run_harmony() as a keyword argument. Older versions used a different mechanism or ignored it.
fix Set random_state when calling run_harmony to ensure reproducibility.
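To confirm that the seed makes a run repeatable in your environment, the same call can be made twice and the results compared. This is a sketch: `is_reproducible` is a hypothetical helper, and it assumes only what this page states, namely that `run_harmony` accepts `random_state` as a keyword and returns an object with a `Z_corr` attribute.

```python
import numpy as np


def is_reproducible(run, *args, seed=0, **kwargs):
    """Call `run` twice with the same random_state and compare Z_corr."""
    first = np.asarray(run(*args, random_state=seed, **kwargs).Z_corr)
    second = np.asarray(run(*args, random_state=seed, **kwargs).Z_corr)
    # Shapes must agree before an element-wise comparison is meaningful.
    return first.shape == second.shape and bool(np.allclose(first, second))
```

A call would look like `is_reproducible(harmonypy.run_harmony, Z, meta, ['batch'])`, using the inputs from your own pipeline.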
gotcha The output `ho.Z_corr` in v2.0.0 is a NumPy array; in v0.1.0 it was a list of lists. Check array dimensions: (cells, PCs).
fix Use `np.array(ho.Z_corr)` if needed for compatibility.
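When the same downstream code has to accept results from more than one harmonypy version, it can help to coerce the output once and check its shape. This is a minimal sketch, assuming the (cells, PCs) layout as the target; `as_cells_by_pcs` is a hypothetical helper, and the transpose branch is a defensive assumption for results that arrive as (PCs, cells), not documented harmonypy behavior.

```python
import numpy as np


def as_cells_by_pcs(z_corr, n_cells):
    """Return z_corr as a 2-D float array with one row per cell."""
    arr = np.asarray(z_corr, dtype=float)  # also accepts a list of lists
    if arr.ndim != 2:
        raise ValueError(f"expected a 2-D embedding, got shape {arr.shape}")
    if arr.shape[0] == n_cells:
        return arr
    if arr.shape[1] == n_cells:
        # Stored as (PCs, cells): transpose so that rows are cells.
        return arr.T
    raise ValueError(f"no axis of shape {arr.shape} has length {n_cells}")
```

Passing the known cell count explicitly avoids guessing the orientation from the shape alone. If the number of cells equals the number of PCs, the input is returned as-is.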

Basic Harmony batch correction on simulated PCA embeddings.

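import numpy as np
import pandas as pd
import harmonypy

# Simulate data
np.random.seed(0)
Z = np.random.randn(500, 20)  # 500 cells, 20 PCs
# vars_use=['batch'] refers to a column by name, so meta needs a 'batch' column
meta = pd.DataFrame({'batch': ['A'] * 250 + ['B'] * 250})  # two batches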

# Run harmony
ho = harmonypy.run_harmony(Z, meta, ['batch'], max_iter_harmony=10, random_state=0)
corrected = ho.Z_corr
print(corrected.shape)