Sparse n-dimensional arrays
Sparse is a Python library that provides n-dimensional sparse arrays for the PyData ecosystem, optimized for data in which most entries are zero or share a common 'fill' value. It aims to be a drop-in replacement for NumPy arrays, supports n-dimensional operations, and follows the Array API standard. The current version is 0.18.0, and the project is under active development with frequent releases.
Warnings
- breaking Passing a sparse array to a NumPy function that would implicitly densify it now raises a RuntimeError. Convert explicitly with `.todense()` instead.
- gotcha The `*` operator performs element-wise multiplication, while `@` performs matrix multiplication (dot product).
- gotcha Element-wise operations on `sparse` arrays, including functions like `np.where`, propagate the `fill_value` (0 by default). Results can be unexpected if your data or mask implicitly assumes a non-zero `fill_value`.
- gotcha Directly applying general NumPy functions to `sparse` arrays without explicit conversion or a sparse-aware implementation can lead to inefficient computation or incorrect results.
- gotcha Migrating from the `scipy.sparse` matrix API (e.g., `csr_matrix`) to its array API (e.g., `csr_array`) changes constructor names and operator behavior (`*` vs `@`), and can change the dimensions of returned results.
Install
pip install sparse
Imports
- sparse
import sparse
- COO
from sparse import COO
Quickstart
import sparse
import numpy as np
# Create a sparse array from a dictionary of coordinates and values
x = sparse.COO({(0, 0): 1, (1, 2): 2}, shape=(3, 3))
print("Sparse array x:\n", x)
print("Dense representation of x:\n", x.todense())
# Create a sparse array from a NumPy array
y = np.arange(9).reshape((3, 3))
z = sparse.COO.from_numpy(y)
print("Sparse array z from NumPy:\n", z)
# Perform an operation (addition) and convert to dense
print("Dense representation of (x + z):\n", (x + z).todense())