wavelet-matrix

raw JSON →
2.1.8 verified Fri May 01 auth: no python

A high-performance indexed sequence data structure powered by Rust, supporting fast rank/select and range queries. Current version 2.1.8, requires Python >=3.9. Released under MIT license; maintained by wm-team.

pip install wavelet-matrix
error AttributeError: module 'wavelet_matrix' has no attribute 'WaveletMatrix'
cause Importing using the package name with hyphen instead of underscore.
fix
Use from wavelet_matrix import WaveletMatrix (note underscore).
error TypeError: WaveletMatrix.__init__() got multiple values for argument 'data'
cause Passing the old second argument (alphabet_size) to constructor.
fix
Call WaveletMatrix(data) without alphabet size. The class now infers alphabet automatically.
error ValueError: Rank query index out of range
cause Passing an index >= length of the sequence.
fix
Ensure index < len(wavelet_matrix). Rank indices are 0-based up to n-1.
breaking In version 2.0, the constructor argument order changed from `data, alphabet_size` to `data` only, with automatic alphabet detection. Code using `WaveletMatrix(data, 256)` will raise TypeError.
fix Remove the second argument from constructor: `WaveletMatrix(data)`.
gotcha Rank and select methods are 0-indexed. `rank(value, index)` returns count of `value` up to and including the position `index`. `select(value, k)` returns the 0-based position of the kth occurrence (k is 0-based).
fix Ensure indices are 0-based. For 1-based indexing, adjust by -1.
gotcha The library only supports non-negative integers up to 2^31-1. Negative numbers or floats will raise TypeError or overflow.
fix Map your data to non-negative integers before constructing the WaveletMatrix.

Basic usage: create a WaveletMatrix from a list of integers, then use rank, select, and quantile methods.

from wavelet_matrix import WaveletMatrix
data = [5, 3, 1, 4, 2]
wm = WaveletMatrix(data)
print(wm.rank(3, 2))  # rank of value 3 up to index 2
print(wm.select(1, 1))  # position of the 1st occurrence of value 1
print(wm.quantile(0, 4, 2))  # 2nd smallest in range [0,4]