hmmlearn
hmmlearn is a Python library for unsupervised learning of Hidden Markov Models (HMMs) with an API designed to be compatible with scikit-learn. It includes implementations of Gaussian HMMs, categorical/multinomial HMMs, and GMM-HMMs. The current version is 0.3.3; it receives updates for bug fixes and compatibility, though major feature releases are infrequent.
Warnings
- breaking In version 0.2.8, `MultinomialHMM` changed meaning: the old behavior (modeling a sequence of discrete symbols such as `[0, 1, 2, 1]`) moved to the new `CategoricalHMM` class, and `MultinomialHMM` now models genuine multinomial (count) emissions. Code written against the pre-0.2.8 `MultinomialHMM` should migrate to `CategoricalHMM`; passing old symbol-style data to the current `MultinomialHMM` will either raise an error or silently fit the wrong model.
- gotcha Input shapes are easy to get wrong. All hmmlearn estimators expect a 2D array `X` of shape `(n_samples, n_features)`: for `GaussianHMM` each row is a real-valued observation vector, while for `CategoricalHMM` each row is a single integer symbol, i.e. shape `(n_samples, 1)`, not a 1D array of length `n_samples`.
- gotcha HMM fitting with the Expectation-Maximization (EM) algorithm can converge to a local optimum rather than the global one, yielding suboptimal parameters and poor performance. A common mitigation is to run several fits with different `random_state` values and keep the best-scoring model.
- gotcha The `init_params` and `params` constructor arguments control which parameters (`'s'` startprob, `'t'` transmat, `'m'` means, `'c'` covars) are re-initialized before `fit` and which are updated during it. To keep a hand-set parameter fixed, remove its character from both strings; leaving it in `init_params` will silently overwrite your value.
Install
-
pip install hmmlearn
Imports
- GaussianHMM
from hmmlearn import hmm
model = hmm.GaussianHMM(...)
- MultinomialHMM
from hmmlearn import hmm
model = hmm.MultinomialHMM(...)
- GMMHMM
from hmmlearn import hmm
model = hmm.GMMHMM(...)
Quickstart
import numpy as np
from hmmlearn import hmm
# 1. Generate some sample data
# Let's create data that has two underlying 'states' with different means
# The HMM will try to discover these.
np.random.seed(42)
X = np.concatenate([np.random.randn(100, 1) + 0,
                    np.random.randn(100, 1) + 5])
# Shuffle to interleave the two clusters; note this makes the state
# transitions essentially random rather than two long runs.
np.random.shuffle(X)
# 2. Create and train a Gaussian HMM model
# n_components: number of hidden states we want to find
# covariance_type: "full", "tied", "diag", "spherical"
# n_iter: number of EM iterations to perform
model = hmm.GaussianHMM(n_components=2, covariance_type="full", n_iter=100, random_state=42)
# The 'fit' method estimates parameters from data using the EM algorithm.
# Unless initial parameters are set explicitly, hmmlearn initializes the
# means with k-means.
model.fit(X)
# 3. Predict the hidden states for the observed data
hidden_states = model.predict(X)
print("Learned Means:\n", model.means_)
print("Learned Transition Matrix:\n", model.transmat_)
print("First 10 hidden states:\n", hidden_states[:10])
# 4. Score the model (log-likelihood of the data given the model)
score = model.score(X)
print("\nModel score (log-likelihood):", score)
# 5. Generate new samples from the learned model
X_new, Z_new = model.sample(n_samples=50)
print("\nShape of new samples:", X_new.shape)
print("First 5 new samples:\n", X_new[:5].T)
print("First 5 new hidden states:\n", Z_new[:5])