hmmlearn¶
Unsupervised learning and inference of Hidden Markov Models:
Simple algorithms and models to learn HMMs (Hidden Markov Models) in Python,
Follows the scikit-learn API as closely as possible, but adapted to sequence data,
Built on scikit-learn, NumPy, SciPy, and Matplotlib,
Open source, commercially usable — BSD license.
User guide: table of contents¶
Tutorial¶
hmmlearn implements Hidden Markov Models (HMMs).
The HMM is a generative probabilistic model, in which a sequence of observable variables
\(\mathbf{X}\) is generated by a sequence of internal hidden
states \(\mathbf{Z}\). The hidden states are not observed directly.
The transitions between hidden states are assumed to have the form of a
(first-order) Markov chain. They can be specified by the start probability
vector \(\boldsymbol{\pi}\) and a transition probability matrix
\(\mathbf{A}\). The emission probability of an observable can be any
distribution with parameters \(\boldsymbol{\theta}\) conditioned on the
current hidden state. The HMM is completely determined by
\(\boldsymbol{\pi}\), \(\mathbf{A}\) and \(\boldsymbol{\theta}\).
There are three fundamental problems for HMMs:
Given the model parameters and observed data, estimate the optimal sequence of hidden states.
Given the model parameters and observed data, calculate the model likelihood.
Given just the observed data, estimate the model parameters.
The first and the second problem can be solved by the dynamic programming algorithms known as the Viterbi algorithm and the Forward-Backward algorithm, respectively. The last one can be solved by an iterative Expectation-Maximization (EM) algorithm, known as the Baum-Welch algorithm.
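As a quick, hedged illustration (synthetic data and a GaussianHMM chosen purely for demonstration; not part of the original tutorial), the three problems map onto the fit(), score() and predict() methods:
>>> import numpy as np
>>> from hmmlearn import hmm
>>>
>>> X = np.random.default_rng(42).normal(size=(100, 2))        # synthetic observations
>>> model = hmm.GaussianHMM(n_components=2, n_iter=20).fit(X)  # problem 3: Baum-Welch (EM)
>>> log_likelihood = model.score(X)                            # problem 2: Forward algorithm
>>> states = model.predict(X)                                  # problem 1: Viterbi decoding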
References:
Lawrence R. Rabiner “A tutorial on hidden Markov models and selected applications in speech recognition”, Proceedings of the IEEE 77.2, pp. 257-286, 1989.
Jeff A. Bilmes, “A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models.”, 1998.
Mark Stamp. “A revealing introduction to hidden Markov models”. Tech. rep. Department of Computer Science, San Jose State University, 2018. url: http://www.cs.sjsu.edu/~stamp/RUA/HMM.pdf.
Available models¶
GaussianHMM: Hidden Markov Model with Gaussian emissions.
GMMHMM: Hidden Markov Model with Gaussian mixture emissions.
MultinomialHMM: Hidden Markov Model with multinomial (discrete) emissions.
Read on for details on how to implement an HMM with a custom emission probability.
Building HMM and generating samples¶
You can build an HMM instance by passing the parameters described above to the
constructor. Then, you can generate samples from the HMM by calling sample().
>>> import numpy as np
>>> from hmmlearn import hmm
>>> np.random.seed(42)
>>>
>>> model = hmm.GaussianHMM(n_components=3, covariance_type="full")
>>> model.startprob_ = np.array([0.6, 0.3, 0.1])
>>> model.transmat_ = np.array([[0.7, 0.2, 0.1],
... [0.3, 0.5, 0.2],
... [0.3, 0.3, 0.4]])
>>> model.means_ = np.array([[0.0, 0.0], [3.0, -3.0], [5.0, 10.0]])
>>> model.covars_ = np.tile(np.identity(2), (3, 1, 1))
>>> X, Z = model.sample(100)
The transition probability matrix need not be ergodic. For instance, a left-right HMM can be defined as follows:
>>> lr = hmm.GaussianHMM(n_components=3, covariance_type="diag",
... init_params="cm", params="cmt")
>>> lr.startprob_ = np.array([1.0, 0.0, 0.0])
>>> lr.transmat_ = np.array([[0.5, 0.5, 0.0],
... [0.0, 0.5, 0.5],
... [0.0, 0.0, 1.0]])
If any of the required parameters are missing, sample() will raise an exception:
>>> model = hmm.GaussianHMM(n_components=3)
>>> X, Z = model.sample(100)
Traceback (most recent call last):
...
sklearn.exceptions.NotFittedError: This GaussianHMM instance is not fitted yet. Call 'fit' with appropriate arguments before using this method.
Fixing parameters¶
Each HMM parameter has a character code which can be used to customize its initialization and estimation. The EM algorithm needs a starting point to proceed; thus, prior to training, each parameter is assigned a value, either random or computed from the data. It is possible to hook into this process and provide a starting point explicitly. To do so:
ensure that the character code for the parameter is missing from init_params, and then
set the parameter to the desired value.
For example, consider an HMM with an explicitly initialized transition probability matrix:
>>> model = hmm.GaussianHMM(n_components=3, n_iter=100, init_params="mcs")
>>> model.transmat_ = np.array([[0.7, 0.2, 0.1],
... [0.3, 0.5, 0.2],
... [0.3, 0.3, 0.4]])
A similar trick applies to parameter estimation. If you want to fix some
parameter at a specific value, remove the corresponding character from
params and set the parameter value before training.
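For instance, a sketch that keeps the transition matrix fixed throughout training (it reuses the transition matrix from above; X stands for whatever training data you have):
>>> model = hmm.GaussianHMM(n_components=3, n_iter=100,
...                         init_params="mcs", params="mcs")
>>> model.transmat_ = np.array([[0.7, 0.2, 0.1],
...                             [0.3, 0.5, 0.2],
...                             [0.3, 0.3, 0.4]])
>>> # Because 't' is absent from both init_params and params, calling
>>> # model.fit(X) on training data X would leave transmat_ unchanged.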
Saving and loading HMM¶
After training, an HMM can be easily persisted for future use with the standard pickle module:
>>> import pickle
>>> # "remodel" is assumed to be a previously fitted GaussianHMM instance.
>>> with open("filename.pkl", "wb") as file:
...     pickle.dump(remodel, file)
>>> with open("filename.pkl", "rb") as file:
...     pickle.load(file)
GaussianHMM(...
Implementing HMMs with custom emission probabilities¶
If you want to implement a custom emission probability (e.g. Poisson), you have to
subclass BaseHMM and override the relevant methods documented in the API reference below, such as
_init, _check, _generate_sample_from_state, _compute_likelihood or _compute_log_likelihood,
_initialize_sufficient_statistics, _accumulate_sufficient_statistics, and _do_mstep.
Optionally, only one of _compute_likelihood and _compute_log_likelihood needs to be overridden;
the base implementation will provide the other.
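As an illustration, here is a minimal sketch of a Poisson-emission HMM. The class name SimplePoissonHMM, the lambdas_ attribute, and the hand-set parameter values are illustrative assumptions, not part of hmmlearn; a full implementation would also override _init, _initialize_sufficient_statistics, _accumulate_sufficient_statistics and _do_mstep so that fit() can estimate the rates.
import numpy as np
from scipy.stats import poisson
from sklearn.utils import check_random_state

from hmmlearn.base import BaseHMM


class SimplePoissonHMM(BaseHMM):
    """Sketch of an HMM with Poisson emissions; rates are set by hand."""

    def __init__(self, n_components=1, **kwargs):
        super().__init__(n_components=n_components, **kwargs)
        # Hypothetical attribute: one Poisson rate per hidden state.
        self.lambdas_ = np.ones(n_components)

    def _compute_log_likelihood(self, X):
        # Emission log probability of each (integer-valued) sample under
        # each state; shape (n_samples, n_components).
        return np.column_stack(
            [poisson.logpmf(X[:, 0], lam) for lam in self.lambdas_])

    def _generate_sample_from_state(self, state, random_state=None):
        rng = check_random_state(random_state)
        return np.array([rng.poisson(self.lambdas_[state])])


# Hand-specified parameters; scoring, decoding and sampling work, fit() does not.
model = SimplePoissonHMM(n_components=2)
model.startprob_ = np.array([0.5, 0.5])
model.transmat_ = np.array([[0.9, 0.1],
                            [0.2, 0.8]])
model.lambdas_ = np.array([1.0, 10.0])
X, Z = model.sample(100)   # counts and the hidden states that produced them
print(model.score(X))      # log likelihood via the forward algorithm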
Examples¶
Sampling from HMM¶
This script shows how to sample points from a Hidden Markov Model (HMM): we use a 4-state model with specified means and covariances.
The plot shows the sequence of generated observations together with the transitions between them. We can see that, as specified by our transition matrix, there are no transitions between components 1 and 3.

import numpy as np
import matplotlib.pyplot as plt

from hmmlearn import hmm

# Prepare parameters for a 4-component HMM
# Initial population probability
startprob = np.array([0.6, 0.3, 0.1, 0.0])
# The transition matrix, note that there are no transitions possible
# between component 1 and 3
transmat = np.array([[0.7, 0.2, 0.0, 0.1],
                     [0.3, 0.5, 0.2, 0.0],
                     [0.0, 0.3, 0.5, 0.2],
                     [0.2, 0.0, 0.2, 0.6]])
# The means of each component
means = np.array([[0.0, 0.0],
                  [0.0, 11.0],
                  [9.0, 10.0],
                  [11.0, -1.0]])
# The covariance of each component
covars = .5 * np.tile(np.identity(2), (4, 1, 1))

# Build an HMM instance and set parameters
model = hmm.GaussianHMM(n_components=4, covariance_type="full")

# Instead of fitting it from the data, we directly set the estimated
# parameters: the means and covariances of the components
model.startprob_ = startprob
model.transmat_ = transmat
model.means_ = means
model.covars_ = covars

# Generate samples
X, Z = model.sample(500)

# Plot the sampled data
plt.plot(X[:, 0], X[:, 1], ".-", label="observations", ms=6,
         mfc="orange", alpha=0.7)

# Indicate the component numbers
for i, m in enumerate(means):
    plt.text(m[0], m[1], 'Component %i' % (i + 1),
             size=17, horizontalalignment='center',
             bbox=dict(alpha=.7, facecolor='w'))
plt.legend(loc='best')
plt.show()
API Reference¶
This is the class and function reference of hmmlearn.
Please refer to the full user guide for further details, as the raw class and function specifications may not be enough to give full guidelines on their use.
hmmlearn.base¶
ConvergenceMonitor¶
- class hmmlearn.base.ConvergenceMonitor(tol, n_iter, verbose)¶
Monitor and report convergence to sys.stderr.
- Variables
history (deque) – The log probability of the data for the last two training iterations. If the values are not strictly increasing, the model did not converge.
iter (int) – Number of iterations performed while training the model.
Examples
Use custom convergence criteria by subclassing ConvergenceMonitor and redefining the converged method. The resulting subclass can be used by creating an instance and pointing a model's monitor_ attribute to it prior to fitting.
>>> from hmmlearn.base import ConvergenceMonitor
>>> from hmmlearn import hmm
>>>
>>> class ThresholdMonitor(ConvergenceMonitor):
...     @property
...     def converged(self):
...         return (self.iter == self.n_iter or
...                 self.history[-1] >= self.tol)
>>>
>>> model = hmm.GaussianHMM(n_components=2, tol=5, verbose=True)
>>> model.monitor_ = ThresholdMonitor(model.monitor_.tol,
...                                   model.monitor_.n_iter,
...                                   model.monitor_.verbose)
- __init__(tol, n_iter, verbose)¶
- Parameters
tol (double) – Convergence threshold. EM has converged either if the maximum number of iterations is reached or if the log probability improvement between two consecutive iterations is less than the threshold.
n_iter (int) – Maximum number of iterations to perform.
verbose (bool) – Whether per-iteration convergence reports are printed.
- property converged¶
Whether the EM algorithm converged.
- report(log_prob)¶
Report convergence to sys.stderr.
The output consists of three columns: iteration number, log probability of the data at the current iteration, and convergence rate. At the first iteration the convergence rate is unknown and is thus denoted by NaN.
- Parameters
log_prob (float) – The log probability of the data as computed by EM algorithm in the current iteration.
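As an informal sketch (synthetic data; the exact number of iterations and reported values depend on the data), the monitor can also be inspected after fitting:
import numpy as np
from hmmlearn import hmm

X = np.random.default_rng(0).normal(size=(200, 1))   # synthetic data
model = hmm.GaussianHMM(n_components=2, n_iter=50, tol=0.01).fit(X)
print(model.monitor_.converged)   # True once tol or n_iter was reached
print(model.monitor_.iter)        # number of EM iterations actually performed
print(model.monitor_.history)     # recent per-iteration log probabilities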
BaseHMM¶
- class hmmlearn.base.BaseHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', init_params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', implementation='log')¶
Base class for Hidden Markov Models.
This class allows for easy evaluation of, sampling from, and maximum a posteriori estimation of the parameters of a HMM.
- Variables
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
Notes
Normally, one should use a subclass of BaseHMM, with its specialization towards a given emission model. In rare cases, the base class can also be useful in itself, if one simply wants to generate a sequence of states using BaseHMM.sample. In that case, the feature matrix will have zero features.
- __init__(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', init_params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', implementation='log')¶
- Parameters
n_components (int) – Number of states in the model.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for
startprob_
.transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities
transmat_
.algorithm ({"viterbi", "map"}, optional) – Decoder algorithm.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to
sys.stderr
. Convergence can also be diagnosed using themonitor_
attribute.params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.init_params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
implementation (string, optional) – Determines whether the forward-backward algorithm is implemented with logarithms (“log”) or with scaling (“scaling”). The default is to use logarithms for backward compatibility. However, the scaling implementation is generally faster.
- _accumulate_sufficient_statistics(stats, X, lattice, posteriors, fwdlattice, bwdlattice)¶
Update sufficient statistics from a given sample.
- Parameters
stats (dict) – Sufficient statistics as returned by
_initialize_sufficient_statistics()
.X (array, shape (n_samples, n_features)) – Sample sequence.
lattice (array, shape (n_samples, n_components)) – Probabilities OR Log Probabilities of each sample under each of the model states. Depends on the choice of implementation of the Forward-Backward algorithm
posteriors (array, shape (n_samples, n_components)) – Posterior probabilities of each sample being generated by each of the model states.
fwdlattice (array, shape (n_samples, n_components)) – forward and backward probabilities.
bwdlattice (array, shape (n_samples, n_components)) – forward and backward probabilities.
- _accumulate_sufficient_statistics_log(stats, X, lattice, posteriors, fwdlattice, bwdlattice)¶
Implementation of _accumulate_sufficient_statistics for implementation = "log".
- _accumulate_sufficient_statistics_scaling(stats, X, lattice, posteriors, fwdlattice, bwdlattice)¶
Implementation of _accumulate_sufficient_statistics for implementation = "scaling".
- _check()¶
Validate model parameters prior to fitting.
- Raises
ValueError – If any of the parameters are invalid, e.g. if startprob_ doesn't sum to 1.
- _check_sum_1(name)¶
Check that an array describes one or more distributions.
- _compute_likelihood(X)¶
Compute per-component probability under the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
- Returns
prob (array, shape (n_samples, n_components)) – Probability of each sample in X for each of the model states.
- _compute_log_likelihood(X)¶
Compute per-component emission log probability under the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
- Returns
log_prob (array, shape (n_samples, n_components)) – Emission log probability of each sample in
X
for each of the model states, i.e.,log(p(X|state))
.
- _do_mstep(stats)¶
Perform the M-step of EM algorithm.
- Parameters
stats (dict) – Sufficient statistics updated from all available samples.
- _generate_sample_from_state(state, random_state=None)¶
Generate a random sample from a given component.
- Parameters
state (int) – Index of the component to condition on.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object's random_state is used.
- Returns
X (array, shape (n_features, )) – A random sample from the emission distribution corresponding to a given component.
- _get_n_fit_scalars_per_param()¶
Return a mapping of fittable parameter names (as in
self.params
) to the number of corresponding scalar parameters that will actually be fitted.This is used to detect whether the user did not pass enough data points for a non-degenerate fit.
- _init(X)¶
Initialize model parameters prior to fitting.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
- _initialize_sufficient_statistics()¶
Initialize sufficient statistics required for M-step.
The method is pure, meaning that it doesn’t change the state of the instance. For extensibility computed statistics are stored in a dictionary.
- Returns
nobs (int) – Number of samples in the data.
start (array, shape (n_components, )) – An array where the i-th element corresponds to the posterior probability of the first sample being generated by the i-th state.
trans (array, shape (n_components, n_components)) – An array where the (i, j)-th element corresponds to the posterior probability of transitioning from the i-th to the j-th state.
- _score(X, lengths=None, *, compute_posteriors)¶
Helper for
score
andscore_samples
.Compute the log probability under the model, as well as posteriors if compute_posteriors is True (otherwise, an empty array is returned for the latter).
- _score_log(X, lengths=None, *, compute_posteriors)¶
Compute the log probability under the model, as well as posteriors if compute_posteriors is True (otherwise, an empty array is returned for the latter).
- decode(X, lengths=None, algorithm=None)¶
Find most likely state sequence corresponding to X.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm (string) – Decoder algorithm. Must be one of "viterbi" or "map". If not given, decoder is used.
- Returns
log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.
See also
score_samples
Compute the log probability under the model and posteriors.
score
Compute the log probability under the model.
- fit(X, lengths=None)¶
Estimate model parameters.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step for a subset of the parameters, pass a proper init_params keyword argument to the estimator's constructor.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.
- Returns
self (object) – Returns self.
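For example, a short sketch (synthetic data, not from the original docstring) of fitting on two concatenated sequences by passing their lengths:
>>> import numpy as np
>>> from hmmlearn import hmm
>>>
>>> rng = np.random.default_rng(0)
>>> X1 = rng.normal(size=(100, 2))    # first sequence
>>> X2 = rng.normal(size=(60, 2))     # second sequence
>>> X = np.concatenate([X1, X2])      # single stacked feature matrix
>>> model = hmm.GaussianHMM(n_components=2).fit(X, lengths=[100, 60])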
- get_stationary_distribution()¶
Compute the stationary distribution of states.
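A small sketch (hypothetical transition matrix; it assumes only transmat_ needs to be set beforehand) showing that the returned vector is left-invariant under the transition matrix:
>>> import numpy as np
>>> from hmmlearn import hmm
>>>
>>> model = hmm.GaussianHMM(n_components=2)
>>> model.transmat_ = np.array([[0.9, 0.1],
...                             [0.3, 0.7]])
>>> pi = model.get_stationary_distribution()
>>> bool(np.allclose(pi @ model.transmat_, pi))
True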
- predict(X, lengths=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
- Returns
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
.
- predict_proba(X, lengths=None)¶
Compute the posterior probability for each state in the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
- Returns
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from
X
.
- sample(n_samples=1, random_state=None, currstate=None)¶
Generate random samples from the model.
- Parameters
n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object's random_state is used.
currstate (int) – Current state, as the initial state of the samples.
- Returns
X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.
Examples
# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
- score(X, lengths=None)¶
Compute the log probability under the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
- Returns
log_prob (float) – Log likelihood of
X
.
See also
score_samples
Compute the log probability under the model and posteriors.
decode
Find most likely state sequence corresponding to
X
.
- score_samples(X, lengths=None)¶
Compute the log probability under the model and compute posteriors.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
- Returns
log_prob (float) – Log likelihood of
X
.posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in
X
.
hmmlearn.hmm¶
GaussianHMM¶
- class hmmlearn.hmm.GaussianHMM(n_components=1, covariance_type='diag', min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, means_prior=0, means_weight=0, covars_prior=0.01, covars_weight=1, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmc', init_params='stmc', implementation='log')¶
Hidden Markov Model with Gaussian emissions.
- Variables
n_features (int) – Dimensionality of the Gaussian emissions.
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
means_ (array, shape (n_components, n_features)) – Mean parameters for each state.
covars_ (array) –
Covariance parameters for each state.
The shape depends on
covariance_type
:(n_components, ) if “spherical”,
(n_components, n_features) if “diag”,
(n_components, n_features, n_features) if “full”,
(n_features, n_features) if “tied”.
Examples
>>> from hmmlearn.hmm import GaussianHMM
>>> GaussianHMM(n_components=2)
GaussianHMM(algorithm='viterbi',...
- __init__(n_components=1, covariance_type='diag', min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, means_prior=0, means_weight=0, covars_prior=0.01, covars_weight=1, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmc', init_params='stmc', implementation='log')¶
- Parameters
n_components (int) – Number of states.
covariance_type ({"sperical", "diag", "full", "tied"}, optional) –
The type of covariance parameters to use:
”spherical” — each state uses a single variance value that applies to all features (default).
”diag” — each state uses a diagonal covariance matrix.
”full” — each state uses a full (i.e. unrestricted) covariance matrix.
”tied” — all states use the same full covariance matrix.
min_covar (float, optional) – Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for
startprob_
.transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities
transmat_
means_prior (array, shape (n_components, ), optional) – Mean and precision of the Normal prior distribution for
means_
means_weight (array, shape (n_components, ), optional) – Mean and precision of the Normal prior distribution for
means_
.covars_prior (array, shape (n_components, ), optional) –
Parameters of the prior distribution for the covariance matrix
covars_
.If
covariance_type
is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.covars_weight (array, shape (n_components, ), optional) –
Parameters of the prior distribution for the covariance matrix
covars_
.If
covariance_type
is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.algorithm ({"viterbi", "map"}, optional) – Decoder algorithm.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to
sys.stderr
. Convergence can also be diagnosed using themonitor_
attribute.params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars. Defaults to all parameters.init_params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars. Defaults to all parameters.
implementation (string, optional) – Determines whether the forward-backward algorithm is implemented with logarithms (“log”) or with scaling (“scaling”). The default is to use logarithms for backward compatibility.
- decode(X, lengths=None, algorithm=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.algorithm (string) – Decoder algorithm. Must be one of “viterbi” or “map”. If not given,
decoder
is used.
- Returns
log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
obtained via a given decoderalgorithm
.
See also
score_samples
Compute the log probability under the model and posteriors.
score
Compute the log probability under the model.
- fit(X, lengths=None)¶
Estimate model parameters.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step for a subset of the parameters, pass proper
init_params
keyword argument to estimator’s constructor.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
self (object) – Returns self.
- get_stationary_distribution()¶
Compute the stationary distribution of states.
- predict(X, lengths=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
.
- predict_proba(X, lengths=None)¶
Compute the posterior probability for each state in the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from
X
.
- sample(n_samples=1, random_state=None, currstate=None)¶
Generate random samples from the model.
- Parameters
n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If
None
, the object’srandom_state
is used.currstate (int) – Current state, as the initial state of the samples.
- Returns
X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.
Examples
# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
- score(X, lengths=None)¶
Compute the log probability under the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
log_prob (float) – Log likelihood of
X
.
See also
score_samples
Compute the log probability under the model and posteriors.
decode
Find most likely state sequence corresponding to
X
.
- score_samples(X, lengths=None)¶
Compute the log probability under the model and compute posteriors.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
log_prob (float) – Log likelihood of
X
.posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in
X
.
GMMHMM¶
- class hmmlearn.hmm.GMMHMM(n_components=1, n_mix=1, min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, weights_prior=1.0, means_prior=0.0, means_weight=0.0, covars_prior=None, covars_weight=None, algorithm='viterbi', covariance_type='diag', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmcw', init_params='stmcw', implementation='log')¶
Hidden Markov Model with Gaussian mixture emissions.
- Variables
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
weights_ (array, shape (n_components, n_mix)) – Mixture weights for each state.
means_ (array, shape (n_components, n_mix, n_features)) – Mean parameters for each mixture component in each state.
covars_ (array) –
Covariance parameters for each mixture component in each state.
The shape depends on
covariance_type
:(n_components, n_mix) if “spherical”,
(n_components, n_mix, n_features) if “diag”,
(n_components, n_mix, n_features, n_features) if “full”
(n_components, n_features, n_features) if “tied”.
- __init__(n_components=1, n_mix=1, min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, weights_prior=1.0, means_prior=0.0, means_weight=0.0, covars_prior=None, covars_weight=None, algorithm='viterbi', covariance_type='diag', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmcw', init_params='stmcw', implementation='log')¶
- Parameters
n_components (int) – Number of states in the model.
n_mix (int) – Number of mixture components per state in the GMM.
covariance_type ({"spherical", "diag", "full", "tied"}, optional) –
The type of covariance parameters to use:
”spherical” — each state uses a single variance value that applies to all features.
”diag” — each state uses a diagonal covariance matrix (default).
”full” — each state uses a full (i.e. unrestricted) covariance matrix.
”tied” — all mixture components of each state use the same full covariance matrix (note that this is not the same as for
GaussianHMM
).
min_covar (float, optional) – Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for
startprob_
.transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities
transmat_
.weights_prior (array, shape (n_mix, ), optional) – Parameters of the Dirichlet prior distribution for
weights_
means_prior (array, shape (n_mix, ), optional) – Mean and precision of the Normal prior distribution for
means_
means_weight (array, shape (n_mix, ), optional) – Mean and precision of the Normal prior distribution for
means_
.covars_prior (array, shape (n_mix, ), optional) –
Parameters of the prior distribution for the covariance matrix
covars_
.If
covariance_type
is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.covars_weight (array, shape (n_mix, ), optional) –
Parameters of the prior distribution for the covariance matrix
covars_
.If
covariance_type
is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.algorithm ({"viterbi", "map"}, optional) – Decoder algorithm.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to
sys.stderr
. Convergence can also be diagnosed using themonitor_
attribute.params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, ‘c’ for covars, and ‘w’ for GMM mixing weights. Defaults to all parameters.init_params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, ‘c’ for covars, and ‘w’ for GMM mixing weights. Defaults to all parameters.
implementation (string, optional) – Determines whether the forward-backward algorithm is implemented with logarithms (“log”) or with scaling (“scaling”). The default is to use logarithms for backward compatibility.
- decode(X, lengths=None, algorithm=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.algorithm (string) – Decoder algorithm. Must be one of “viterbi” or “map”. If not given,
decoder
is used.
- Returns
log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
obtained via a given decoderalgorithm
.
See also
score_samples
Compute the log probability under the model and posteriors.
score
Compute the log probability under the model.
- fit(X, lengths=None)¶
Estimate model parameters.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step for a subset of the parameters, pass proper
init_params
keyword argument to estimator’s constructor.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
self (object) – Returns self.
- get_stationary_distribution()¶
Compute the stationary distribution of states.
- predict(X, lengths=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
.
- predict_proba(X, lengths=None)¶
Compute the posterior probability for each state in the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from
X
.
- sample(n_samples=1, random_state=None, currstate=None)¶
Generate random samples from the model.
- Parameters
n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If
None
, the object’srandom_state
is used.currstate (int) – Current state, as the initial state of the samples.
- Returns
X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.
Examples
# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
- score(X, lengths=None)¶
Compute the log probability under the model.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
log_prob (float) – Log likelihood of
X
.
See also
score_samples
Compute the log probability under the model and posteriors.
decode
Find most likely state sequence corresponding to
X
.
- score_samples(X, lengths=None)¶
Compute the log probability under the model and compute posteriors.
- Parameters
X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
log_prob (float) – Log likelihood of
X
.posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in
X
.
MultinomialHMM¶
- class hmmlearn.hmm.MultinomialHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste', implementation='log')¶
Hidden Markov Model with multinomial (discrete) emissions.
- Variables
n_features (int) – Number of possible symbols emitted by the model (in the samples).
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
emissionprob_ (array, shape (n_components, n_features)) – Probability of emitting a given symbol when in each state.
Examples
>>> from hmmlearn.hmm import MultinomialHMM
>>> MultinomialHMM(n_components=2)
MultinomialHMM(algorithm='viterbi',...
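A slightly fuller sketch (the probabilities below are made up for illustration) of specifying a three-symbol model by hand and sampling from it:
>>> import numpy as np
>>> from hmmlearn.hmm import MultinomialHMM
>>>
>>> model = MultinomialHMM(n_components=2)
>>> model.startprob_ = np.array([0.6, 0.4])
>>> model.transmat_ = np.array([[0.7, 0.3],
...                             [0.4, 0.6]])
>>> model.emissionprob_ = np.array([[0.1, 0.4, 0.5],
...                                 [0.6, 0.3, 0.1]])
>>> X, Z = model.sample(100)   # X has shape (100, 1), symbols drawn from {0, 1, 2}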
- __init__(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste', implementation='log')¶
- Parameters
n_components (int) – Number of states.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for
startprob_
.transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities
transmat_
.algorithm ({"viterbi", "map"}, optional) – Decoder algorithm.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to
sys.stderr
. Convergence can also be diagnosed using themonitor_
attribute.params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.init_params (string, optional) – The parameters that get updated during (
params
) or initialized before (init_params
) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
implementation (string, optional) – Determines whether the forward-backward algorithm is implemented with logarithms (“log”) or with scaling (“scaling”). The default is to use logarithms for backward compatibility.
- decode(X, lengths=None, algorithm=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.algorithm (string) – Decoder algorithm. Must be one of “viterbi” or “map”. If not given,
decoder
is used.
- Returns
log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
obtained via a given decoderalgorithm
.
See also
score_samples
Compute the log probability under the model and posteriors.
score
Compute the log probability under the model.
Notes
Unlike other HMM classes, MultinomialHMM's X arrays have shape (n_samples, 1) (instead of (n_samples, n_features)). Consider using sklearn.preprocessing.LabelEncoder to transform your input to the right format.
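For example, a sketch (with made-up symbols) of converting raw labels into the expected (n_samples, 1) integer array:
>>> from sklearn.preprocessing import LabelEncoder
>>>
>>> symbols = ["rainy", "sunny", "sunny", "rainy", "cloudy"]
>>> X = LabelEncoder().fit_transform(symbols).reshape(-1, 1)
>>> X.shape
(5, 1)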
- fit(X, lengths=None)¶
Estimate model parameters.
An initialization step is performed before entering the EM algorithm. If you want to avoid this step for a subset of the parameters, pass proper
init_params
keyword argument to estimator’s constructor.- Parameters
X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
self (object) – Returns self.
Notes
Unlike other HMM classes,
MultinomialHMM
X
arrays have shape(n_samples, 1)
(instead of(n_samples, n_features)
). Consider usingsklearn.preprocessing.LabelEncoder
to transform your input to the right format.
- get_stationary_distribution()¶
Compute the stationary distribution of states.
- predict(X, lengths=None)¶
Find most likely state sequence corresponding to
X
.- Parameters
X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
state_sequence (array, shape (n_samples, )) – Labels for each sample from
X
.
Notes
Unlike other HMM classes,
MultinomialHMM
X
arrays have shape(n_samples, 1)
(instead of(n_samples, n_features)
). Consider usingsklearn.preprocessing.LabelEncoder
to transform your input to the right format.
- predict_proba(X, lengths=None)¶
Compute the posterior probability for each state in the model.
- Parameters
X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from
X
.
Notes
Unlike other HMM classes,
MultinomialHMM
X
arrays have shape(n_samples, 1)
(instead of(n_samples, n_features)
). Consider usingsklearn.preprocessing.LabelEncoder
to transform your input to the right format.
- sample(n_samples=1, random_state=None, currstate=None)¶
Generate random samples from the model.
- Parameters
n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If
None
, the object’srandom_state
is used.currstate (int) – Current state, as the initial state of the samples.
- Returns
X (array, shape (n_samples, 1)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.
Examples
# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
Notes
Unlike other HMM classes,
MultinomialHMM
X
arrays have shape(n_samples, 1)
(instead of(n_samples, n_features)
). Consider usingsklearn.preprocessing.LabelEncoder
to transform your input to the right format.
- score(X, lengths=None)¶
Compute the log probability under the model.
- Parameters
X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
log_prob (float) – Log likelihood of
X
.
See also
score_samples
Compute the log probability under the model and posteriors.
decode
Find most likely state sequence corresponding to
X
.
Notes
Unlike other HMM classes,
MultinomialHMM
X
arrays have shape(n_samples, 1)
(instead of(n_samples, n_features)
). Consider usingsklearn.preprocessing.LabelEncoder
to transform your input to the right format.
- score_samples(X, lengths=None)¶
Compute the log probability under the model and compute posteriors.
- Parameters
X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in
X
. The sum of these should ben_samples
.
- Returns
log_prob (float) – Log likelihood of
X
.posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in
X
.
Notes
Unlike other HMM classes,
MultinomialHMM
X
arrays have shape(n_samples, 1)
(instead of(n_samples, n_features)
). Consider usingsklearn.preprocessing.LabelEncoder
to transform your input to the right format.
hmmlearn Changelog¶
Here you can see the full list of changes between each hmmlearn release.
Version 0.2.7¶
Released on February 10th, 2022.
Dropped support for Py3.5 (due to the absence of manylinux wheel supporting both Py3.5 and Py3.10).
_BaseHMM has been promoted to public API and has been renamed to BaseHMM.
MultinomialHMM no longer overwrites preset n_features.
An implementation of the Forward-Backward algorithm based upon scaling is available by specifying implementation="scaling" when instantiating HMMs. In general, the scaling algorithm is more efficient than an implementation based upon logarithms. See scripts/benchmark.py for a comparison of the performance of the two implementations.
The logprob parameter to ConvergenceMonitor.report has been renamed to log_prob.
Version 0.2.6¶
Released on July 18th, 2021.
Fixed support for multi-sequence GMM-HMM fit.
Deprecated utils.iter_from_X_lengths.
Previously, APIs taking a lengths parameter would silently drop the last samples if the total length was less than the number of samples. This behavior is deprecated and will raise an exception in the future.
Version 0.2.5¶
Released on February 3rd, 2021.
Fixed typo in implementation of covariance maximization for GMMHMM.
Changed history of ConvergenceMonitor to include the whole history for evaluation purposes. It can no longer be assumed that it has a maximum length of two.
Version 0.2.4¶
Released on September 12th, 2020.
Warning
GMMHMM covariance maximization was incorrect in this release. This bug was fixed in the following release.
Bumped previously incorrect dependency bound on scipy to 0.19.
Bug fix for ‘params’ argument usage in GMMHMM.
Warn when an explicitly set attribute would be overridden by init_params_.
Version 0.2.3¶
Released on December 17th, 2019.
Fitting of degenerate GMMHMMs appears to fail in certain cases on macOS; help with troubleshooting would be welcome.
Dropped support for Py2.7, Py3.4.
Log warning if not enough data is passed to fit() for a meaningful fit.
Better handle degenerate fits.
Allow missing observations in input multinomial data.
Avoid repeatedly rechecking validity of Gaussian covariance matrices.
Version 0.2.2¶
Released on May 5th, 2019.
This version was cut in particular in order to clear up the confusion between the “real” v0.2.1 and the pseudo-0.2.1 that were previously released by various third-party packagers.
Custom ConvergenceMonitor subclasses can be used (#218).
MultinomialHMM now accepts unsigned symbols (#258).
The get_stationary_distribution method returns the stationary distribution of the transition matrix (i.e., the rescaled left-eigenvector of the transition matrix that is associated with the eigenvalue 1) (#141).
Version 0.2.1¶
Released on October 17th, 2018.
GMMHMM was fully rewritten (#107).
Fixed underflow when dealing with logs. Thanks to @aubreyli. See PR #105 on GitHub.
Reduced worst-case memory consumption of the M-step from O(S^2 T) to O(S T). See issue #313 on GitHub.
Dropped support for Python 2.6. It is no longer supported by scikit-learn.
Version 0.2.0¶
Released on March 1st, 2016.
The release contains a known bug: fitting GMMHMM
with covariance
types other than "diag"
does not work. This is going to be fixed
in the following version. See issue #78 on GitHub for details.
Removed deprecated re-exports from hmmlearn.hmm.
Speed up forward-backward algorithms and Viterbi decoding by using Cython typed memoryviews. Thanks to @cfarrow. See PR #82 on GitHub.
Changed the API to accept multiple sequences via a single feature matrix X and an array of sequence lengths. This allowed using the HMMs as part of a scikit-learn Pipeline. The idea was shamelessly plugged from the seqlearn package by @larsmans. See issue #29 on GitHub.
Removed params and init_params from internal methods. Accepting these as arguments was redundant and confusing, because both are available as instance attributes.
Implemented ConvergenceMonitor, a class for convergence diagnostics. The idea is due to @mvictor212.
Added support for non-fully connected architectures, e.g. left-right HMMs. Thanks to @matthiasplappert. See issue #33 and PR #38 on GitHub.
Fixed normalization of emission probabilities in MultinomialHMM, see issue #19 on GitHub.
GaussianHMM is now initialized from all observations, see issue #1 on GitHub.
Changed the models to do input validation lazily as suggested by the scikit-learn guidelines.
Added min_covar parameter for controlling overfitting of GaussianHMM, see issue #2 on GitHub.
Accelerated M-step for GaussianHMM with full and tied covariances. See PR #97 on GitHub. Thanks to @anntzer.
Fixed M-step for GMMHMM, which incorrectly expected GMM.score_samples to return log-probabilities. See PR #4 on GitHub for discussion. Thanks to @mvictor212 and @michcio1234.
Version 0.1.1¶
Initial release, released on February 9th 2015.