API Reference#

This is the class and function reference of hmmlearn.

Please refer to the full user guide for further details, as the raw class and function specifications may not be enough to give full guidelines on their use.

hmmlearn.base#

ConvergenceMonitor#

class hmmlearn.base.ConvergenceMonitor(tol, n_iter, verbose)#

Monitor and report convergence to sys.stderr.

Variables:

history (deque) – The log probability of the data for the last two training iterations. If the values are not strictly increasing, the model did not converge.
iter (int) – Number of iterations performed while training the model.

Examples

Use custom convergence criteria by subclassing ConvergenceMonitor and redefining the converged method. The resulting subclass can be used by creating an instance and pointing a model’s monitor_ attribute to it prior to fitting.

>>> from hmmlearn.base import ConvergenceMonitor
>>> from hmmlearn import hmm
>>>
>>> class ThresholdMonitor(ConvergenceMonitor):
...     @property
...     def converged(self):
...         return (self.iter == self.n_iter or
...                 self.history[-1] >= self.tol)
>>>
>>> model = hmm.GaussianHMM(n_components=2, tol=5, verbose=True)
>>> model.monitor_ = ThresholdMonitor(model.monitor_.tol,
...                                   model.monitor_.n_iter,
...                                   model.monitor_.verbose)

__init__(tol, n_iter, verbose)#

Parameters:

tol (double) – Convergence threshold. EM has converged either if the maximum number of iterations is reached or if the log probability improvement between two consecutive iterations is less than the threshold.
n_iter (int) – Maximum number of iterations to perform.
verbose (bool) – Whether per-iteration convergence reports are printed.

property converged#: Whether the EM algorithm converged.
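The stopping rule described for tol and n_iter can be sketched in plain Python, independent of hmmlearn. SketchMonitor is a hypothetical stand-in and not the library class; keeping only the last two log probabilities is an assumption taken from the description of history above, and the log probability values are made up for illustration.

```python
from collections import deque


class SketchMonitor:
    """Hypothetical stand-in mirroring the documented convergence rule."""

    def __init__(self, tol, n_iter):
        self.tol = tol
        self.n_iter = n_iter
        self.history = deque(maxlen=2)  # assumption: last two log probabilities
        self.iter = 0

    def report(self, log_prob):
        # Record the log probability of the current iteration.
        self.history.append(log_prob)
        self.iter += 1

    @property
    def converged(self):
        # Stop when the iteration budget is used up ...
        if self.iter == self.n_iter:
            return True
        # ... or when the improvement between two iterations drops below tol.
        return (len(self.history) == 2
                and self.history[1] - self.history[0] < self.tol)


monitor = SketchMonitor(tol=0.01, n_iter=10)
for log_prob in [-120.0, -101.5, -100.2, -100.195]:  # illustrative values
    monitor.report(log_prob)
    if monitor.converged:
        break
stopped_at = monitor.iter
```

In this sketch the loop stops once the gain falls below tol, well before the n_iter budget is exhausted.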

report(log_prob)#

Report convergence to sys.stderr.

The output consists of three columns: iteration number, log probability of the data at the current iteration, and convergence rate. At the first iteration, the convergence rate is unknown and is thus denoted by NaN.

Parameters:: log_prob (float) – The log probability of the data as computed by the EM algorithm in the current iteration.

_AbstractHMM#

class hmmlearn.base._AbstractHMM(n_components, algorithm, random_state, n_iter, tol, verbose, params, init_params, implementation)#

Base class for Hidden Markov Models learned via Expectation-Maximization and Variational Bayes.

__init__(n_components, algorithm, random_state, n_iter, tol, verbose, params, init_params, implementation)#

Parameters:

n_components (int) – Number of states in the model.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
init_params (string, optional) – The parameters that get initialized before training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility. However, the scaling implementation is generally faster.
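The difference between the "log" and "scaling" options can be illustrated with a plain-Python forward pass over a small discrete HMM. This is a minimal sketch, not hmmlearn's implementation: the two-state model, its probabilities, and the helper names forward_log and forward_scaling are all assumptions made for illustration. Both variants should give the same log-likelihood; they differ only in how they avoid numerical underflow.

```python
import math

# Toy model (illustrative values): 2 hidden states, 3 observable symbols.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # emit[state][symbol]
obs = [0, 1, 2]
n = len(start)


def logsumexp(values):
    # Numerically stable log(sum(exp(v))).
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log(obs):
    # Forward recursion carried out entirely in log space.
    alpha = [math.log(start[i]) + math.log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [logsumexp([alpha[i] + math.log(trans[i][j]) for i in range(n)])
                 + math.log(emit[j][o]) for j in range(n)]
    return logsumexp(alpha)


def forward_scaling(obs):
    # Forward recursion in probability space, renormalized at every step.
    log_prob = 0.0
    alpha = None
    for o in obs:
        if alpha is None:
            alpha = [start[j] * emit[j][o] for j in range(n)]
        else:
            alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
                     for j in range(n)]
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)  # the scales multiply to the likelihood
    return log_prob


log_prob_log = forward_log(obs)
log_prob_scaling = forward_scaling(obs)
```

The scaling variant replaces the repeated logsumexp calls with one division and one logarithm per step.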

_accumulate_sufficient_statistics(stats, X, lattice, posteriors, fwdlattice, bwdlattice)#

Update sufficient statistics from a given sample.

Parameters:

stats (dict) – Sufficient statistics as returned by _initialize_sufficient_statistics().
X (array, shape (n_samples, n_features)) – Sample sequence.
lattice (array, shape (n_samples, n_components)) – Probabilities or log probabilities of each sample under each of the model states, depending on the chosen implementation of the forward-backward algorithm.
posteriors (array, shape (n_samples, n_components)) – Posterior probabilities of each sample being generated by each of the model states.
fwdlattice (array, shape (n_samples, n_components)) – Forward probabilities.
bwdlattice (array, shape (n_samples, n_components)) – Backward probabilities.

_accumulate_sufficient_statistics_log(stats, X, lattice, posteriors, fwdlattice, bwdlattice)#: Implementation of _accumulate_sufficient_statistics for implementation = "log".

_accumulate_sufficient_statistics_scaling(stats, X, lattice, posteriors, fwdlattice, bwdlattice)#: Implementation of _accumulate_sufficient_statistics for implementation = "scaling".

_check()#

Validate model parameters prior to fitting.

Raises:: ValueError – If any of the parameters are invalid, e.g. if startprob_ doesn’t sum to 1.

_check_sum_1(name)#: Check that an array describes one or more distributions.
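A check of this kind can be sketched as follows. This is not the library's code: the function name check_sum_1, its signature, the tolerance, and the explicit non-negativity test are assumptions chosen for illustration.

```python
import math


def check_sum_1(name, rows, tol=1e-9):
    """Raise ValueError unless every row is a probability distribution.

    Hypothetical helper: the tolerance and the non-negativity test are
    assumptions, not taken from hmmlearn.
    """
    for row in rows:
        if any(p < 0 for p in row) or not math.isclose(sum(row), 1.0, abs_tol=tol):
            raise ValueError(
                f"{name} rows must be non-negative and sum to 1 (got {row})")


# A single distribution and a row-stochastic matrix both pass.
check_sum_1("startprob_", [[0.6, 0.4]])
check_sum_1("transmat_", [[0.7, 0.3], [0.4, 0.6]])

# A row summing to 0.9 is rejected.
try:
    check_sum_1("transmat_", [[0.7, 0.2], [0.4, 0.6]])
    rejected = False
except ValueError:
    rejected = True
```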

_compute_likelihood(X)#

Compute per-component probability under the model.

Parameters:: X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
Returns:: prob (array, shape (n_samples, n_components)) – Emission probability of each sample in X for each of the model states, i.e., p(X|state).

_compute_log_likelihood(X)#

Compute per-component emission log probability under the model.

Parameters:: X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
Returns:: log_prob (array, shape (n_samples, n_components)) – Emission log probability of each sample in X for each of the model states, i.e., log(p(X|state)).

_do_mstep(stats)#

Perform the M-step of the EM algorithm.

Parameters:: stats (dict) – Sufficient statistics updated from all available samples.

_generate_sample_from_state(state, random_state)#

Generate a random sample from a given component.

Parameters:

state (int) – Index of the component to condition on.
random_state (RandomState) – A random number generator instance. (sample is the only caller for this method and already normalizes random_state.)

Returns:

X (array, shape (n_features, )) – A random sample from the emission distribution corresponding to a given component.

_get_n_fit_scalars_per_param()#

Return a mapping of fittable parameter names (as in self.params) to the number of corresponding scalar parameters that will actually be fitted.

This is used to detect whether the user did not pass enough data points for a non-degenerate fit.

_initialize_sufficient_statistics()#

Initialize sufficient statistics required for M-step.

The method is pure, meaning that it doesn’t change the state of the instance. For extensibility, the computed statistics are stored in a dictionary.

Returns:

nobs (int) – Number of samples in the data.
start (array, shape (n_components, )) – An array where the i-th element corresponds to the posterior probability of the first sample being generated by the i-th state.
trans (array, shape (n_components, n_components)) – An array where the (i, j)-th element corresponds to the posterior probability of transitioning from the i-th to the j-th state.
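The role of this dictionary can be sketched in plain Python: it starts out zeroed, the E-step adds expected counts to it, and the M-step turns the counts into probabilities. The helper names, the accumulated values, and the use of plain normalization (which ignores any prior) are assumptions for illustration and do not reproduce hmmlearn's code.

```python
def initialize_stats(n_components):
    # Fresh, zeroed statistics using the keys documented above.
    return {
        "nobs": 0,
        "start": [0.0] * n_components,
        "trans": [[0.0] * n_components for _ in range(n_components)],
    }


def normalize(weights):
    # Rescale non-negative weights so that they sum to 1.
    total = sum(weights)
    return [w / total for w in weights]


stats = initialize_stats(2)

# Pretend an E-step accumulated these expected counts (made-up values).
stats["nobs"] = 2
stats["start"] = [1.5, 0.5]
stats["trans"] = [[2.0, 2.0], [1.0, 3.0]]

# M-step sketch: expected counts become probabilities.
startprob = normalize(stats["start"])
transmat = [normalize(row) for row in stats["trans"]]
```

Storing the statistics in a dictionary lets a subclass add its own emission-specific entries without changing the signature.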

_score(X, lengths=None, *, compute_posteriors)#

Helper for score and score_samples.

Compute the log probability under the model, as well as posteriors if compute_posteriors is True (otherwise, an empty array is returned for the latter).

_score_log(X, lengths=None, *, compute_posteriors)#: Compute the log probability under the model, as well as posteriors if compute_posteriors is True (otherwise, an empty array is returned for the latter).

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the algorithm chosen when constructing the model is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.
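The two decoder algorithms can be contrasted on a small discrete HMM in plain Python. This is a sketch under stated assumptions, not hmmlearn's implementation: the toy model, its probabilities, and the function names viterbi and posterior_decode are illustrative. Viterbi maximizes the joint probability of the whole path, whereas "map" picks the most probable state at each step separately; on this particular input the two happen to agree, but in general they can differ.

```python
import math

# Toy model (illustrative values): 2 hidden states, 3 observable symbols.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]  # emit[state][symbol]
obs = [0, 1, 2]
n = len(start)


def viterbi(obs):
    """Most likely joint state path and its log probability."""
    delta = [math.log(start[i] * emit[i][obs[0]]) for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        # scores[j][i]: best log probability of reaching state j via state i.
        scores = [[delta[i] + math.log(trans[i][j]) for i in range(n)]
                  for j in range(n)]
        ptr = [max(range(n), key=scores[j].__getitem__) for j in range(n)]
        delta = [scores[j][ptr[j]] + math.log(emit[j][o]) for j in range(n)]
        backpointers.append(ptr)
    state = max(range(n), key=delta.__getitem__)
    log_prob = delta[state]
    path = [state]
    for ptr in reversed(backpointers):  # trace the best path backwards
        state = ptr[state]
        path.append(state)
    return log_prob, path[::-1]


def posterior_decode(obs):
    """Individually most likely state at each step (forward-backward)."""
    fwd = [[start[i] * emit[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = fwd[-1]
        fwd.append([sum(prev[i] * trans[i][j] for i in range(n)) * emit[j][o]
                    for j in range(n)])
    bwd = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = bwd[0]
        bwd.insert(0, [sum(trans[i][j] * emit[j][o] * nxt[j] for j in range(n))
                       for i in range(n)])
    # The posterior is proportional to forward times backward.
    return [max(range(n), key=lambda i: f[i] * b[i]) for f, b in zip(fwd, bwd)]


viterbi_log_prob, viterbi_path = viterbi(obs)
map_path = posterior_decode(obs)
```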

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.

fit(X, lengths=None)#

Estimate model parameters.

An initialization step is performed before entering the EM algorithm. If you want to avoid this step for a subset of the parameters, pass proper init_params keyword argument to estimator’s constructor.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.
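The relationship between X and lengths can be illustrated with a small helper that cuts a concatenated feature matrix back into its individual sequences. The function split_sequences is hypothetical and written only for this illustration; it is not part of hmmlearn.

```python
from itertools import accumulate


def split_sequences(X, lengths=None):
    """Split concatenated samples into sequences (hypothetical helper)."""
    if lengths is None:
        # Without lengths, all samples form one single sequence.
        return [X]
    if sum(lengths) != len(X):
        raise ValueError("the sum of lengths must equal n_samples")
    ends = list(accumulate(lengths))
    starts = [0] + ends[:-1]
    return [X[s:e] for s, e in zip(starts, ends)]


# Five one-dimensional samples forming two sequences of lengths 2 and 3.
X = [[0.1], [0.2], [0.3], [0.4], [0.5]]
sequences = split_sequences(X, lengths=[2, 3])
```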

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
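The role of currstate can be sketched with a plain-Python Markov chain sampler. This is an illustration under stated assumptions, not hmmlearn's code: sample_states is a hypothetical helper, it generates only the state sequence (no emissions), and it treats currstate as the first state of the new batch, following the parameter description above. The deterministic transition matrix is chosen so that the result is easy to follow.

```python
import random

startprob = [0.5, 0.5]
transmat = [[0.0, 1.0], [1.0, 0.0]]  # deterministic alternation, for clarity


def sample_states(n_samples, rng, currstate=None):
    """Draw a state sequence; hypothetical helper without emissions."""
    states = range(len(startprob))
    if currstate is None:
        # No state given: draw the initial state from startprob.
        currstate = rng.choices(states, weights=startprob)[0]
    sequence = [currstate]
    for _ in range(n_samples - 1):
        # Each following state depends only on the current one.
        currstate = rng.choices(states, weights=transmat[currstate])[0]
        sequence.append(currstate)
    return sequence


rng = random.Random(0)
first = sample_states(4, rng)
# Continue from where the first batch stopped.
second = sample_states(4, rng, currstate=first[-1])
```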

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → _AbstractHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → _AbstractHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → _AbstractHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → _AbstractHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

BaseHMM#

class hmmlearn.base.BaseHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', init_params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', implementation='log')#

Base class for Hidden Markov Models learned from Expectation-Maximization.

This class allows for easy evaluation of, sampling from, and maximum a posteriori estimation of the parameters of an HMM.

Variables:

monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.

Notes

Normally, one should use a subclass of BaseHMM, with its specialization towards a given emission model. In rare cases, the base class can also be useful in itself, if one simply wants to generate a sequence of states using BaseHMM.sample. In that case, the feature matrix will have zero features.

__init__(n_components=1, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', init_params='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', implementation='log')#

Parameters:

n_components (int) – Number of states in the model.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
init_params (string, optional) – The parameters that get initialized before training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility. However, the scaling implementation is generally faster.

_check()#

Validate model parameters prior to fitting.

Raises:: ValueError – If any of the parameters are invalid, e.g. if startprob_ doesn’t sum to 1.

_check_sum_1(name)#: Check that an array describes one or more distributions.

_do_mstep(stats)#

Perform the M-step of the EM algorithm.

Parameters:: stats (dict) – Sufficient statistics updated from all available samples.

_init(X, lengths=None)#

Initialize model parameters prior to fitting.

Parameters:: X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.

aic(X, lengths=None)#

Akaike information criterion for the current model on the input X.

AIC = -2*logLike + 2 * num_free_params

https://en.wikipedia.org/wiki/Akaike_information_criterion

Parameters:

X (array of shape (n_samples, n_features)) – The input samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

aic (float) – The lower the better.

bic(X, lengths=None)#

Bayesian information criterion for the current model on the input X.

BIC = -2*logLike + num_free_params * log(num_of_data)

https://en.wikipedia.org/wiki/Bayesian_information_criterion

Parameters:

X (array of shape (n_samples, n_features)) – The input samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

bic (float) – The lower the better.
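The AIC and BIC formulas given above can be evaluated directly from a log-likelihood and a parameter count. The sketch below uses made-up log-likelihoods for three hypothetical candidate models (they are not results of any real fit) to show how the two criteria can favor different model sizes, since BIC penalizes additional parameters more heavily once log(num_of_data) exceeds 2.

```python
import math


def aic(log_likelihood, n_free_params):
    # AIC = -2*logLike + 2 * num_free_params
    return -2.0 * log_likelihood + 2.0 * n_free_params


def bic(log_likelihood, n_free_params, n_data):
    # BIC = -2*logLike + num_free_params * log(num_of_data)
    return -2.0 * log_likelihood + n_free_params * math.log(n_data)


# Hypothetical candidates: n_components -> (log-likelihood, free parameters).
candidates = {2: (-520.0, 7), 3: (-505.0, 14), 4: (-503.0, 23)}
n_data = 400

scores = {k: (aic(ll, p), bic(ll, p, n_data))
          for k, (ll, p) in candidates.items()}

# Lower is better for both criteria.
best_by_aic = min(scores, key=lambda k: scores[k][0])
best_by_bic = min(scores, key=lambda k: scores[k][1])
```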

get_stationary_distribution()#: Compute the stationary distribution of states.
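A stationary distribution is a distribution over states that is left unchanged by one step of the transition matrix. The sketch below approximates it by power iteration in plain Python; this technique is chosen here for simplicity and is not claimed to be the method hmmlearn uses. The function name, its parameters, and the example matrix are assumptions.

```python
def stationary_distribution(transmat, n_steps=1000, tol=1e-12):
    """Approximate pi with pi @ transmat == pi by power iteration (sketch)."""
    n = len(transmat)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(n_steps):
        # One step of the chain: nxt[j] = sum_i pi[i] * transmat[i][j].
        nxt = [sum(pi[i] * transmat[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(pi, nxt)) < tol:
            return nxt
        pi = nxt
    return pi


pi = stationary_distribution([[0.7, 0.3], [0.4, 0.6]])
```

For this matrix the exact answer is (4/7, 3/7), which can be checked by hand: 4/7 * 0.7 + 3/7 * 0.4 = 4/7.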

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → BaseHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → BaseHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → BaseHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → BaseHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

VariationalBaseHMM#

class hmmlearn.base.VariationalBaseHMM(n_components=1, startprob_prior=None, transmat_prior=None, algorithm='viterbi', random_state=None, n_iter=100, tol=1e-06, verbose=False, params='ste', init_params='ste', implementation='log')#

__init__(n_components=1, startprob_prior=None, transmat_prior=None, algorithm='viterbi', random_state=None, n_iter=100, tol=1e-06, verbose=False, params='ste', init_params='ste', implementation='log')#

Parameters:

n_components (int) – Number of states in the model.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
init_params (string, optional) – The parameters that get initialized before training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and other characters for subclass-specific emission parameters. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility. However, the scaling implementation is generally faster.

_accumulate_sufficient_statistics_log(stats, X, lattice, posteriors, fwdlattice, bwdlattice)#: Implementation of _accumulate_sufficient_statistics for implementation = "log".

_accumulate_sufficient_statistics_scaling(stats, X, lattice, posteriors, fwdlattice, bwdlattice)#: Implementation of _accumulate_sufficient_statistics for implementation = "scaling".

_check()#

Validate model parameters prior to fitting.

Raises:: ValueError – If any of the parameters are invalid, e.g. if startprob_ doesn’t sum to 1.

_compute_lower_bound(curr_logprob)#

Compute the Variational Lower Bound of the model as currently configured.

Following the pattern elsewhere, derived implementations should call this method to get the contribution of the current log_prob, transmat, and startprob towards the lower bound.

Parameters:: curr_logprob (float) – The current log probability of the data as computed at the subnormalized model parameters.
Returns:: lower_bound (float) – Returns the computed lower bound contribution of the log_prob, startprob, and transmat.

_do_mstep(stats)#

Perform the M-step of the EM algorithm.

Parameters:: stats (dict) – Sufficient statistics updated from all available samples.

_estep_begin()#: Update the subnormalized model parameters. Called at the beginning of each iteration of fit().

_init(X, lengths=None)#

Initialize model parameters prior to fitting.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalBaseHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalBaseHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalBaseHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalBaseHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

hmmlearn.hmm#

GaussianHMM#

class hmmlearn.hmm.GaussianHMM(n_components=1, covariance_type='diag', min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, means_prior=0, means_weight=0, covars_prior=0.01, covars_weight=1, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmc', init_params='stmc', implementation='log')#

Hidden Markov Model with Gaussian emissions.

Variables:

n_features (int) – Dimensionality of the Gaussian emissions.
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
means_ (array, shape (n_components, n_features)) – Mean parameters for each state.
covars_ (array) –
Covariance parameters for each state.

The shape depends on covariance_type:
- (n_components, ) if “spherical”,
- (n_components, n_features) if “diag”,
- (n_components, n_features, n_features) if “full”,
- (n_features, n_features) if “tied”.
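The shape rules above can be written down as a small lookup. This is only a sketch: expected_covars_shape is a hypothetical helper introduced here for illustration and is not part of hmmlearn.

```python
def expected_covars_shape(covariance_type, n_components, n_features):
    """Return the covars shape listed above for a covariance_type.

    Hypothetical helper for illustration only; not an hmmlearn function.
    """
    shapes = {
        "spherical": (n_components,),
        "diag": (n_components, n_features),
        "full": (n_components, n_features, n_features),
        "tied": (n_features, n_features),
    }
    if covariance_type not in shapes:
        raise ValueError(f"unknown covariance_type: {covariance_type!r}")
    return shapes[covariance_type]


# Print the expected shape for a model with 3 states and 2 features.
for kind in ("spherical", "diag", "full", "tied"):
    print(kind, expected_covars_shape(kind, n_components=3, n_features=2))
```

Note that only the “tied” shape is independent of n_components, because all states share one matrix.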

Examples

>>> from hmmlearn.hmm import GaussianHMM
>>> GaussianHMM(n_components=2)  
GaussianHMM(algorithm='viterbi',...

__init__(n_components=1, covariance_type='diag', min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, means_prior=0, means_weight=0, covars_prior=0.01, covars_weight=1, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmc', init_params='stmc', implementation='log')#

Parameters:

n_components (int) – Number of states.
covariance_type ({"spherical", "diag", "full", "tied"}, optional) –
The type of covariance parameters to use:
- ”spherical” — each state uses a single variance value that applies to all features.
- ”diag” — each state uses a diagonal covariance matrix (default).
- ”full” — each state uses a full (i.e. unrestricted) covariance matrix.
- ”tied” — all states use the same full covariance matrix.
min_covar (float, optional) – Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
means_prior (array, shape (n_components, ), optional) – Mean and precision of the Normal prior distribution for means_.
means_weight (array, shape (n_components, ), optional) – Mean and precision of the Normal prior distribution for means_.
covars_prior (array, shape (n_components, ), optional) –
Parameters of the prior distribution for the covariance matrix covars_.

If covariance_type is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.
covars_weight (array, shape (n_components, ), optional) –
Parameters of the prior distribution for the covariance matrix covars_.

If covariance_type is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- ”viterbi”: finds the most likely sequence of states, given all emissions.
- ”map” (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars. Defaults to all parameters.
init_params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.
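How n_iter and tol interact can be sketched in a few lines of plain Python. This is a simplified illustration of the stopping rule described for tol and n_iter, not hmmlearn's own code; has_converged and the log-probability values are made up for the example.

```python
from collections import deque


def has_converged(history, iteration, n_iter, tol):
    """True once n_iter is reached or the latest log-probability gain is below tol.

    Hypothetical helper; it mirrors the rule described for tol and n_iter.
    """
    if iteration >= n_iter:
        return True
    return len(history) >= 2 and history[-1] - history[-2] < tol


# Keep only the last two log probabilities.
history = deque(maxlen=2)
log_probs = [-120.0, -100.0, -99.5, -99.495]  # made-up training trace

stopped_at = None
for iteration, log_prob in enumerate(log_probs, start=1):
    history.append(log_prob)
    if has_converged(history, iteration, n_iter=10, tol=0.01):
        stopped_at = iteration
        break

print(stopped_at)
```

With tol=0.01 the loop stops at the fourth iteration, where the gain (about 0.005) first drops below the threshold.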

aic(X, lengths=None)#

Akaike information criterion for the current model on the input X.

AIC = -2*logLike + 2 * num_free_params

https://en.wikipedia.org/wiki/Akaike_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

aic (float) – The lower the better.

bic(X, lengths=None)#

Bayesian information criterion for the current model on the input X.

BIC = -2*logLike + num_free_params * log(num_of_data)

https://en.wikipedia.org/wiki/Bayesian_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

bic (float) – The lower the better.
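The AIC and BIC formulas above can be evaluated by hand once the log-likelihood and the number of free parameters are known. The sketch below uses made-up numbers and assumes that num_of_data means the number of samples in X; the function names are illustrative and do not replace the aic and bic methods.

```python
import math


def aic_from_log_likelihood(log_likelihood, num_free_params):
    # AIC = -2*logLike + 2 * num_free_params
    return -2.0 * log_likelihood + 2.0 * num_free_params


def bic_from_log_likelihood(log_likelihood, num_free_params, num_samples):
    # BIC = -2*logLike + num_free_params * log(num_of_data)
    return -2.0 * log_likelihood + num_free_params * math.log(num_samples)


# Made-up values: log-likelihood -250, 9 free parameters, 100 samples.
aic_value = aic_from_log_likelihood(-250.0, 9)
bic_value = bic_from_log_likelihood(-250.0, 9, 100)
print(aic_value, bic_value)
```

Because log(num_of_data) exceeds 2 once there are more than about 7 samples, BIC penalizes extra parameters more heavily than AIC on all but the smallest data sets.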

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- ”viterbi”: finds the most likely sequence of states, given all emissions.
- ”map” (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder algorithm the model was constructed with (the algorithm parameter) is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.
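To make the “viterbi” option concrete, here is a minimal log-space Viterbi sketch in plain Python. It is an illustration of the technique under simple assumptions (all probabilities strictly positive, per-sample emission likelihoods already computed), not the routine hmmlearn uses internally; the function name and inputs are hypothetical.

```python
import math


def viterbi(log_startprob, log_transmat, frame_log_lik):
    """Return (log_prob, state_sequence) of the single most likely path."""
    n_states = len(log_startprob)
    # delta[j]: best log score of any path ending in state j at the current step.
    delta = [log_startprob[j] + frame_log_lik[0][j] for j in range(n_states)]
    backpointers = []
    for frame in frame_log_lik[1:]:
        prev = delta
        pointers, delta = [], []
        for j in range(n_states):
            best_i = max(range(n_states), key=lambda i: prev[i] + log_transmat[i][j])
            pointers.append(best_i)
            delta.append(prev[best_i] + log_transmat[best_i][j] + frame[j])
        backpointers.append(pointers)
    # Trace the best path backwards from the best final state.
    last = max(range(n_states), key=lambda j: delta[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path


def log_all(rows):
    return [[math.log(p) for p in row] for row in rows]


startprob = [0.6, 0.4]
transmat = [[0.9, 0.1], [0.1, 0.9]]
# Likelihood of each of 4 samples under each of the 2 states (made-up values).
frame_lik = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

log_prob, state_sequence = viterbi(
    [math.log(p) for p in startprob], log_all(transmat), log_all(frame_lik)
)
print(log_prob, state_sequence)
```

The “map” option differs in that it picks the most probable state at each sample independently from the posteriors, so the resulting sequence is not guaranteed to be the single most likely path.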

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.
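The lengths argument describes how a single stacked X is cut into independent sequences. The sketch below shows that bookkeeping in plain Python; split_by_lengths is a hypothetical helper written for this illustration, not an hmmlearn function.

```python
def split_by_lengths(X, lengths=None):
    """Yield consecutive sub-sequences of X, one per entry in lengths.

    With lengths=None, X is treated as a single sequence.
    """
    if lengths is None:
        yield X
        return
    if sum(lengths) != len(X):
        raise ValueError("lengths must sum to n_samples")
    start = 0
    for length in lengths:
        yield X[start:start + length]
        start += length


# Five samples with one feature each, forming two sequences of 3 and 2 samples.
X = [[0.1], [0.2], [0.3], [1.0], [1.1]]
sequences = list(split_by_lengths(X, lengths=[3, 2]))
print(sequences)
```

Each sub-sequence is treated as starting afresh from the initial state distribution, which is why the sum of lengths should equal n_samples.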

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:: routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_stationary_distribution()#: Compute the stationary distribution of states.
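A stationary distribution pi satisfies pi = pi · transmat. One simple way to approximate it, assuming the chain is irreducible and aperiodic, is power iteration, sketched below in plain Python. This is not necessarily how get_stationary_distribution computes it; the function and values are illustrative only.

```python
def stationary_distribution(transmat, n_steps=1000, atol=1e-12):
    """Approximate pi with pi = pi @ transmat by repeated multiplication."""
    n = len(transmat)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(n_steps):
        new_pi = [sum(pi[i] * transmat[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < atol:
            return new_pi
        pi = new_pi
    return pi


transmat = [[0.9, 0.1], [0.5, 0.5]]
pi = stationary_distribution(transmat)
print(pi)
```

For this transition matrix the exact answer is (5/6, 1/6), which follows from solving 0.1 · pi[0] = 0.5 · pi[1] together with pi[0] + pi[1] = 1.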

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → GaussianHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → GaussianHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → GaussianHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → GaussianHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

GMMHMM#

class hmmlearn.hmm.GMMHMM(n_components=1, n_mix=1, min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, weights_prior=1.0, means_prior=0.0, means_weight=0.0, covars_prior=None, covars_weight=None, algorithm='viterbi', covariance_type='diag', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmcw', init_params='stmcw', implementation='log')#

Hidden Markov Model with Gaussian mixture emissions.
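With Gaussian mixture emissions, each hidden state emits from a weighted sum of Gaussians. The one-dimensional sketch below illustrates the emission density of a single state; the helper names and numbers are made up for the example and are not part of hmmlearn.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(x, weights, means, variances):
    """Emission density of one state: weighted sum of its mixture components."""
    return sum(
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )


# One state with n_mix = 2 components (weights sum to 1).
weights = [0.3, 0.7]
means = [-1.0, 2.0]
variances = [1.0, 0.5]
density = mixture_pdf(0.0, weights, means, variances)
print(density)
```

With n_mix=1 the mixture reduces to a single Gaussian per state.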

Variables:

monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
weights_ (array, shape (n_components, n_mix)) – Mixture weights for each state.
means_ (array, shape (n_components, n_mix, n_features)) – Mean parameters for each mixture component in each state.
covars_ (array) –
Covariance parameters for each mixture component in each state.

The shape depends on covariance_type:
- (n_components, n_mix) if “spherical”,
- (n_components, n_mix, n_features) if “diag”,
- (n_components, n_mix, n_features, n_features) if “full”,
- (n_components, n_features, n_features) if “tied”.

__init__(n_components=1, n_mix=1, min_covar=0.001, startprob_prior=1.0, transmat_prior=1.0, weights_prior=1.0, means_prior=0.0, means_weight=0.0, covars_prior=None, covars_weight=None, algorithm='viterbi', covariance_type='diag', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stmcw', init_params='stmcw', implementation='log')#

Parameters:

n_components (int) – Number of states in the model.
n_mix (int) – Number of mixture components (states in the GMM) for each hidden state.
covariance_type ({"spherical", "diag", "full", "tied"}, optional) –
The type of covariance parameters to use:
- ”spherical” — each state uses a single variance value that applies to all features.
- ”diag” — each state uses a diagonal covariance matrix (default).
- ”full” — each state uses a full (i.e. unrestricted) covariance matrix.
- ”tied” — all mixture components of each state use the same full covariance matrix (note that this is not the same as for GaussianHMM).
min_covar (float, optional) – Floor on the diagonal of the covariance matrix to prevent overfitting. Defaults to 1e-3.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
weights_prior (array, shape (n_mix, ), optional) – Parameters of the Dirichlet prior distribution for weights_.
means_prior (array, shape (n_mix, ), optional) – Mean and precision of the Normal prior distribution for means_.
means_weight (array, shape (n_mix, ), optional) – Mean and precision of the Normal prior distribution for means_.
covars_prior (array, shape (n_mix, ), optional) –
Parameters of the prior distribution for the covariance matrix covars_.

If covariance_type is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.
covars_weight (array, shape (n_mix, ), optional) –
Parameters of the prior distribution for the covariance matrix covars_.

If covariance_type is “spherical” or “diag” the prior is the inverse gamma distribution, otherwise — the inverse Wishart distribution.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- ”viterbi”: finds the most likely sequence of states, given all emissions.
- ”map” (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, ‘c’ for covars, and ‘w’ for GMM mixing weights. Defaults to all parameters.
init_params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, ‘c’ for covars, and ‘w’ for GMM mixing weights. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.
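The difference between the “log” and “scaling” options can be illustrated with the forward pass alone. The plain-Python sketch below computes the same log-likelihood both ways, assuming strictly positive probabilities; it is a conceptual illustration with hypothetical function names, not hmmlearn's implementation.

```python
import math


def logsumexp(values):
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log(startprob, transmat, frame_lik):
    """Forward pass carried out entirely in log space."""
    n = len(startprob)
    log_alpha = [math.log(startprob[j]) + math.log(frame_lik[0][j]) for j in range(n)]
    for frame in frame_lik[1:]:
        log_alpha = [
            logsumexp([log_alpha[i] + math.log(transmat[i][j]) for i in range(n)])
            + math.log(frame[j])
            for j in range(n)
        ]
    return logsumexp(log_alpha)


def forward_scaling(startprob, transmat, frame_lik):
    """Forward pass that renormalizes alpha at each step and sums the log scales."""
    n = len(startprob)
    alpha = [startprob[j] * frame_lik[0][j] for j in range(n)]
    log_prob = 0.0
    for t in range(len(frame_lik)):
        if t > 0:
            alpha = [
                sum(alpha[i] * transmat[i][j] for i in range(n)) * frame_lik[t][j]
                for j in range(n)
            ]
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


startprob = [0.6, 0.4]
transmat = [[0.9, 0.1], [0.1, 0.9]]
frame_lik = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]  # made-up values

log_prob_log = forward_log(startprob, transmat, frame_lik)
log_prob_scaling = forward_scaling(startprob, transmat, frame_lik)
print(log_prob_log, log_prob_scaling)
```

Both variants avoid the numerical underflow of multiplying many small probabilities; they should agree up to floating-point error.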

aic(X, lengths=None)#

Akaike information criterion for the current model on the input X.

AIC = -2*logLike + 2 * num_free_params

https://en.wikipedia.org/wiki/Akaike_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

aic (float) – The lower the better.

bic(X, lengths=None)#

Bayesian information criterion for the current model on the input X.

BIC = -2*logLike + num_free_params * log(num_of_data)

https://en.wikipedia.org/wiki/Bayesian_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

bic (float) – The lower the better.

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- ”viterbi”: finds the most likely sequence of states, given all emissions.
- ”map” (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder algorithm the model was constructed with (the algorithm parameter) is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:: routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_stationary_distribution()#: Compute the stationary distribution of states.

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → GMMHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → GMMHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → GMMHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → GMMHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

MultinomialHMM#

class hmmlearn.hmm.MultinomialHMM(n_components=1, n_trials=None, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste', implementation='log')#

Hidden Markov Model with multinomial emissions.

Variables:

n_features (int) – Number of possible symbols emitted by the model (in the samples).
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
emissionprob_ (array, shape (n_components, n_features)) – Probability of emitting a given symbol when in each state.
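
As a rough sketch of the input layout these shapes imply, assuming (as the n_trials parameter suggests) that each row of X holds per-symbol counts that sum to n_trials, the plain-Python helper below groups a flat list of symbol draws into count rows with n_features columns. The name to_count_rows is hypothetical and not part of hmmlearn.

```python
def to_count_rows(draws, n_features, n_trials):
    """Group a flat list of symbol indices into per-sample count vectors.

    Hypothetical helper for illustration; not part of hmmlearn.
    Every consecutive block of n_trials draws becomes one row of counts.
    """
    if len(draws) % n_trials != 0:
        raise ValueError("number of draws must be a multiple of n_trials")
    rows = []
    for start in range(0, len(draws), n_trials):
        counts = [0] * n_features
        for symbol in draws[start:start + n_trials]:
            counts[symbol] += 1
        rows.append(counts)
    return rows


# Six draws over 3 symbols, grouped into samples of n_trials=3 draws each.
X = to_count_rows([0, 2, 2, 1, 1, 1], n_features=3, n_trials=3)
```

Each row of the result has n_features entries and sums to n_trials, which is the property the n_trials parameter refers to.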

Examples

>>> from hmmlearn.hmm import MultinomialHMM

__init__(n_components=1, n_trials=None, startprob_prior=1.0, transmat_prior=1.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste', implementation='log')#

Parameters:

n_components (int) – Number of states.
n_trials (int or array of int) – Number of trials (when sampling, all samples must have the same n_trials).
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
init_params (string, optional) – The parameters that get initialized before training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.
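
To illustrate the difference between the two implementation options, here is a minimal plain-Python sketch of the forward pass done once in log space and once with per-step rescaling. It is not hmmlearn's code; the function names and the toy frameprob table (the probability of each observation under each state) are assumptions made for the example. Both variants should return the same log probability up to rounding.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log(startprob, transmat, frameprob):
    """Forward pass in log space; returns log P(observations)."""
    n = len(startprob)
    alpha = [math.log(startprob[i]) + math.log(frameprob[0][i])
             for i in range(n)]
    for t in range(1, len(frameprob)):
        alpha = [
            logsumexp([alpha[i] + math.log(transmat[i][j]) for i in range(n)])
            + math.log(frameprob[t][j])
            for j in range(n)
        ]
    return logsumexp(alpha)


def forward_scaling(startprob, transmat, frameprob):
    """Forward pass with per-step rescaling; returns log P(observations)."""
    n = len(startprob)
    alpha = [startprob[i] * frameprob[0][i] for i in range(n)]
    log_prob = 0.0
    for t in range(len(frameprob)):
        if t > 0:
            alpha = [
                sum(alpha[i] * transmat[i][j] for i in range(n))
                * frameprob[t][j]
                for j in range(n)
            ]
        # Rescale so alpha sums to one and accumulate the log of the scale.
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob


# Toy two-state model; frameprob[t][i] = P(observation at time t | state i).
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
frameprob = [[0.1, 0.6], [0.4, 0.3], [0.5, 0.1]]

log_version = forward_log(startprob, transmat, frameprob)
scaled_version = forward_scaling(startprob, transmat, frameprob)
```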

aic(X, lengths=None)#

Akaike information criterion for the current model on the input X.

AIC = -2*logLike + 2 * num_free_params

https://en.wikipedia.org/wiki/Akaike_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

aic (float) – The lower the better.
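
The formula above can be written out directly. The sketch below is a standalone illustration with made-up log-likelihoods and parameter counts, not a call into hmmlearn; it shows how the criterion might be used to pick between two candidate models.

```python
def aic(log_likelihood, num_free_params):
    """AIC = -2 * logLike + 2 * num_free_params (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * num_free_params


# Made-up log-likelihoods for two candidate models on the same data.
candidates = {
    "2 states": aic(log_likelihood=-120.0, num_free_params=7),
    "3 states": aic(log_likelihood=-118.5, num_free_params=14),
}
best = min(candidates, key=candidates.get)
```

Here the larger model fits slightly better, but its extra parameters cost more than the gain in log-likelihood, so the smaller model has the lower AIC.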

bic(X, lengths=None)#

Bayesian information criterion for the current model on the input X.

BIC = -2*logLike + num_free_params * log(num_of_data)

https://en.wikipedia.org/wiki/Bayesian_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

bic (float) – The lower the better.
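
As a small worked instance of the formula above, the following sketch evaluates BIC for the same hypothetical fit against two data sizes. The numbers are invented, and num_of_data is assumed here to be the number of samples.

```python
import math


def bic(log_likelihood, num_free_params, num_of_data):
    """BIC = -2 * logLike + num_free_params * log(num_of_data)."""
    return -2.0 * log_likelihood + num_free_params * math.log(num_of_data)


# Made-up values: the same fit scored against 100 and 10_000 samples.
small = bic(log_likelihood=-120.0, num_free_params=7, num_of_data=100)
large = bic(log_likelihood=-120.0, num_free_params=7, num_of_data=10_000)
```

Unlike the AIC penalty of 2 per parameter, the BIC penalty per parameter grows with log(num_of_data), so it exceeds the AIC penalty once there are more than about 7 data points.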

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder selected by the model's algorithm parameter is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.
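
For readers unfamiliar with the "viterbi" decoder, the following is a minimal plain-Python sketch of the dynamic program on a toy two-state model. It returns the same kind of pair as decode (a log probability and a state sequence), but it is an illustration under assumed inputs rather than hmmlearn's implementation; frameprob is a hypothetical table of per-state observation probabilities.

```python
import math


def safe_log(p):
    """log(p), with log(0) mapped to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def viterbi(startprob, transmat, frameprob):
    """Return (log_prob, state_sequence) of the single most likely path.

    frameprob[t][i] is P(observation at time t | state i).
    """
    n = len(startprob)
    delta = [safe_log(startprob[i]) + safe_log(frameprob[0][i])
             for i in range(n)]
    backpointers = []
    for t in range(1, len(frameprob)):
        prev, delta, pointers = delta, [], []
        for j in range(n):
            best_i = max(range(n),
                         key=lambda i: prev[i] + safe_log(transmat[i][j]))
            pointers.append(best_i)
            delta.append(prev[best_i] + safe_log(transmat[best_i][j])
                         + safe_log(frameprob[t][j]))
        backpointers.append(pointers)
    # Trace the best final state back to the start.
    state = max(range(n), key=lambda i: delta[i])
    log_prob = delta[state]
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return log_prob, path


startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
frameprob = [[0.1, 0.6], [0.4, 0.3], [0.5, 0.1]]

log_prob, path = viterbi(startprob, transmat, frameprob)
# For these toy numbers the best path is [1, 0, 0].
```

The "map" decoder instead picks the most probable state at each time step separately, so its sequence can differ from the Viterbi path and may even contain transitions with zero probability.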

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.
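
The lengths argument describes how several sequences are stacked row-wise into a single X. The sketch below is a hypothetical helper, not hmmlearn's internal routine, showing the bookkeeping this implies. It assumes that omitting lengths means X is treated as one sequence.

```python
def split_sequences(X, lengths=None):
    """Yield the individual sequences stored back to back in X."""
    if lengths is None:
        # No lengths given: treat X as one single sequence.
        yield X
        return
    if sum(lengths) != len(X):
        raise ValueError("lengths must sum to n_samples")
    start = 0
    for length in lengths:
        yield X[start:start + length]
        start += length


# Two sequences of 3 and 2 samples, concatenated row-wise into one X.
X = [[0], [1], [1], [2], [0]]
sequences = list(split_sequences(X, lengths=[3, 2]))
```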

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:: routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_stationary_distribution()#: Compute the stationary distribution of states.
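
A stationary distribution is a row vector pi that satisfies pi = pi · transmat. The sketch below approximates it by repeated multiplication (power iteration). This is a technique chosen here for illustration, assuming an irreducible, aperiodic chain, and it is not necessarily how the library computes the result.

```python
def stationary_distribution(transmat, n_steps=1000):
    """Approximate pi with pi = pi @ transmat by repeated multiplication."""
    n = len(transmat)
    pi = [1.0 / n] * n
    for _ in range(n_steps):
        pi = [sum(pi[i] * transmat[i][j] for i in range(n)) for j in range(n)]
    return pi


transmat = [[0.7, 0.3],
            [0.4, 0.6]]
pi = stationary_distribution(transmat)
# For this two-state chain the exact answer is (4/7, 3/7).
```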

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.
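
The posteriors are the per-sample state probabilities given the whole observation sequence, which the forward-backward algorithm produces. Below is a minimal plain-Python sketch of that computation on a toy model. It is an illustration under assumed inputs rather than hmmlearn's code, and frameprob is a hypothetical table of per-state observation probabilities.

```python
def posteriors(startprob, transmat, frameprob):
    """Return gamma[t][i] = P(state i at time t | all observations)."""
    n, T = len(startprob), len(frameprob)

    def normalize(values):
        total = sum(values)
        return [v / total for v in values]

    # Forward pass, rescaled at every step to avoid underflow.
    alpha = [normalize([startprob[i] * frameprob[0][i] for i in range(n)])]
    for t in range(1, T):
        alpha.append(normalize([
            sum(alpha[-1][i] * transmat[i][j] for i in range(n))
            * frameprob[t][j]
            for j in range(n)
        ]))

    # Backward pass, also rescaled; beta[0] always holds the latest step.
    beta = [[1.0] * n]
    for t in range(T - 1, 0, -1):
        beta.insert(0, normalize([
            sum(transmat[i][j] * frameprob[t][j] * beta[0][j]
                for j in range(n))
            for i in range(n)
        ]))

    # Combine and renormalize so each row sums to one.
    return [normalize([a * b for a, b in zip(alpha[t], beta[t])])
            for t in range(T)]


startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
frameprob = [[0.1, 0.6], [0.4, 0.3], [0.5, 0.1]]

gamma = posteriors(startprob, transmat, frameprob)
# Picking the most probable state per row gives the "map" decoding.
map_path = [max(range(len(row)), key=row.__getitem__) for row in gamma]
```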

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
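
The two calls above resume sampling from the last state of the previous chunk. The plain-Python sketch below mimics that idea for a toy model with discrete emissions. The helper names and the probability tables are assumptions for illustration, and the first generated state is taken to equal currstate, following the parameter description.

```python
import random


def draw(probabilities, rng):
    """Draw an index according to the given discrete distribution."""
    return rng.choices(range(len(probabilities)), weights=probabilities)[0]


def sample_chain(startprob, transmat, emissionprob, n_samples, rng,
                 currstate=None):
    """Draw (symbols, states) from a toy HMM with discrete emissions."""
    # Without currstate, the initial state is drawn from startprob.
    state = draw(startprob, rng) if currstate is None else currstate
    symbols, states = [], []
    for _ in range(n_samples):
        states.append(state)
        symbols.append(draw(emissionprob[state], rng))
        state = draw(transmat[state], rng)
    return symbols, states


rng = random.Random(0)
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.9, 0.1], [0.2, 0.8]]

# Generate in two chunks, resuming the second from the end of the first.
_, Z1 = sample_chain(startprob, transmat, emissionprob, 10, rng)
X2, Z2 = sample_chain(startprob, transmat, emissionprob, 10, rng,
                      currstate=Z1[-1])
```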

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → MultinomialHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → MultinomialHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → MultinomialHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → MultinomialHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

CategoricalHMM#

class hmmlearn.hmm.CategoricalHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0, *, emissionprob_prior=1.0, n_features=None, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste', implementation='log')#

Hidden Markov Model with categorical (discrete) emissions.

Variables:

n_features (int) – Number of possible symbols emitted by the model (in the samples).
monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
emissionprob_ (array, shape (n_components, n_features)) – Probability of emitting a given symbol when in each state.

Examples

>>> from hmmlearn.hmm import CategoricalHMM
>>> CategoricalHMM(n_components=2)  
CategoricalHMM(algorithm='viterbi',...

__init__(n_components=1, startprob_prior=1.0, transmat_prior=1.0, *, emissionprob_prior=1.0, n_features=None, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='ste', init_params='ste', implementation='log')#

Parameters:

n_components (int) – Number of states.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
emissionprob_prior (array, shape (n_components, n_features), optional) – Parameters of the Dirichlet prior distribution for emissionprob_.
n_features (int, optional) – The number of categorical symbols in the HMM. Will be inferred from the data if not set.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
init_params (string, optional) – The parameters that get initialized before training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.

aic(X, lengths=None)#

Akaike information criterion for the current model on the input X.

AIC = -2*logLike + 2 * num_free_params

https://en.wikipedia.org/wiki/Akaike_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

aic (float) – The lower the better.

bic(X, lengths=None)#

Bayesian information criterion for the current model on the input X.

BIC = -2*logLike + num_free_params * log(num_of_data)

https://en.wikipedia.org/wiki/Bayesian_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

bic (float) – The lower the better.

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder selected by the model's algorithm parameter is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.

Notes

Unlike other HMM classes, CategoricalHMM X arrays have shape (n_samples, 1) (instead of (n_samples, n_features)). Consider using sklearn.preprocessing.LabelEncoder to transform your input to the right format.
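
To make the expected layout concrete, the sketch below maps arbitrary labels to integer codes and arranges them as a column of shape (n_samples, 1). It is a plain-Python stand-in for the LabelEncoder-plus-reshape step suggested above; the function name is hypothetical, and labels are numbered in sorted order.

```python
def encode_column(symbols):
    """Map arbitrary labels to integers 0..K-1 and return a column layout.

    Plain-Python stand-in for an encoder followed by a reshape to
    (n_samples, 1); labels are numbered in sorted order.
    """
    classes = sorted(set(symbols))
    index = {label: i for i, label in enumerate(classes)}
    X = [[index[label]] for label in symbols]
    return X, classes


X, classes = encode_column(["rain", "sun", "sun", "cloud", "rain"])
```

The returned classes list lets you translate integer codes back to the original labels after decoding.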

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:: routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_stationary_distribution()#: Compute the stationary distribution of states.

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, 1)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → CategoricalHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns:: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → CategoricalHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns:: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → CategoricalHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns:: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → CategoricalHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns:: self (object) – The updated object.

PoissonHMM#

class hmmlearn.hmm.PoissonHMM(n_components=1, startprob_prior=1.0, transmat_prior=1.0, lambdas_prior=0.0, lambdas_weight=0.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stl', init_params='stl', implementation='log')#

Hidden Markov Model with Poisson emissions.

Variables:

monitor_ (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_ (array, shape (n_components, )) – Initial state occupation distribution.
transmat_ (array, shape (n_components, n_components)) – Matrix of transition probabilities between states.
lambdas_ (array, shape (n_components, n_features)) – The expectation value of the waiting time parameters for each feature in a given state.
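
As a sketch of how a table of per-state, per-feature rates turns into emission probabilities, the code below evaluates the Poisson log-probability of one sample under each state. It assumes that features are independent given the state, so the per-feature log-probabilities add up. The function name and the rate values are hypothetical.

```python
import math


def poisson_log_prob(x, lambdas):
    """Log P(x | state) for each state, with independent Poisson features.

    x is one sample of non-negative integer counts, one per feature;
    lambdas[i][k] is the rate of feature k in state i.
    """
    return [
        sum(count * math.log(lam) - lam - math.lgamma(count + 1)
            for count, lam in zip(x, state_lambdas))
        for state_lambdas in lambdas
    ]


lambdas = [[1.0, 4.0],   # state 0: low rates
           [6.0, 9.0]]   # state 1: high rates
log_probs = poisson_log_prob([5, 10], lambdas)
```

For the sample [5, 10], the high-rate state explains the counts better and therefore gets the larger log-probability.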

__init__(n_components=1, startprob_prior=1.0, transmat_prior=1.0, lambdas_prior=0.0, lambdas_weight=0.0, algorithm='viterbi', random_state=None, n_iter=10, tol=0.01, verbose=False, params='stl', init_params='stl', implementation='log')#

Parameters:

n_components (int) – Number of states.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
lambdas_prior (array, shape (n_components,), optional) – The alpha parameter of the gamma prior on the lambda values, using alpha-beta notation. If None, will be set based on the method of moments.
lambdas_weight (array, shape (n_components,), optional) – The beta parameter of the gamma prior on the lambda values, using alpha-beta notation. If None, will be set based on the method of moments.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘l’ for lambdas. Defaults to all parameters.
init_params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘l’ for lambdas. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.
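
As a rough illustration of what a gamma prior in alpha-beta notation means for a Poisson rate, the sketch below applies the standard Gamma-Poisson conjugate update. It assumes alpha is the shape and beta the rate of the gamma distribution; the function name and the numbers are hypothetical, and this is not hmmlearn's own update code (in an HMM, each count would additionally be weighted by its state posterior).

```python
def gamma_poisson_posterior(alpha, beta, counts):
    """Return the (alpha, beta) of the gamma posterior over a Poisson rate.

    Assumes a Gamma(shape=alpha, rate=beta) prior and i.i.d. Poisson counts.
    """
    return alpha + sum(counts), beta + len(counts)


# Hypothetical prior and observed counts for a single state.
alpha_post, beta_post = gamma_poisson_posterior(alpha=2.0, beta=1.0, counts=[3, 5, 4])

# The posterior mean of the rate is shape / rate.
posterior_mean = alpha_post / beta_post
```

A larger beta makes the prior behave like more pseudo-observations, so the data moves the estimated rate less.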

aic(X, lengths=None)#

Akaike information criterion for the current model on the input X.

AIC = -2*logLike + 2 * num_free_params

https://en.wikipedia.org/wiki/Akaike_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

aic (float) – The lower the better.

bic(X, lengths=None)#

Bayesian information criterion for the current model on the input X.

BIC = -2*logLike + num_free_params * log(num_of_data)

https://en.wikipedia.org/wiki/Bayesian_information_criterion

Parameters:

X (array of shape (n_samples, n_dimensions)) – The input samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

bic (float) – The lower the better.
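
The AIC and BIC formulas quoted above can be evaluated by hand once a log-likelihood and a parameter count are known. The sketch below does so with hypothetical numbers; it takes num_of_data to be the number of samples, which is an assumption, and the helper name information_criteria is not part of hmmlearn.

```python
import math


def information_criteria(log_likelihood, n_free_params, n_samples):
    """Return (AIC, BIC) computed from the formulas quoted in this reference."""
    aic = -2.0 * log_likelihood + 2.0 * n_free_params
    bic = -2.0 * log_likelihood + n_free_params * math.log(n_samples)
    return aic, bic


# Hypothetical log-likelihoods for two candidate models on the same 500 samples.
candidates = {
    "2 states": information_criteria(-1250.0, n_free_params=5, n_samples=500),
    "3 states": information_criteria(-1245.0, n_free_params=11, n_samples=500),
}

# Lower is better for both criteria.
best_by_aic = min(candidates, key=lambda name: candidates[name][0])
best_by_bic = min(candidates, key=lambda name: candidates[name][1])
```

With these numbers the larger model improves the log-likelihood slightly, but not enough to offset its extra parameters under either criterion.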

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder algorithm set on the model (the algorithm parameter) is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.
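
The difference between the two decoder algorithms can be seen on a toy model. The following is a minimal pure-Python sketch, not hmmlearn's implementation: the two-state model, its probabilities, and all function names are hypothetical. The "viterbi" path maximizes the joint probability of the whole state sequence, while the "map" path picks the most probable state at each sample independently.

```python
import math

# Hypothetical 2-state model with 2 observable symbols.
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1, 0]
n_states = len(startprob)


def viterbi(obs):
    """Return (log probability, path) of the most likely joint state sequence."""
    delta = [math.log(startprob[i]) + math.log(emissionprob[i][obs[0]])
             for i in range(n_states)]
    backptr = []
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n_states):
            best = max(range(n_states),
                       key=lambda i: delta[i] + math.log(transmat[i][j]))
            step.append(best)
            new_delta.append(delta[best] + math.log(transmat[best][j])
                             + math.log(emissionprob[j][o]))
        backptr.append(step)
        delta = new_delta
    state = max(range(n_states), key=lambda i: delta[i])
    log_prob, path = delta[state], [state]
    for step in reversed(backptr):
        state = step[state]
        path.append(state)
    return log_prob, path[::-1]


def posteriors(obs):
    """Per-sample state posteriors from unscaled forward-backward recursions."""
    T = len(obs)
    alpha = [[startprob[i] * emissionprob[i][obs[0]] for i in range(n_states)]]
    for t in range(1, T):
        alpha.append([
            sum(alpha[t - 1][i] * transmat[i][j] for i in range(n_states))
            * emissionprob[j][obs[t]]
            for j in range(n_states)])
    beta = [[1.0] * n_states for _ in range(T)]
    for t in range(T - 2, -1, -1):
        beta[t] = [
            sum(transmat[i][j] * emissionprob[j][obs[t + 1]] * beta[t + 1][j]
                for j in range(n_states))
            for i in range(n_states)]
    gamma = []
    for t in range(T):
        weights = [alpha[t][i] * beta[t][i] for i in range(n_states)]
        total = sum(weights)
        gamma.append([w / total for w in weights])
    return gamma


def joint_log_prob(path, obs):
    """Log probability of a given state path together with the observations."""
    lp = math.log(startprob[path[0]] * emissionprob[path[0]][obs[0]])
    for prev, cur, o in zip(path, path[1:], obs[1:]):
        lp += math.log(transmat[prev][cur] * emissionprob[cur][o])
    return lp


viterbi_log_prob, viterbi_path = viterbi(obs)
gamma = posteriors(obs)
map_path = [max(range(n_states), key=lambda i: g[i]) for g in gamma]
```

By construction, no path can have a higher joint probability than the Viterbi path, so joint_log_prob(map_path, obs) never exceeds viterbi_log_prob; the two paths may or may not coincide for a given input.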

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.
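
To make the role of lengths concrete, the sketch below shows how several sequences are concatenated into one X and how lengths recovers the boundaries. It uses plain lists and a hypothetical helper, split_by_lengths, written for illustration only; it is not a function of hmmlearn.

```python
def split_by_lengths(X, lengths):
    """Split concatenated samples X into the sequences described by lengths."""
    if sum(lengths) != len(X):
        raise ValueError("lengths must sum to n_samples")
    sequences, start = [], 0
    for n in lengths:
        sequences.append(X[start:start + n])
        start += n
    return sequences


# Two hypothetical observation sequences with one feature per sample.
seq_a = [[3], [1], [4]]
seq_b = [[1], [5]]

# X stacks all samples; lengths records where each sequence ends.
X = seq_a + seq_b
lengths = [len(seq_a), len(seq_b)]

recovered = split_by_lengths(X, lengths)
```

Here X has five samples and lengths is [3, 2], so the sum of lengths equals n_samples as required.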

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

get_stationary_distribution()#: Compute the stationary distribution of states.
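
A stationary distribution pi of a transition matrix satisfies pi = pi · transmat. The sketch below approximates it by power iteration on a hypothetical 2-state matrix; power iteration is used here only because it is short, and it is not necessarily how the library computes the result.

```python
def stationary_distribution(transmat, n_steps=1000):
    """Approximate the stationary distribution by repeated multiplication."""
    n = len(transmat)
    pi = [1.0 / n] * n
    for _ in range(n_steps):
        # One step of pi <- pi @ transmat, written out for nested lists.
        pi = [sum(pi[i] * transmat[i][j] for i in range(n)) for j in range(n)]
    return pi


# Hypothetical transition matrix; each row sums to 1.
transmat = [[0.9, 0.1],
            [0.5, 0.5]]
pi = stationary_distribution(transmat)
```

For this matrix the balance condition 0.1 · pi[0] = 0.5 · pi[1] gives pi = (5/6, 1/6), which the iteration reproduces.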

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → PoissonHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.

Returns:

self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → PoissonHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.

Returns:

self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → PoissonHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.

Returns:

self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → PoissonHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.

Returns:

self (object) – The updated object.

hmmlearn.vhmm#

VariationalCategoricalHMM#

class hmmlearn.vhmm.VariationalCategoricalHMM(n_components=1, startprob_prior=None, transmat_prior=None, emissionprob_prior=None, n_features=None, algorithm='viterbi', random_state=None, n_iter=100, tol=1e-06, verbose=False, params='ste', init_params='ste', implementation='log')#

Hidden Markov Model with categorical (discrete) emissions trained using Variational Inference.

References

https://cse.buffalo.edu/faculty/mbeal/thesis/

Variables:

n_features (int) – Number of possible symbols emitted by the model (in the samples).
monitor (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_prior (array, shape (n_components, )) – Prior for the initial state occupation distribution.
startprob_posterior (array, shape (n_components, )) – Posterior estimate of the state occupation distribution.
transmat_prior (array, shape (n_components, n_components)) – Prior for the matrix of transition probabilities between states.
transmat_posterior (array, shape (n_components, n_components)) – Posterior estimate of the transition probabilities between states.
emissionprob_prior (array, shape (n_components, n_features)) – Prior estimate of emitting a given symbol when in each state.
emissionprob_posterior (array, shape (n_components, n_features)) – Posterior estimate of emitting a given symbol when in each state.

Examples

>>> from hmmlearn.vhmm import VariationalCategoricalHMM
>>> VariationalCategoricalHMM(n_components=2)  
VariationalCategoricalHMM(algorithm='viterbi',...

__init__(n_components=1, startprob_prior=None, transmat_prior=None, emissionprob_prior=None, n_features=None, algorithm='viterbi', random_state=None, n_iter=100, tol=1e-06, verbose=False, params='ste', init_params='ste', implementation='log')#

Parameters:

n_components (int) – Number of states.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
emissionprob_prior (array, shape (n_components, n_features), optional) – Parameters of the Dirichlet prior distribution for emissionprob_.
n_features (int, optional) – The number of categorical symbols in the HMM. Will be inferred from the data if not set.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
init_params (string, optional) – The parameters that get initialized before the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, and ‘e’ for emissionprob. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.
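
The two implementation choices address the same numerical problem: on long sequences, plain probabilities underflow to zero. The sketch below is a pure-Python illustration on a hypothetical 2-state model, not hmmlearn's code. It compares a naive forward pass with a log-space pass and a rescaled pass, which should agree on the log-likelihood up to rounding.

```python
import math

# Hypothetical 2-state model with 2 observable symbols.
startprob = [0.6, 0.4]
transmat = [[0.7, 0.3], [0.4, 0.6]]
emissionprob = [[0.9, 0.1], [0.2, 0.8]]
n_states = len(startprob)


def forward_naive(obs):
    """Likelihood of obs with plain probabilities (underflows when obs is long)."""
    alpha = [startprob[i] * emissionprob[i][obs[0]] for i in range(n_states)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * transmat[i][j] for i in range(n_states))
                 * emissionprob[j][o] for j in range(n_states)]
    return sum(alpha)


def forward_scaling(obs):
    """Log-likelihood of obs, renormalizing alpha at every step."""
    log_prob = 0.0
    alpha = [startprob[i] * emissionprob[i][obs[0]] for i in range(n_states)]
    for t in range(len(obs)):
        if t > 0:
            alpha = [sum(alpha[i] * transmat[i][j] for i in range(n_states))
                     * emissionprob[j][obs[t]] for j in range(n_states)]
        scale = sum(alpha)
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)  # accumulate the log of each scale factor
    return log_prob


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log(obs):
    """Log-likelihood of obs, working entirely in log space."""
    log_alpha = [math.log(startprob[i]) + math.log(emissionprob[i][obs[0]])
                 for i in range(n_states)]
    for o in obs[1:]:
        log_alpha = [
            logsumexp([log_alpha[i] + math.log(transmat[i][j])
                       for i in range(n_states)]) + math.log(emissionprob[j][o])
            for j in range(n_states)]
    return logsumexp(log_alpha)


short_obs = [0, 1, 1, 0]
long_obs = short_obs * 2500  # 10,000 samples

naive_long = forward_naive(long_obs)      # underflows to 0.0
scaled_long = forward_scaling(long_obs)   # finite log-likelihood
logged_long = forward_log(long_obs)       # finite log-likelihood
```

On the short sequence all three approaches agree; on the long one only the scaled and log-space versions return a usable value.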

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder algorithm set on the model (the algorithm parameter) is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, 1)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])
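
The chaining pattern above relies on currstate being used as the initial state of the next batch. The following standard-library sketch mimics that behaviour for a categorical-emission model; the model values and the sample function are hypothetical stand-ins written for illustration, not hmmlearn's implementation.

```python
import random

# Hypothetical 2-state model emitting one of 3 symbols.
startprob = [0.5, 0.5]
transmat = [[0.9, 0.1], [0.2, 0.8]]
emissionprob = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]


def sample(n_samples, rng, currstate=None):
    """Draw n_samples (symbol, state) pairs; start from currstate if given."""
    states = list(range(len(startprob)))
    symbols = list(range(len(emissionprob[0])))
    if currstate is None:
        currstate = rng.choices(states, weights=startprob)[0]
    X, Z = [], []
    for _ in range(n_samples):
        Z.append(currstate)
        # Emit a symbol from the current state; X has shape (n_samples, 1).
        X.append([rng.choices(symbols, weights=emissionprob[currstate])[0]])
        # Move to the next state according to the transition row.
        currstate = rng.choices(states, weights=transmat[currstate])[0]
    return X, Z


rng = random.Random(0)
X1, Z1 = sample(10, rng)
# Continue the chain: the second batch starts in the last state of the first.
X2, Z2 = sample(10, rng, currstate=Z1[-1])
```

Because the second call starts from Z1[-1] rather than drawing from startprob, the two batches behave like one continuous trajectory.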

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, 1)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalCategoricalHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.

Returns:

self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalCategoricalHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.

Returns:

self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalCategoricalHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.

Returns:

self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalCategoricalHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.

Returns:

self (object) – The updated object.

VariationalGaussianHMM#

class hmmlearn.vhmm.VariationalGaussianHMM(n_components=1, covariance_type='full', startprob_prior=None, transmat_prior=None, means_prior=None, beta_prior=None, dof_prior=None, scale_prior=None, algorithm='viterbi', random_state=None, n_iter=100, tol=1e-06, verbose=False, params='stmc', init_params='stmc', implementation='log')#

Hidden Markov Model with Multivariate Gaussian Emissions trained using Variational Inference.

Variables:

n_features (int) – Dimensionality of the Gaussian emissions.
monitor (ConvergenceMonitor) – Monitor object used to check the convergence of EM.
startprob_prior (array, shape (n_components, )) – Prior for the initial state occupation distribution.
startprob_posterior (array, shape (n_components, )) – Posterior estimate of the state occupation distribution.
transmat_prior (array, shape (n_components, n_components)) – Prior for the matrix of transition probabilities between states.
transmat_posterior (array, shape (n_components, n_components)) – Posterior estimate of the transition probabilities between states.
means_prior (array, shape (n_components, n_features)) – Prior estimates for the mean of each state.
means_posterior (array, shape (n_components, n_features)) – Posterior estimates for the mean of each state.
beta_prior (array, shape (n_components, )) – Prior estimate on the scale of the variance over the means.
beta_posterior (array, shape (n_components, )) – Posterior estimate of the scale of the variance over the means.
covars (array) –
Covariance parameters for each state.

The shape depends on covariance_type:
- (n_components, ) if “spherical”,
- (n_components, n_features) if “diag”,
- (n_components, n_features, n_features) if “full”,
- (n_features, n_features) if “tied”.
dof_prior (int / array) –
The prior on the Degrees Of Freedom for each state’s Wishart distribution. The type depends on covariance_type:
- array, shape (n_components, ) if “full”,
- int if “tied”.
dof_posterior (int / array) –
The Degrees Of Freedom for each state’s Wishart distribution. The type depends on covariance_type:
- array, shape (n_components, ) if “full”,
- int if “tied”.
scale_prior (array) –
Prior for the Inverse scale parameter for each state’s Wishart distribution. The Wishart distribution is the conjugate prior for the covariance.

The shape depends on covariance_type:
- (n_components, ) if “spherical”,
- (n_components, n_features) if “diag”,
- (n_components, n_features, n_features) if “full”,
- (n_features, n_features) if “tied”.
scale_posterior (array) –
Inverse scale parameter for each state’s Wishart distribution. The Wishart distribution is the conjugate prior for the covariance.

The shape depends on covariance_type:
- (n_components, ) if “spherical”,
- (n_components, n_features) if “diag”,
- (n_components, n_features, n_features) if “full”,
- (n_features, n_features) if “tied”.

Examples

>>> from hmmlearn.vhmm import VariationalGaussianHMM
>>> VariationalGaussianHMM(n_components=2)  
VariationalGaussianHMM(algorithm='viterbi',...

__init__(n_components=1, covariance_type='full', startprob_prior=None, transmat_prior=None, means_prior=None, beta_prior=None, dof_prior=None, scale_prior=None, algorithm='viterbi', random_state=None, n_iter=100, tol=1e-06, verbose=False, params='stmc', init_params='stmc', implementation='log')#

Parameters:

n_components (int) – Number of states.
covariance_type ({"spherical", "diag", "full", "tied"}, optional) –
The type of covariance parameters to use:
- "spherical": each state uses a single variance value that applies to all features.
- "diag": each state uses a diagonal covariance matrix.
- "full": each state uses a full (i.e. unrestricted) covariance matrix (default).
- "tied": all states use the same full covariance matrix.
startprob_prior (array, shape (n_components, ), optional) – Parameters of the Dirichlet prior distribution for startprob_.
transmat_prior (array, shape (n_components, n_components), optional) – Parameters of the Dirichlet prior distribution for each row of the transition probabilities transmat_.
means_prior (array, shape (n_components, ), optional) – Mean of the Normal prior distribution for means_.
beta_prior (array, shape (n_components, ), optional) – Precision of the Normal prior distribution for means_.
scale_prior (array, optional) –
Parameters of the prior distribution for the covariance matrix covars_.

If covariance_type is “spherical” or “diag” the prior is the inverse gamma distribution; otherwise it is the inverse Wishart distribution.

The shape of the scale_prior array depends on covariance_type:
- (n_components, ) if “spherical”,
- (n_components, n_features) if “diag”,
- (n_components, n_features, n_features) if “full”,
- (n_features, n_features) if “tied”.
dof_prior (array, optional) –
Degrees-of-freedom parameter of the prior distribution for the covariance matrix covars_, used together with scale_prior.

If covariance_type is “spherical” or “diag” the prior is the inverse gamma distribution; otherwise it is the inverse Wishart distribution.

The type of dof_prior depends on covariance_type:
- array, shape (n_components, ) if “full”,
- int if “tied”.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
random_state (RandomState or an int seed, optional) – A random number generator instance.
n_iter (int, optional) – Maximum number of iterations to perform.
tol (float, optional) – Convergence threshold. EM will stop if the gain in log-likelihood is below this value.
verbose (bool, optional) – Whether per-iteration convergence reports are printed to sys.stderr. Convergence can also be diagnosed using the monitor_ attribute.
params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars. Defaults to all parameters.
init_params (string, optional) – The parameters that get updated during (params) or initialized before (init_params) the training. Can contain any combination of ‘s’ for startprob, ‘t’ for transmat, ‘m’ for means, and ‘c’ for covars. Defaults to all parameters.
implementation (string, optional) – Determines if the forward-backward algorithm is implemented with logarithms (“log”), or using scaling (“scaling”). The default is to use logarithms for backwards compatibility.

property covars_#: Return covars as a full matrix.
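
To show what returning covars as a full matrix involves for each covariance_type, the sketch below expands the compact shapes listed above into one full matrix per state. It uses nested lists and a hypothetical helper, to_full_covars, written for illustration; it is not the library's own routine.

```python
def to_full_covars(covars, covariance_type, n_components, n_features):
    """Expand covariance parameters into n_components full matrices."""

    def diagonal_matrix(diagonal):
        # Square matrix with the given diagonal and zeros elsewhere.
        return [[diagonal[i] if i == j else 0.0 for j in range(n_features)]
                for i in range(n_features)]

    if covariance_type == "full":
        return [[list(row) for row in matrix] for matrix in covars]
    if covariance_type == "tied":
        # One shared matrix, repeated for every state.
        return [[list(row) for row in covars] for _ in range(n_components)]
    if covariance_type == "diag":
        return [diagonal_matrix(diagonal) for diagonal in covars]
    if covariance_type == "spherical":
        # One variance per state, applied to every feature.
        return [diagonal_matrix([variance] * n_features) for variance in covars]
    raise ValueError(f"unknown covariance_type: {covariance_type!r}")


spherical = to_full_covars([1.0, 2.0], "spherical", n_components=2, n_features=2)
diag = to_full_covars([[1.0, 3.0], [2.0, 4.0]], "diag", n_components=2, n_features=2)
tied = to_full_covars([[1.0, 0.5], [0.5, 2.0]], "tied", n_components=2, n_features=2)
```

In every case the result has shape (n_components, n_features, n_features), so downstream code can treat all covariance types uniformly.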

decode(X, lengths=None, algorithm=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.
algorithm ({"viterbi", "map"}, optional) –
Decoder algorithm.
- "viterbi": finds the most likely sequence of states, given all emissions.
- "map" (also known as smoothing or forward-backward): finds the sequence of the individual most-likely states, given all emissions.
If not given, the decoder algorithm set on the model (the algorithm parameter) is used.

Returns:

log_prob (float) – Log probability of the produced state sequence.
state_sequence (array, shape (n_samples, )) – Labels for each sample from X obtained via a given decoder algorithm.

See also

score_samples: Compute the log probability under the model and posteriors.
score: Compute the log probability under the model.

fit(X, lengths=None)#

Estimate model parameters.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, )) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

self (object) – Returns self.

get_metadata_routing()#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns: routing (MetadataRequest) – A MetadataRequest encapsulating routing information.

property means_#: Compatibility property for _BaseGaussianHMM. Returns the mean of the approximating distribution, which for this model is means_posterior_.

predict(X, lengths=None)#

Find most likely state sequence corresponding to X.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

state_sequence (array, shape (n_samples, )) – Labels for each sample from X.

predict_proba(X, lengths=None)#

Compute the posterior probability for each state in the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample from X.

sample(n_samples=1, random_state=None, currstate=None)#

Generate random samples from the model.

Parameters:

n_samples (int) – Number of samples to generate.
random_state (RandomState or an int seed) – A random number generator instance. If None, the object’s random_state is used.
currstate (int) – Current state, as the initial state of the samples.

Returns:

X (array, shape (n_samples, n_features)) – Feature matrix.
state_sequence (array, shape (n_samples, )) – State sequence produced by the model.

Examples

# generate samples continuously
_, Z = model.sample(n_samples=10)
X, Z = model.sample(n_samples=10, currstate=Z[-1])

score(X, lengths=None)#

Compute the log probability under the model.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.

See also

score_samples: Compute the log probability under the model and posteriors.
decode: Find most likely state sequence corresponding to X.

score_samples(X, lengths=None)#

Compute the log probability under the model and compute posteriors.

Parameters:

X (array-like, shape (n_samples, n_features)) – Feature matrix of individual samples.
lengths (array-like of integers, shape (n_sequences, ), optional) – Lengths of the individual sequences in X. The sum of these should be n_samples.

Returns:

log_prob (float) – Log likelihood of X.
posteriors (array, shape (n_samples, n_components)) – State-membership probabilities for each sample in X.

See also

score: Compute the log probability under the model.
decode: Find most likely state sequence corresponding to X.

set_fit_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalGaussianHMM#

Request metadata passed to the fit method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to fit.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in fit.
Returns: self (object) – The updated object.

set_predict_proba_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalGaussianHMM#

Request metadata passed to the predict_proba method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict_proba if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict_proba.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict_proba.
Returns: self (object) – The updated object.

set_predict_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalGaussianHMM#

Request metadata passed to the predict method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to predict if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to predict.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in predict.
Returns: self (object) – The updated object.

set_score_request(*, lengths: bool | None | str = '$UNCHANGED$') → VariationalGaussianHMM#

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see User Guide on how the routing mechanism works.

The options for each parameter are:

True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
False: metadata is not requested and the meta-estimator will not pass it to score.
None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters: lengths (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for lengths parameter in score.
Returns: self (object) – The updated object.