HMM Module

class visualize_training.hmm.HMM(max_components, cov_type, n_seeds, n_iter, seeds=None)

HMM is one of the core modules of Visualization Training. It is a wrapper that bundles the functionality needed to train, process, and run inference with HMM models, including methods for data preparation, computing the average log-likelihood, and computing feature importances.

feature_importance(cols, data, best_predictions, phases, lengths, top_n=3)

Return the feature importance of all transitions for the best model.

Parameters:
  • cols (list) – list of columns of interest

  • data (list) – list of dataframes

  • best_predictions (list) – list of best predictions over different dataframes

Returns:

feature importance of all transitions and the average mean difference due to them

Return type:

transitions
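As an illustration of what a per-transition feature importance could look like, the sketch below ranks features by the absolute difference between their mean in the source state and their mean in the destination state of each observed transition. This is an assumption about the general idea, not the module's actual implementation. The function transition_importance and its arguments are hypothetical, and plain lists of dicts stand in for the dataframes.

```python
from collections import defaultdict
from statistics import mean


def transition_importance(rows, states, cols, top_n=3):
    """Rank features by how much their mean shifts across each state transition.

    rows   -- list of dicts, one per time step (hypothetical stand-in for a dataframe)
    states -- predicted hidden state for each row
    cols   -- feature names to consider
    """
    # Group the observed feature values by predicted state.
    by_state = defaultdict(lambda: defaultdict(list))
    for row, state in zip(rows, states):
        for col in cols:
            by_state[state][col].append(row[col])

    # Collect every distinct transition a -> b (a != b) in order of first appearance.
    transitions = []
    for a, b in zip(states, states[1:]):
        if a != b and (a, b) not in transitions:
            transitions.append((a, b))

    importance = {}
    for a, b in transitions:
        # Absolute shift of each feature's mean between the two states.
        diffs = {
            col: abs(mean(by_state[b][col]) - mean(by_state[a][col]))
            for col in cols
        }
        ranked = sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)
        importance[(a, b)] = ranked[:top_n]
    return importance


# Toy data (made up for illustration): loss drops sharply between state 0 and state 1.
rows = [
    {"loss": 2.0, "grad_norm": 1.0},
    {"loss": 1.8, "grad_norm": 1.2},
    {"loss": 0.5, "grad_norm": 1.1},
    {"loss": 0.3, "grad_norm": 1.3},
]
states = [0, 0, 1, 1]
result = transition_importance(rows, states, ["loss", "grad_norm"], top_n=1)
```

Here result maps each transition, written as a (from_state, to_state) tuple, to its top_n features paired with their mean differences.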

get_avg_log_likelihood(data_dir, cols, sort=True, sort_col='epoch', first_n=None, test_size=0.2, seed=0)

Wrapper function that reads the data, prepares it for model training, and trains models for every possible n_components value.

Parameters:
  • data_dir (str) – Path to data files.

  • cols (list) – List of columns to be returned.

  • sort (bool) – Whether to sort the rows based on sort_col or not. Defaults to True.

  • sort_col (str) – Name of the column to sort by. Defaults to "epoch".

  • first_n (int) – Number of rows to be returned for each data file. Defaults to None.

  • test_size (float, optional) – Size of test set as fraction of the total dataset. Defaults to 0.2.

  • seed (int, optional) – Random Seed. Defaults to 0.
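The effect of the data-preparation parameters above can be sketched in a few lines. This is a minimal illustration, not the module's code: prepare_rows and split_runs are hypothetical helpers, lists of dicts stand in for the data files, and holding out whole files for the test set is an assumption about how test_size is applied.

```python
import random


def prepare_rows(rows, cols, sort=True, sort_col="epoch", first_n=None):
    """Sort one file's rows, truncate them, and keep only the requested columns."""
    if sort:
        rows = sorted(rows, key=lambda r: r[sort_col])
    if first_n is not None:
        rows = rows[:first_n]
    return [[r[c] for c in cols] for r in rows]


def split_runs(runs, test_size=0.2, seed=0):
    """Hold out a fraction of the runs (one run per data file) for testing."""
    order = list(range(len(runs)))
    random.Random(seed).shuffle(order)  # seeded, so the split is reproducible
    n_test = max(1, round(len(runs) * test_size))
    test_idx = set(order[:n_test])
    train = [run for i, run in enumerate(runs) if i not in test_idx]
    test = [run for i, run in enumerate(runs) if i in test_idx]
    return train, test


# Toy rows (made up for illustration), deliberately out of epoch order.
rows = [
    {"epoch": 2, "loss": 0.5},
    {"epoch": 0, "loss": 2.0},
    {"epoch": 1, "loss": 1.0},
]
prepared = prepare_rows(rows, ["loss"], first_n=2)

# Five identical runs, just to show the split sizes.
runs = [prepare_rows(rows, ["loss"]) for _ in range(5)]
train, test = split_runs(runs, test_size=0.2, seed=0)
```

With sort=True the rows are ordered by sort_col before first_n truncates them, so first_n keeps the earliest epochs rather than the first rows in file order.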

Returns:

Dictionary containing:
  • best_scores (list): List of best scores for all the components

  • mean_scores (list): List of mean scores (average across all seeds) for all the components

  • scores_stdev (list): List of std dev (calculated across all seeds) for all the components

  • aics (list): List of mean AIC values (calculated across all seeds) for all the components

  • bics (list): List of mean BIC values (calculated across all seeds) for all the components

  • best_models (list): List of best models for all the components

  • best_model: Best model out of all the models

Return type:

dict
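The per-component statistics in the returned dictionary can be reproduced from raw per-seed scores as in the sketch below. It uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The summarize helper, the parameter counts, and the toy scores are hypothetical, and treating each score as a total log-likelihood is an assumption, since the module may report per-sample averages instead.

```python
import math
from statistics import mean, stdev


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


def summarize(scores_by_component, n_params_by_component, n_samples):
    """Aggregate per-seed scores for each n_components value.

    scores_by_component maps n_components -> list of log-likelihoods, one per seed
    (at least two seeds are needed for the standard deviation).
    """
    summary = {"best_scores": [], "mean_scores": [], "scores_stdev": [],
               "aics": [], "bics": []}
    for k in sorted(scores_by_component):
        scores = scores_by_component[k]
        p = n_params_by_component[k]
        summary["best_scores"].append(max(scores))
        summary["mean_scores"].append(mean(scores))
        summary["scores_stdev"].append(stdev(scores))
        summary["aics"].append(mean(aic(s, p) for s in scores))
        summary["bics"].append(mean(bic(s, p, n_samples) for s in scores))
    return summary


# Toy numbers (made up): two candidate sizes, two seeds each.
scores = {2: [-100.0, -110.0], 3: [-90.0, -95.0]}
n_params = {2: 10, 3: 18}  # hypothetical free-parameter counts
summary = summarize(scores, n_params, n_samples=100)
```

In this toy case the 3-component models have the lower mean AIC while the 2-component models have the lower mean BIC, which shows why the two criteria are reported separately: BIC penalizes extra parameters more heavily once the sample count is moderately large.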