{
  "cells": [
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "%matplotlib inline"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Using a Hidden Markov Model with Poisson Emissions to Understand Earthquakes\n\nLet's look at data of magnitude 7+ earthquakes between 1900-2006 in the\nworld collected by the US Geological Survey as described in this textbook:\nZucchini & MacDonald, \"Hidden Markov Models for Time Series\"\n(https://ayorho.files.wordpress.com/2011/05/chapter1.pdf). The goal is to\nsee if we can separate out different tectonic processes that cause\nearthquakes based on their frequency of occurance. The idea is that each\ntectonic boundary may cause earthquakes with a particular distribution\nof waiting times depending on how active it is. This might tell help us\npredict future earthquake danger, espeically on a geological time scale.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\nimport matplotlib.pyplot as plt\n\nfrom scipy.stats import poisson\nfrom hmmlearn import hmm\n\n# earthquake data from http://earthquake.usgs.gov/\nearthquakes = np.array([\n    13, 14, 8, 10, 16, 26, 32, 27, 18, 32, 36, 24, 22, 23, 22, 18,\n    25, 21, 21, 14, 8, 11, 14, 23, 18, 17, 19, 20, 22, 19, 13, 26,\n    13, 14, 22, 24, 21, 22, 26, 21, 23, 24, 27, 41, 31, 27, 35, 26,\n    28, 36, 39, 21, 17, 22, 17, 19, 15, 34, 10, 15, 22, 18, 15, 20,\n    15, 22, 19, 16, 30, 27, 29, 23, 20, 16, 21, 21, 25, 16, 18, 15,\n    18, 14, 10, 15, 8, 15, 6, 11, 8, 7, 18, 16, 13, 12, 13, 20,\n    15, 16, 12, 18, 15, 16, 13, 15, 16, 11, 11])\n\n\n# Plot the sampled data\nfig, ax = plt.subplots()\nax.plot(earthquakes, \".-\", ms=6, mfc=\"orange\", alpha=0.7)\nax.set_xticks(range(0, earthquakes.size, 10))\nax.set_xticklabels(range(1906, 2007, 10))\nax.set_xlabel('Year')\nax.set_ylabel('Count')\nfig.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Now, fit a Poisson Hidden Markov Model to the data.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "scores = list()\nmodels = list()\nfor n_components in range(1, 5):\n    for idx in range(10):  # ten different random starting states\n        # define our hidden Markov model\n        model = hmm.PoissonHMM(n_components=n_components, random_state=idx,\n                               n_iter=10)\n        model.fit(earthquakes[:, None])\n        models.append(model)\n        scores.append(model.score(earthquakes[:, None]))\n        print(f'Converged: {model.monitor_.converged}\\t\\t'\n              f'Score: {scores[-1]}')\n\n# get the best model\nmodel = models[np.argmax(scores)]\nprint(f'The best model had a score of {max(scores)} and '\n      f'{model.n_components} components')\n\n# use the Viterbi algorithm to predict the most likely sequence of states\n# given the model\nstates = model.predict(earthquakes[:, None])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Let's plot the waiting times from our most likely series of states of\nearthquake activity with the earthquake data. As we can see, the\nmodel with the maximum likelihood had different states which may reflect\ntimes of varying earthquake danger.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# plot model states over time\nfig, ax = plt.subplots()\nax.plot(model.lambdas_[states], \".-\", ms=6, mfc=\"orange\")\nax.plot(earthquakes)\nax.set_title('States compared to generated')\nax.set_xlabel('State')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Fortunately, 2006 ended with a period of relative tectonic stability, and,\nif we look at our transition matrix, we can see that the off-diagonal terms\nare small, meaning that the state transitions are rare and it's unlikely that\nthere will be high earthquake danger in the near future.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "fig, ax = plt.subplots()\nax.imshow(model.transmat_, aspect='auto', cmap='spring')\nax.set_title('Transition Matrix')\nax.set_xlabel('State To')\nax.set_ylabel('State From')"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Finally, let's look at the distribution of earthquakes compared to our\nwaiting time parameter values. We can see that our model fits the\ndistribution fairly well, replicating results from the reference.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# get probabilities for each state given the data, take the average\n# to find the proportion of time in that state\nprop_per_state = model.predict_proba(earthquakes[:, None]).mean(axis=0)\n\n# earthquake counts to plot\nbins = sorted(np.unique(earthquakes))\n\nfig, ax = plt.subplots()\nax.hist(earthquakes, bins=bins, density=True)\nax.plot(bins, np.dot(poisson.pmf(bins, model.lambdas_).T,\n                     prop_per_state[:, None]))\nax.set_title('Histogram of Earthquakes with Fitted Poisson States')\nax.set_xlabel('Number of Earthquakes')\nax.set_ylabel('Proportion')\n\nplt.show()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.9"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}