Warning

Loading data structures using pickle or dill is not safe against erroneous or maliciously constructed data. In order to avoid the possible injection of malicious data, we provide the md5sum for each available file.
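As a minimal sketch of how the listed digests could be checked before loading a file, the following uses only the Python standard library. The helper names `md5sum` and `verify_md5` are hypothetical and not part of any library mentioned on this page.

```python
import hashlib


def md5sum(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so that large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Raise ValueError unless the file's MD5 digest matches `expected`."""
    actual = md5sum(path)
    if actual != expected.strip().lower():
        raise ValueError(f"md5 mismatch for {path}: got {actual}")
    return True
```

For example, `verify_md5("DurbinData.csv", "a8a223904ded9d3f19d4a3c5946541ed")` would be called before reading the file, and a `.dill` file would be loaded only if the corresponding check passes.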

Stochastic volatility¶

We apply the sequential inference algorithm outlined here to the exchange rate of different assets.

We model the log-volatility $${\bf Z}_{\Lambda}$$ of the return of a financial asset at times $$\Lambda=\{0,1,\ldots,n\}$$ with the autoregressive process

$\begin{split}{\bf Z}_{k+1} = \mu + \phi ({\bf Z}_k - \mu) + \varepsilon_k \;, \qquad \varepsilon_k \sim \mathcal{N}(0,\sigma^2) \;, \quad \varepsilon_k {\perp\!\!\!\perp} {\bf Z}_k \;, \\ \qquad \left.{\bf Z}_0 \right\vert \mu,\sigma,\phi \sim \mathcal{N}\left(\mu, \frac{\sigma^2}{1-\phi^2}\right) \;, \qquad \mu \sim \mathcal{N}(0,1) \;, \\ \qquad \phi = 2 \frac{\exp(\phi^\star)}{1+\exp(\phi^\star)} - 1 \;, \qquad \phi^\star \sim \mathcal{N}(3,1) \;, \\ \sigma^2 \sim \text{InvGamma}(\alpha=1, \beta=0.1) \;.\end{split}$

The goal is to estimate the parameters $$\Theta = (\mu,\phi)$$ and the states $$\left\{ {\bf Z}_k \right\}$$, given the observations, available for $$k \in \Xi \subset \Lambda$$,

${\bf Y}_k = \xi_k \exp\left(\frac{1}{2}{\bf Z}_k\right) \;, \qquad \xi_k \sim \mathcal{N}(0,1) \;, \quad \xi_k {\perp\!\!\!\perp} {\bf Z}_k \;.$
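To make the generative model concrete, here is a minimal simulation sketch using only the Python standard library. The function names `sample_prior` and `simulate` are hypothetical. The sketch assumes that InvGamma is parametrized by shape $$\alpha$$ and scale $$\beta$$, and it generates an observation at every time step, i.e. it takes $$\Xi = \Lambda$$.

```python
import math
import random


def sample_prior(rng):
    """Draw (mu, phi, sigma) from the priors stated above."""
    mu = rng.gauss(0.0, 1.0)
    phi_star = rng.gauss(3.0, 1.0)
    # phi = 2 * logistic(phi_star) - 1 maps the real line into (-1, 1).
    phi = 2.0 / (1.0 + math.exp(-phi_star)) - 1.0
    # Assumption: InvGamma(alpha, beta) uses shape alpha and scale beta,
    # so 1 / sigma^2 ~ Gamma(shape=alpha, scale=1/beta).
    sigma2 = 1.0 / rng.gammavariate(1.0, 1.0 / 0.1)
    return mu, phi, math.sqrt(sigma2)


def simulate(n, mu, phi, sigma, rng):
    """Simulate Z_0..Z_n and Y_0..Y_n for fixed parameters."""
    # Stationary initial condition Z_0 ~ N(mu, sigma^2 / (1 - phi^2)).
    z = [rng.gauss(mu, sigma / math.sqrt(1.0 - phi**2))]
    for _ in range(n):
        # Autoregressive step Z_{k+1} = mu + phi (Z_k - mu) + eps_k.
        z.append(mu + phi * (z[-1] - mu) + rng.gauss(0.0, sigma))
    # Observation Y_k = xi_k * exp(Z_k / 2) with xi_k ~ N(0, 1).
    y = [rng.gauss(0.0, 1.0) * math.exp(0.5 * zk) for zk in z]
    return z, y
```

A call such as `simulate(944, *sample_prior(random.Random(0)), random.Random(1))` would produce one synthetic trajectory of 945 states and returns.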

Exchange rate GBP - USD¶

We consider here the exchange rate between the British Pound (GBP) and the US Dollar (USD). These data and results are part of the paper TM4.

State and parameters estimation 10/01/81 - 06/27/85¶

First we consider the problem of estimating the parameters $$\mu, \phi$$ and the states $${\bf Z}_{1:945}$$ of the stochastic volatility model from the 945 observations of the daily returns associated with the GBP-USD exchange rate from 10/01/81 to 06/27/85. We fix the noise parameter of the dynamics to $$\sigma=1/4$$ rather than inferring it. The same problem has also been analyzed in OR13 and OR14. We provide a number of files which can be used to reproduce the results in TM4.

• DurbinData.csv [md5sum: a8a223904ded9d3f19d4a3c5946541ed]: daily returns
• Distribution.dill [md5sum: ad8fd058693939a81207ea812fb44fca]: SequentialHiddenMarkovChainDistribution $$\pi\left( \left. \Theta, {\bf Z}_\Lambda \right\vert {\bf y}_\Xi \right) \propto \mathcal{L}\left({\bf y}_\Xi \left\vert \Theta, {\bf Z}_\Lambda\right.\right) \pi\left( \Theta, {\bf Z}_\Lambda \right)$$
• runner.sh [md5sum: 1471af8891c113e9851c996fbbab374b]: script used to construct the sequential map and obtain all the results. The script was run in parallel on 8 machines for a total of 128 cores.
• Sequential-map.dill [md5sum: eb7b4d90cd020a2dd237671e61a0f80e]: this contains the output of the script tmap-sequential-tm. It includes the base distribution $$\rho=\mathcal{N}(0,{\bf I})$$, the target distribution $$\pi\left( \left. \Theta, {\bf Z}_\Lambda \right\vert {\bf y}_\Xi \right) \propto \mathcal{L}\left({\bf y}_\Xi \left\vert \Theta, {\bf Z}_\Lambda\right.\right) \pi\left( \Theta, {\bf Z}_\Lambda \right)$$, the map $$T$$ such that $$T_\sharp \rho \approx \pi\left( \left. \Theta, {\bf Z}_\Lambda \right\vert {\bf y}_\Xi \right)$$, and the TransportMapSmoother used for the construction.
• Sequential-map-POST.dill [md5sum: 50c66da9e5b74792db931ac53459e906]: data structure used as output of the script tmap-sequential-postprocess.
• Sequential-map-POST.dill.hdf5 [md5sum: 1d0725ad889fe86f3e2f2c02fe7169b9]: dataset containing the output of tmap-sequential-postprocess. The data is structured as follows:
  • filtering: list of samples from the approximate filtering distributions $$\pi\left(\Theta, {\bf Z}_k \middle\vert {\bf y}_{1:k}\right)$$ for $$k\in\Lambda$$.
  • metropolis-independent-proposal-samples/skip-10: Markov chain of length $$10^5$$, obtained with MetropolisHastingsIndependentProposalsSampler and subsampled every 10 samples.
    • x: Markov chain with invariant distribution $$\pi\left( \Theta, {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$.
    • s: Markov chain with invariant distribution $$T^\sharp \pi\left( \Theta, {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$.
  • quadrature: Monte Carlo samples from $$T_\sharp\rho \approx \pi\left( \Theta, {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$.
  • vals_var_diag: values $$\{\log\rho({\bf x}_i)\}$$ and $$\{\log T^\sharp\pi({\bf x}_i)\}$$ used to compute the variance diagnostic $$\mathbb{V}\left[\log\frac{\rho}{T^\sharp\pi}\right]$$.
  • trim-%i: postprocessing of the approximation of the trimmed distribution $$\pi\left( \Theta, {\bf Z}_{\Lambda<i}\, \middle\vert {\bf y}_{\Xi<i}\, \right)$$.
    • metropolis-independent-proposal-samples/skip-10: Markov chain of length $$10^5$$, obtained with MetropolisHastingsIndependentProposalsSampler and subsampled every 10 samples.
    • vals_var_diag: values used to compute the variance diagnostic.
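The variance diagnostic $$\mathbb{V}\left[\log\frac{\rho}{T^\sharp\pi}\right]$$ can be estimated from the two lists of log-density values as sketched below. This is an illustration rather than the library's own routine: the function name `variance_diagnostic` is hypothetical, the points $${\bf x}_i$$ are assumed to be drawn from $$\rho$$, and no additional scaling factor is applied.

```python
import statistics


def variance_diagnostic(log_rho, log_pullback):
    """Sample variance of log(rho / T^sharp pi) over the points x_i."""
    # log(rho / T^sharp pi) evaluated pointwise as a difference of logs.
    diffs = [a - b for a, b in zip(log_rho, log_pullback)]
    return statistics.variance(diffs)
```

Since the variance is unchanged by adding a constant to every value, the diagnostic does not require the normalizing constant of $$\pi$$; it vanishes when $$T^\sharp\pi$$ is proportional to $$\rho$$ at the sampled points.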

In the following we report some of the results obtained. For a complete treatment we refer to TM4. The figures show:

• Mean and $$\{5,95\}$$ percentiles of the approximate filtering marginals $$\pi\left({\bf Z}_k \middle\vert {\bf y}_{1:k}\right)$$ (blue) along with one realization (black).
• Mean and $$\{5,95\}$$ percentiles of the approximate smoothing marginals $$[T_\sharp\rho]_k \approx \pi\left({\bf Z}_k \middle\vert {\bf y}_\Xi\right)$$ (red) along with one realization (black).
• Mean and $$\{5,95\}$$ percentiles of the approximate (red) and exact (black) smoothing marginals obtained with the map $$T$$ and Markov chain Monte Carlo, respectively.
• (xy)-axis: mean and $$\{5,25,40,60,75,95\}$$ percentiles of the approximate filtering marginal $$[T_\sharp\rho]_{\mu} \approx \pi\left(\mu\middle\vert {\bf y}_{1:k}\right)$$ of the hyper-parameter $$\mu$$. (xyz)-axis: for a subset of steps $$k$$, the density of the approximate (solid lines) and the exact (dashed lines) filtering marginal obtained with Markov chain Monte Carlo.
• (xy)-axis: mean and $$\{5,25,40,60,75,95\}$$ percentiles of the approximate filtering marginal $$[T_\sharp\rho]_{\phi} \approx \pi\left(\phi\middle\vert {\bf y}_{1:k}\right)$$ of the hyper-parameter $$\phi$$. (xyz)-axis: for a subset of steps $$k$$, the density of the approximate (solid lines) and the exact (dashed lines) filtering marginal obtained with Markov chain Monte Carlo.
• (shaded) $$\{5,25,40,60,75,95\}$$ percentiles of the posterior predictive (conditioned on all the data). (dots) data.
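The mean and $$\{5,95\}$$ percentile bands reported for the marginals can be computed from samples along the following lines. This is a sketch with the standard library only; `summarize` is a hypothetical helper, and the input is assumed to be one list of scalar samples of $${\bf Z}_k$$ per time step.

```python
import statistics


def summarize(samples_per_step):
    """Return (mean, 5th percentile, 95th percentile) for each time step."""
    summary = []
    for samples in samples_per_step:
        # quantiles(n=20) returns 19 cut points at 5%, 10%, ..., 95%.
        cuts = statistics.quantiles(samples, n=20)
        summary.append((statistics.fmean(samples), cuts[0], cuts[-1]))
    return summary
```

Plotting the three returned values against the step index $$k$$ gives a mean curve with a percentile band around it.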

Filtering and smoothing 10/01/1981 - 08/24/2017¶

Here we fix the hyper-parameters $$\mu,\phi$$ of the stochastic volatility model to the medians $$\mu=0.667$$ and $$\phi=0.879$$ found through the preceding analysis of the first 945 steps, and apply the algorithm for filtering and smoothing to an extended dataset of 9009 observations from 10/01/1981 to 08/24/2017. This means that we sequentially construct 9008 two-dimensional maps in order to approximate the full posterior $$\pi\left({\bf Z}_{1:9009}\middle\vert {\bf y}_{1:9009}\right)$$ and the filtering distributions $$\pi\left({\bf Z}_{k}\middle\vert {\bf y}_{1:k}\right)$$ for $$k=1,\ldots,9009$$. This setting is also described in TM4. Here we provide the dataset used and the results obtained.

• GBP-USD.csv [md5sum: 195a260b45b113051756d1297f082714]: daily returns
• Distribution.dill [md5sum: 65c4cc50ff8eb6200cfc373523dad46a]: SequentialHiddenMarkovChainDistribution $$\pi\left({\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right) \propto \mathcal{L}\left({\bf y}_\Xi \middle\vert {\bf Z}_\Lambda\right) \pi\left( {\bf Z}_\Lambda \right)$$
• runner.sh [md5sum: 51de28a3588809bbe8965646b7a4d0a4]: script used to construct the sequential map and obtain all the results. The script was run in parallel on one machine with 10 cores.
• Sequential-map.dill [md5sum: ecff7757ea414f045259e8e5caca903b]: this contains the output of the script tmap-sequential-tm. It includes the base distribution $$\rho=\mathcal{N}(0,{\bf I})$$, the target distribution $$\pi\left( {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right) \propto \mathcal{L}\left({\bf y}_\Xi \middle\vert {\bf Z}_\Lambda\right) \pi\left( {\bf Z}_\Lambda \right)$$, the map $$T$$ such that $$T_\sharp \rho \approx \pi\left( {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$, and the TransportMapSmoother used for the construction.
• Sequential-map-POST.dill [md5sum: 72a755383fba437e4dead6ff3e3d81e3]: data structure used as output of the script tmap-sequential-postprocess.
• Sequential-map-POST.dill.hdf5 [md5sum: d1b5686c3680f623b8cba2764c92eb0c]: dataset containing the output of tmap-sequential-postprocess. The data is structured as follows:
  • filtering: list of samples from the approximate filtering distributions $$\pi\left({\bf Z}_k \middle\vert {\bf y}_{1:k}\right)$$ for $$k\in\Lambda$$.
  • metropolis-independent-proposal-samples/skip-10: Markov chain of length $$10^5$$, obtained with MetropolisHastingsIndependentProposalsSampler and subsampled every 10 samples.
    • x: Markov chain with invariant distribution $$\pi\left( {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$.
    • s: Markov chain with invariant distribution $$T^\sharp \pi\left( {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$.
  • quadrature: Monte Carlo samples from $$T_\sharp\rho \approx \pi\left( {\bf Z}_\Lambda \middle\vert {\bf y}_\Xi \right)$$.
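To illustrate the kind of chain stored under metropolis-independent-proposal-samples, here is a one-dimensional sketch of a Metropolis-Hastings sampler with independent proposals. It is not the implementation of MetropolisHastingsIndependentProposalsSampler: the function `independence_mh` is hypothetical, the proposal is assumed to be the standard normal $$\rho$$, and `log_target` stands in for the unnormalized log-density of the pullback $$T^\sharp\pi$$.

```python
import math
import random


def independence_mh(log_target, n_steps, rng, skip=10):
    """Independence Metropolis-Hastings with N(0, 1) proposals (1-D sketch)."""

    def log_weight(s):
        # log(target / proposal) up to an additive constant.
        return log_target(s) + 0.5 * s * s

    s = rng.gauss(0.0, 1.0)
    chain, accepted = [], 0
    for _ in range(n_steps):
        proposal = rng.gauss(0.0, 1.0)  # drawn independently of the state
        delta = log_weight(proposal) - log_weight(s)
        # Accept with probability min(1, w(proposal) / w(s)).
        if rng.random() < math.exp(min(0.0, delta)):
            s, accepted = proposal, accepted + 1
        chain.append(s)
    # Keep every `skip`-th state, mirroring the "skip-10" dataset name.
    return chain[::skip], accepted / n_steps
```

The closer the pullback is to $$\rho$$, the closer the acceptance rate is to one; in this picture, the chain `x` would be obtained by applying the map $$T$$ to each state of the chain `s`.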

The following images show the smoothing marginals at different timesteps. We marked some historical events to put these results into context. If you happen to have better historical insight into the evolution of the volatility over certain periods, we would be happy to hear about it.

Mean and $$\{5,95\}$$ percentiles of the approximate (red) and exact (black) smoothing marginals obtained with the map $$T$$ and Markov chain Monte Carlo, respectively.