In [1]:
! rm *.dill *.dill.hdf5 *.svg
rm: cannot remove '*.dill.hdf5': No such file or directory
rm: cannot remove '*.svg': No such file or directory

High-level Command Line Interface (CLI)

We provide here a short tutorial of some useful command line scripts (bash/sh terminal) that allow an user-friendly access to the potential of the TransportMap module.

One needs first to store a Distribution object. Then, the following scripts can be applied to it:

We will use the Stochastic Volatility model (\(d=6\)) as an example and store the distribution \(\nu_\pi\) in the file Distribution.dill.

In [2]:
import TransportMaps.Distributions.Examples.StochasticVolatility as SV
yt,zt = SV.generate_data(4, -.5, .25, .95)
yt[2] = None   # Missing data
d = SV.StocVolHyperDistribution(True, False, True, sigma=.25)
for n in range(4):
    d.assimilate(y=yt[n], Xt=zt[n])
d.store('Distribution.dill', force=True)

Note: In the following we will use the esclamation point ! to indicate the execution of command line scripts

In [3]:
! ls
cli.ipynb  Distribution.dill  xml

tmap-laplace

This scripts builds a Laplace approximation and the corresponding linear map, and stores it on an output file. Let’s check its syntax…

In [4]:
! tmap-laplace

Usage: tmap-laplace [-h -f -I] --dist=DIST --output=OUTPUT
  [--tol=TOL --ders=DERS --log=LOG]

2017-06-18 16:47:42 ERROR: Options --dist and --output must be specified

We can check its extended documentation using the flag -h

In [5]:
! tmap-laplace -h

Usage: tmap-laplace [-h -f -I] --dist=DIST --output=OUTPUT
  [--tol=TOL --ders=DERS --log=LOG]

DESCRIPTION
Given a file (--dist) storing the target distribution, generate the linear map
corresponding to the Laplace approximation of it.
All files involved are stored and loaded using the python package dill.

OPTIONS
  --dist=DIST             file containing the target distribution
  --output=OUTPUT         output file containing the linear transport map,
                          the base distribution (standard normal), the target distribution
                          and all the additional parameters used for the
                          construction
  --tol=TOL               optimization tolerance (default: 1e-4)
  --ders=DERS             derivatives to be used in the optimization
                          Available options:
                            0: BFGS (gradient free)
                            1: BFGS (gradient needed)
                            2: Newton-CG (Hessian needed)
  --log=LOG               log level (default=30). Uses package logging.
                          Available options:
                            10: debug
                            20: info
                            30: warning
                            40: error
                            50: critical
  -f                     force overwrite of OUTPUT file
  -I                      enter interactive mode after finishing
  -h                      print this help

Let us run tmap-laplace on the stored density…

In [6]:
! tmap-laplace --dist=Distribution.dill --output=Laplace.dill --log=20
2017-06-18 16:47:47 INFO:TM.TransportMaps: Optimization terminated successfully
2017-06-18 16:47:47 INFO:TM.TransportMaps:   Function value:          -2.147868
2017-06-18 16:47:47 INFO:TM.TransportMaps:   Norm of the Jacobian:    0.000166
2017-06-18 16:47:47 INFO:TM.TransportMaps:   Number of iterations:         6
2017-06-18 16:47:47 INFO:TM.TransportMaps:   N. function evaluations:      7
2017-06-18 16:47:47 INFO:TM.TransportMaps:   N. Jacobian evaluations:     12
2017-06-18 16:47:47 INFO:TM.TransportMaps:   N. Hessian evaluations:       6
In [7]:
! ls
cli.ipynb  Distribution.dill  Laplace.dill  xml

tmap-tm

This script builds the map \(T\) such that \(T_\sharp \nu_\rho \approx \nu_\pi\), where \(\nu_\rho\) and \(\nu_\pi\) are two Distributions which provide the implementation of the necessary methods (densities and their derivatives) for solving

\[T^\star = \arg\min_{T \in \mathcal{T}} \mathcal{D}_{\rm KL}\left( T_\sharp \nu_\rho \middle\vert \nu_\pi \right)\]

Let’s check its syntax…

In [8]:
! tmap-tm

Usage: tmap-tm [-h -f -I]
  --dist=DIST --output=OUTPUT [--base-dist=BASE_DIST]
  (--mtype=MTYPE --span=SPAN --btype=BTYPE --order=ORDER)
    / (--map-descr=MAP_DESCR)
  --qtype=QTYPE --qnum=QNUM
  [--tol=TOL --with-reg=REG --ders=DERS]
  [--laplace-pull]
  [--log=LOG --nprocs=NPROCS --batch=BATCH]

2017-06-18 16:47:50 ERROR: Options --dist and --output must be specified

We can check its extended documentation using the -h flag …

In [9]:
! tmap-tm -h

Usage: tmap-tm [-h -f -I]
  --dist=DIST --output=OUTPUT [--base-dist=BASE_DIST]
  (--mtype=MTYPE --span=SPAN --btype=BTYPE --order=ORDER)
    / (--map-descr=MAP_DESCR)
  --qtype=QTYPE --qnum=QNUM
  [--tol=TOL --with-reg=REG --ders=DERS]
  [--laplace-pull]
  [--log=LOG --nprocs=NPROCS --batch=BATCH]

DESCRIPTION
Given a file (--dist) storing the target distribution, produce the transport map that
pushes forward the base distribution (default: standard normal) to the target distribution.
All files involved are stored and loaded using the python package dill.

OPTIONS - input/output:
  --dist=DIST             path to the file containing the target distribution
  --output=OUTPUT         path to the output file containing the transport map,
                          the base distribution, the target distribution and all
                          the additional parameters used for the construction
  --base-dist=BASE_DIST   path to the file containing the base distribution
                          (default: a standard normal of suitable dimension)
OPTIONS - map description (using default maps):
  --mtype=MTYPE           monotone format for the transport
                          Available options:
                            intexp: integrated exponential
                            intsq: integrated square
                            linspan: monotone linear span
  --span=SPAN             span type for all the components of the map
                          Available options:
                            full: full span
                            total: total order span
  --btype=BTYPE           basis types for all the components of the map
                          Available options:
                            poly: polynomial basis
                            rbf: radial basis functions (requires SPAN=full)
  --order=ORDER           order of the transport map
OPTIONS - map description (manual):
  --map-descr=MAP_DESCR   XML file containing the skeleton of the transport map
OPTIONS - solver:
  --qtype=QTYPE           quadrature type for the discretization of the KL-divergence
                          Available options:
                            0: Monte Carlo
                            3: Gauss quadrature
  --qnum=QNUM             quadrature level
  --tol=TOL               optimization tolerance (default: 1e-4)
  --reg=REG               a float L2 regularization parameter
                          (default: no regularization)
  --ders=DERS             derivatives to be used in the optimization
                          Available options:
                            0: BFGS (gradient free)
                            1: BFGS (gradient needed)
                            2: Newton-CG (Hessian needed)
  --laplace-pull          whether to precondition pulling back the target through
                             its Laplace approximation
  --log=LOG               log level (default=30). Uses package logging.
                          Available options:
                            10: debug
                            20: info
                            30: warning
                            40: error
                            50: critical
  --nprocs=NPROCS         number of processors to be used (default=1)
  --batch=BATCH           list of batch sizes for function evaluation, gradient
                          evaluation and Hessian evaluation
OPTIONS - other:
  -f                      force overwrite of OUTPUT file
  -I                      enter interactive mode after finishing
  -h                      print this help

Let us build an IsotropicIntegratedSquaredTriangularTransportMap of total order 2 such that \(T_\sharp \nu_\rho \approx \nu_\pi\), where \(\nu_\rho = \mathcal{N}(0,{\bf I})\).

In [10]:
! tmap-tm --dist=Distribution.dill --output=Direct-IntSq.dill \
    --mtype=intsq --span=total --btype=poly --order=2 \
    --qtype=3 --qnum=3,3,3,3,3,3 --log=20
INFO:root:Number coefficients: 83
In [11]:
! ls
cli.ipynb  Direct-IntSq.dill  Distribution.dill  Laplace.dill  xml

The same map \(T\) (and more complex ones) can be defined using XML descriptors

In [12]:
! tmap-tm --dist=Distribution.dill --output=Direct-IntSq.dill -f \
    --map-descr=xml/IntSq-d6-o2.xml \
    --qtype=3 --qnum=3,3,3,3,3,3 --log=20
INFO:root:Number coefficients: 83

tmap-sequential-tm

This script builds a transport map pushing forward \(\mathcal{N}(0,{\bf I})\) to the sequential Hidden Markov Chain distribution \(\pi\) using the algorithm for decomposable transports.

Let us check its syntax …

In [13]:
! tmap-sequential-tm

Usage: tmap-sequential-tm [-h -f -I]
  --dist=DIST --output=OUTPUT
  (--mtype=MTYPE --span=SPAN --btype=BTYPE --order=ORDER)
    / (--map-0-descr=MAP_DESCR --map-descr=MAP_DESCR)
  --qtype=QTYPE --qnum=QNUM
  [--tol=TOL --with-reg=REG --ders=DERS]
  [(--hyper-mtype=MTYPE --hyper-span=SPAN --hyper-btype=BTYPE --hyper-order=ORDER)
    / (--hyper-map-descr=MAP_DESCR)
   --hyper-qtype=QTYPE --hyper-qnum=QNUM
   --hyper-tol=TOL --hyper-with-reg=REG]
  [--safe-mode --reload --log=LOG --nprocs=NPROCS --batch=BATCH]

2017-06-18 16:48:37 ERROR: Options --dist and --output must be specified
In [14]:
! tmap-sequential-tm -h

Usage: tmap-sequential-tm [-h -f -I]
  --dist=DIST --output=OUTPUT
  (--mtype=MTYPE --span=SPAN --btype=BTYPE --order=ORDER)
    / (--map-0-descr=MAP_DESCR --map-descr=MAP_DESCR)
  --qtype=QTYPE --qnum=QNUM
  [--tol=TOL --with-reg=REG --ders=DERS]
  [(--hyper-mtype=MTYPE --hyper-span=SPAN --hyper-btype=BTYPE --hyper-order=ORDER)
    / (--hyper-map-descr=MAP_DESCR)
   --hyper-qtype=QTYPE --hyper-qnum=QNUM
   --hyper-tol=TOL --hyper-with-reg=REG]
  [--safe-mode --reload --log=LOG --nprocs=NPROCS --batch=BATCH]

DESCRIPTION
Given a file (--dist) storing the target distribution, produce the transport map that
pushes forward the base distribution (default: standard normal) to the target distribution,
using the algorithm for sequential low-dimensional couplings.
The input distribution must be SequentialHiddenMarkcovChainDistribution with
hdim hyperparameters and sdim state dimension.
All files involved are stored and loaded using the python package dill.

OPTIONS - input/output:
  --dist=DIST             path to the file containing the target distribution
  --output=OUTPUT         path to the output file containing the transport map,
                          the base distribution, the target distribution and all
                          the additional parameters used for the construction
OPTIONS - map description (using default maps):
  --mtype=MTYPE           monotone format for the transport
                          Available options:
                            intexp: integrated exponential
                            intsq: integrated square
                            linspan: monotone linear span
  --span=SPAN             span type for all the components of the map
                          Available options:
                            full: full span
                            total: total order span
  --btype=BTYPE           basis types for all the components of the map
                          Available options:
                            poly: polynomial basis
                            rbf: radial basis functions (requires SPAN=full)
  --order=ORDER           order of the transport map
OPTIONS - map description (manual):
  --map-0-descr=MAP_DESCR XML file with the skeleton of the zero-th transport map (hdim+sdim)
  --map-descr=MAP_DESCR   XML file with the skeleton of the transport map (hdim+2*sdim)
OPTIONS - KL-minimization solver:
  --qtype=QTYPE           quadrature type for the discretization of the KL-divergence
                          Available options:
                            0: Monte Carlo
                            3: Gauss quadrature
  --qnum=QNUM             quadrature level
  --tol=TOL               optimization tolerance (default: 1e-4)
  --reg=REG               a float L2 regularization parameter
                          (default: no regularization)
  --ders=DERS             derivatives to be used in the optimization
                          Available options:
                            0: BFGS (gradient free)
                            1: BFGS (gradient needed)
                            2: Newton-CG (Hessian needed)
OPTIONS - hyper-parameters map description (using default maps):
  --hyper-mtype=MTYPE     monotone format for the transport
                          Available options:
                            intexp: integrated exponential
                            intsq: integrated square
                            linspan: monotone linear span
  --hyper-span=SPAN       span type for all the components of the map
                          Available options:
                            full: full span
                            total: total order span
  --hyper-btype=BTYPE     basis types for all the components of the map
                          Available options:
                            poly: polynomial basis
                            rbf: radial basis functions (requires SPAN=full)
  --hyper-order=ORDER     order of the transport map
OPTIONS - hyper-parameters map description (manual):
  --hyper-map-descr=MAP_DESCR   XML file containing the skeleton of the transport map
OPTIONS - regression solver:
  --hyper-qtype=QTYPE     quadrature type for the discretization of the KL-divergence
                          Available options:
                            0: Monte Carlo
                            3: Gauss quadrature
  --hyper-qnum=QNUM       quadrature level
  --hyper-tol=TOL         optimization tolerance (default: 1e-4)
  --hyper-reg=REG         a float L2 regularization parameter
                          (default: no regularization)
OPTIONS - other:
  --safe-mode             always store intermediate maps (allows for restarting)
  --reload                automatically resume from stored data (conflicts with -f)
  --log=LOG               log level (default=30). Uses package logging.
                          Available options:
                            10: debug
                            20: info
                            30: warning
                            40: error
                            50: critical
  --nprocs=NPROCS         number of processors to be used (default=1)
  --batch=BATCH           list of batch sizes for function evaluation, gradient
                          evaluation and Hessian evaluation
  -v                      verbose output (not affecting --log)
  -f                      force overwrite of OUTPUT file
  -I                      enter interactive mode after finishing
  -h                      print this help

In [15]:
! tmap-sequential-tm --dist=Distribution.dill --output=Sequential-IntSq.dill -v -f \
    --mtype=intsq --span=total --btype=poly --order=3 \
    --qtype=3 --qnum=4,4,4,4 \
    --hyper-mtype=intsq --hyper-span=total --hyper-btype=poly --hyper-order=4 \
    --hyper-qtype=3 --hyper-qnum=5,5
2017-06-18 16:48:43 Step     0    [DONE]
2017-06-18 16:48:44 Step     1    [DONE]
2017-06-18 16:48:48 Step     2    [DONE]
2017-06-18 16:48:58 Step     3    [DONE]
In [16]:
! ls
cli.ipynb          Distribution.dill  Sequential-IntSq.dill
Direct-IntSq.dill  Laplace.dill       xml

The same maps \(R_i\) and \(\mathfrak{H}_i\) (and more complex ones) can be defined using XML descriptors

In [17]:
! tmap-sequential-tm --dist=Distribution.dill --output=Sequential-IntSq.dill -v -f \
    --map-0-descr=xml/IntSq-d3-o3.xml \
    --map-descr=xml/IntSq-d4-o3.xml \
    --qtype=3 --qnum=4,4,4,4 \
    --hyper-map-descr=xml/IntSq-d2-o4.xml \
    --hyper-qtype=3 --hyper-qnum=5,5
2017-06-18 16:49:13 Step     0    [DONE]
2017-06-18 16:49:13 Step     1    [DONE]
2017-06-18 16:49:16 Step     2    [DONE]
2017-06-18 16:49:27 Step     3    [DONE]

tmap-postprocess

This script provides a number of diagnostics and postprocessing routines. We test it on the output Sequential-IntSq.dill generated by tmap-sequential-tm. However this could be applied directly also to the output Laplace.dill and Direct-IntSq.dill generated by tmap-laplace and tmap-tm respectively.

Let us first check its syntax …

In [18]:
! tmap-postprocess

Usage: tmap-postprocess [-h -v -I]
  --data=DATA --output=OUTPUT
  [--store-fig-dir=DIR --store-fig-fmats=FMATS
   --extra-tit=TITLE --no-plotting
   --aligned-conditionals=DIST
     --alc-n-points-x-ax=N --alc-n-tri-plots=N
     --alc-anchor=LIST --alc-range=LIST
   --random-conditionals=DIST
     --rndc-n-points-x-ax=N --rndc-n-plots-x-ax=N
     --rndc-anchor=LIST --rndc-range=LIST
   --var-diag=DIST
     --var-diag-qtype=QTYPE --var-diag-qnum=QNUM
   --aligned-marginals=DIST
     --alm-n-points=N --alm-n-tri-plots=N
   --quadrature=DIST
     --quadrature-qtype=QTYPE
     --quadrature-qnum=QNUM
   --importance-samples=NSAMP
   --metropolis-samples=NSAMP
     --metropolis-burnin=BURNIN
     --metropolis-ess-q=QUANTILE
     --metropolis-ess-plot-lag=LAG
     --metropolis-ess-xcorr
   --log=LOG --batch=BATCH --nprocs=NPROCS]

2017-06-18 16:49:41 ERROR: Option --data and --output must be specified
In [19]:
! tmap-postprocess -h

Usage: tmap-postprocess [-h -v -I]
  --data=DATA --output=OUTPUT
  [--store-fig-dir=DIR --store-fig-fmats=FMATS
   --extra-tit=TITLE --no-plotting
   --aligned-conditionals=DIST
     --alc-n-points-x-ax=N --alc-n-tri-plots=N
     --alc-anchor=LIST --alc-range=LIST
   --random-conditionals=DIST
     --rndc-n-points-x-ax=N --rndc-n-plots-x-ax=N
     --rndc-anchor=LIST --rndc-range=LIST
   --var-diag=DIST
     --var-diag-qtype=QTYPE --var-diag-qnum=QNUM
   --aligned-marginals=DIST
     --alm-n-points=N --alm-n-tri-plots=N
   --quadrature=DIST
     --quadrature-qtype=QTYPE
     --quadrature-qnum=QNUM
   --importance-samples=NSAMP
   --metropolis-samples=NSAMP
     --metropolis-burnin=BURNIN
     --metropolis-ess-q=QUANTILE
     --metropolis-ess-plot-lag=LAG
     --metropolis-ess-xcorr
   --log=LOG --batch=BATCH --nprocs=NPROCS]

DESCRIPTION
Given a file (--data) storing the transport map pushing forward a base distribution
to a target distribution, provides a number of diagnositic routines.
All files involved are stored and loaded using the python package dill and
an extra file OUTPUT.hdf5 is created to store big datasets in the hdf5 format.
In the following default values are shown in brackets.

OPTIONS - input/output:
  --data=DATA           path to the file containing the target distribution,
                          the base distribution and the transport map pushing forward
                          the base to the target.
  --output=OUTPUT       path to the file storing all postprocess data.
                          The additional file OUTPUT.hdf5 will be used to store
                          the more memory consuming data.
  --store-fig-dir=DIR   path to the directory where to store the figures.
  --store-fig-fmats=FMATS  figure formats - see matplotlib for supported formats (svg)
  --extra-tit=TITLE     additional title for the figures' file names.
  --no-plotting         do not plot figures, but only store their data.
                          (requires --output or --store-fig-dir)
OPTIONS - Diagnostics:
  --aligned-conditionals=DIST  plot aligned slices of the selected DIST:
                          approx-base: approximate base density
                          approx-target: approximate target density
                          exact-base: exact base density
                          exact-target: exact target density
                        Optional arguments:
    --alc-n-points-x-ax=N  number of discretization points per axis (40)
    --alc-n-tri-plots=N    number of subplots (0)
    --alc-anchor=LIST      list of floats "f1,f2,f3..." for the anchor point (0)
    --alc-range=LIST       list of two floats "f1,f2" for the range (-5,5)
  --random-conditionals=DIST   plot randomly chosen slices of the selected DIST:
                          approx-base: approximate base density
                          approx-target: approximate target density
                          exact-base: exact base density
                          exact-target: exact target density
                        Optional arguments:
    --rndc-n-points-x-ax=N   number of discretization points per axis (40)
    --rndc-n-plots-x-ax=N    number of subplots (0)
    --rndc-anchor=LIST       list of floats "f1,f2,f3..." for the anchor point (0)
    --rndc-range=LIST        list of two floats "f1,f2" for the range (-5,5)
  --var-diag=DIST       compute variance diagostic using the sampling DIST:
                          approx-base: approximate base density
                          approx-target: approximate target density
                          exact-base: exact base density
                          exact-target: exact target density
                        Optional arguments:
    --var-diag-qtype=QTYPE  quadrature type to be used (0)
    --var-diag-qnum=QNUM    level of the quadrature (1000)
OPTIONS - Sampling:
  --aligned-marginals=DIST  plot aligned marginals of the selected DIST:
                          approx-base: approximate base density
                          approx-target: approximate target density
                          exact-base: exact base density
                          exact-target: exact target density
                        Optional arguments:
    --alm-n-points=N         number of samples to be used for the kernel density estimation
    --alm-n-tri-plots=N      number of subplots (0)
  --quadrature=DIST     generate quadrature for the selected DIST:
                          approx-base: approximate base density
                          approx-target: approximate target density
                          exact-base: exact base density
                          exact-target: exact target density
                        Optional arguments:
    --quadrature-qtype=QTYPE  generate quadrature of type QTYPE (0)
    --quadrature-qnum=QNUM  level of the quadrature (int or list)
  --importance-samples=NSAMP  number of importance samples and weights for the approximation
                        of estimators with respect to the target distribution
  --metropolis-samples=NSAMP  length of the chain with invariant distribution the
                        target distribution using Metropolis-Hastings with independent
                        proposals
    --metropolis-burnin=BURNIN   number of samples to be considered as burn-in
    --metropolis-ess-q=QUANTILE  quantile used for the estimation of the sample size (0.99).
                             This is estimated over the worst decaying autocorrelation rate.
    --metropolis-ess-plot-lag=LAG  maximum lag to be plotted (50)
    --metropolis-ess-xcorr       whether to compute also cross-correlation decays
OPTIONS - Computation:
  --log=LOG               log level (default=30). Uses package logging.
                          Available options:
                            10: debug
                            20: info
                            30: warning
                            40: error
                            50: critical
  --nprocs=NPROCS         number of processors to be used (default=1)
  --batch=BATCH           list of batch sizes for function evaluation, gradient
                          evaluation and Hessian evaluation
OPTIONS - other:
  -v                      verbose output (not affecting --log)
  -I                      enter interactive mode after finishing
  -h                      print this help

In [20]:
! tmap-postprocess --data=Sequential-IntSq.dill --output=Sequential-IntSq-Postprocess.dill -v \
    --store-fig-dir=./ --log=20 \
    --aligned-conditionals=approx-base \
    --random-conditionals=approx-base --rndc-n-plots-x-ax=4 \
    --var-diag=exact-base --var-diag-qtype=0 --var-diag-qnum=10000 \
    --aligned-marginals=approx-target --alm-n-points=10000 \
    --quadrature=approx-target --quadrature-qtype=0 --quadrature-qnum=10000 \
    --importance-samples=10000 \
    --metropolis-samples=10000 --metropolis-burnin=5000 --metropolis-ess-plot-lag=50
2017-06-18 16:49:47 [Start] Aligned conditionals approx-base
2017-06-18 16:49:59 [Stop]  Aligned conditionals approx-base
2017-06-18 16:49:59 [Start] Random conditionals approx-base
2017-06-18 16:50:12 [Stop]  Random conditionals approx-base
2017-06-18 16:50:12 [Start] Variance diagnostic exact-base
2017-06-18 16:50:15 [Stop]  Variance Diagnostic exact-base: 3.153136e-02
2017-06-18 16:50:15 [Start] Aligned marginals approx-target - Sample generation
2017-06-18 16:50:16         Aligned marginals approx-target - Plotting
2017-06-18 16:51:45 [Stop]  Aligned marginals approx-target
2017-06-18 16:51:45 [Start] Quadrature 0
2017-06-18 16:51:45 [Stop]  Quadrature
2017-06-18 16:51:45 [Start] Importance sampling
2017-06-18 16:51:49 [Stop]  Importance sampling
2017-06-18 16:51:49 [Start] Metropolis-Hastings with Independent Proposals
2017-06-18 16:52:14         Metropolis-Hastings with Independent Proposals - Estimating ESS
2017-06-18 16:52:15 [Stop]  Metropolis-Hastings with Independent Proposals - ESS: 2230
In [21]:
! ls
cli.ipynb
Direct-IntSq.dill
Distribution.dill
Laplace.dill
Sequential-IntSq-aligned-conditionals-approx-base.svg
Sequential-IntSq-aligned-marginals-approx-target.svg
Sequential-IntSq.dill
Sequential-IntSq-metropolis-ess.svg
Sequential-IntSq-Postprocess.dill
Sequential-IntSq-Postprocess.dill.hdf5
Sequential-IntSq-random-conditionals-approx-base.svg
xml

The command above produced a number of outputs. Let us show the figures generated

In the following we show how to access all these outputs.

Aligned conditionals

These are contained in the figures *-aligned-conditionals-approx-base.*.

Random conditionals

These are contained in the figures *-random-conditionals-approx-base.*.

Variance diagnostic

The values for the variance diagnostic are stored in the field vals_var_diag in the file -Postprocess.dill.hdf5 in the HDF5 format. The accuracy of the variance diagnostic can be improved by calling tmap-postprocess again with a higher number of samples/higher order quadrature.

In [22]:
import h5py
f = h5py.File('Sequential-IntSq-Postprocess.dill.hdf5','r')
print( list(f.keys()) )
f.close()
['importance-samples', 'metropolis-independent-proposal-samples', 'quadrature', 'vals_var_diag']

Aligned marginals

We are contained in the figures *-aligned-marginals-approx-target.*.

Monte-Carlo quadrature

A Monte-Carlo quadrature with \(10^4\) points for the density \(T_\sharp \nu_\rho\) is stored in Sequential-IntSq-Postprocess.dill.hdf5.

In [23]:
import h5py
f = h5py.File('Sequential-IntSq-Postprocess.dill.hdf5','r')
x = f['quadrature']['approx-target']['0'][:,:]
f.close()

Importance samples

A set of \(10^4\) importance samples for \(\nu_\pi\) are generated using the biasing distribution \(T_\sharp \nu_\rho\).

In [24]:
import h5py
f = h5py.File('Sequential-IntSq-Postprocess.dill.hdf5','r')
x = f['importance-samples']['x']
w = f['importance-samples']['w']
f.close()

Metropolis-Hastings with independent proposals

A \(10^4\) long Markov chain with invariant distribution \(\nu_\pi\) is generated using \(T_\sharp \nu_\rho\) as proposal distribution.

In [25]:
import h5py
f = h5py.File('Sequential-IntSq-Postprocess.dill.hdf5','r')
mc = f['metropolis-independent-proposal-samples']['x']
f.close()

Additionally the Effective Sample Size (ESS) of the Markov chain is estimated using the \(0.99\) percentile accurate decay of the auto-correlation of its components. This analysis is shown in the figures *-metropolis-ess.* and the ESS is reported as an output of tmap-postprocess.

tmap-sequential-posprocess

This script provides a number of postprocessing routines for approximations of sequential Hidden Markov Chain distributions. In particular it deals with the characterization of the filtering distributions \(\nu_\pi\left({\bf Z}_k\middle\vert {\bf y}_{0:k}\right)\).

In [26]:
! tmap-sequential-postprocess

Usage: tmap-sequential-postprocess [-h -v -I]
  --data=DATA --output=OUTPUT
  [--store-fig-dir=DIR --store-fig-fmats=FMATS
   --extra-tit=TITLE --no-plotting
   --filtering-conditionals
     --filt-alc-n-points-x-ax=N --filt-alc-n-tri-plots=N
     --filt-alc-anchor=LIST --filt-alc-range=LIST
   --filtering-marginals
     --filt-alm-n-points=N --filt-alm-n-tri-plots=N
   --filtering-quadrature
     --filt-quad-qtype=QTYPE
     --filt-quad-qnum=QNUM
   --log=LOG --batch=BATCH --nprocs=NPROCS]

2017-06-18 16:52:19 ERROR: Option --data and --output must be specified
In [27]:
! tmap-sequential-postprocess -h

Usage: tmap-sequential-postprocess [-h -v -I]
  --data=DATA --output=OUTPUT
  [--store-fig-dir=DIR --store-fig-fmats=FMATS
   --extra-tit=TITLE --no-plotting
   --filtering-conditionals
     --filt-alc-n-points-x-ax=N --filt-alc-n-tri-plots=N
     --filt-alc-anchor=LIST --filt-alc-range=LIST
   --filtering-marginals
     --filt-alm-n-points=N --filt-alm-n-tri-plots=N
   --filtering-quadrature
     --filt-quad-qtype=QTYPE
     --filt-quad-qnum=QNUM
   --log=LOG --batch=BATCH --nprocs=NPROCS]

DESCRIPTION
Given a file (--data) storing the transport map pushing forward a base distribution
to a sequential Hidden Markov target distribution,
provides a number of postrprocessing routines.
All files involved are stored and loaded using the python package dill and
an extra file OUTPUT.hdf5 is created to store big datasets in the hdf5 format.
In the following default values are shown in brackets.

OPTIONS - input/output:
  --data=DATA           path to the file containing the target distribution,
                          the base distribution and the transport map pushing forward
                          the base to the target.
  --output=OUTPUT       path to the file storing all postprocess data.
                          The additional file OUTPUT.hdf5 will be used to store
                          the more memory consuming data.
  --store-fig-dir=DIR   path to the directory where to store the figures.
  --store-fig-fmats=FMATS  figure formats - see matplotlib for supported formats (svg)
  --extra-tit=TITLE     additional title for the figures' file names.
  --no-plotting         do not plot figures, but only store their data.
                          (requires --output or --store-fig-dir)
OPTIONS - Diagnostics:
  --filtering-conditionals  plot aligned slices of the filtering distribution:
                        Optional arguments:
    --filt-alc-n-points-x-ax=N  number of discretization points per axis (30)
    --filt-alc-n-tri-plots=N    number of subplots (0)
    --filt-alc-anchor=LIST      list of floats "f1,f2,f3..." for the anchor point (0)
    --filt-alc-range=LIST       list of two floats "f1,f2" for the range (-5,5)
OPTIONS - Sampling:
  --filtering-marginals  plot aligned marginals of the filtering distribution:
                        Optional arguments:
    --filt-alm-n-points=N     number of samples to be used for the kernel density estimation
    --filt-alm-n-tri-plots=N  number of subplots (0)
  --filtering-quadrature  generate quadrature of the filtering distribution:
                        Optional arguments:
    --filt-quad-qtype=QTYPE  generate quadrature of type QTYPE (0)
    --filt-quad-qnum=QNUM    level of the quadrature (int or list)
OPTIONS - Computation:
  --log=LOG               log level (default=30). Uses package logging.
                          Available options:
                            10: debug
                            20: info
                            30: warning
                            40: error
                            50: critical
  --nprocs=NPROCS         number of processors to be used (default=1)
  --batch=BATCH           list of batch sizes for function evaluation, gradient
                          evaluation and Hessian evaluation
OPTIONS - other:
  -v                      verbose output (not affecting --log)
  -I                      enter interactive mode after finishing
  -h                      print this help

In [28]:
! tmap-sequential-postprocess \
    --data=Sequential-IntSq.dill --output=Sequential-IntSq-Postprocess.dill -v \
    --store-fig-dir=./ --log=20 \
    --filtering-conditionals \
    --filtering-marginals --filt-alm-n-points=2000 \
    --filtering-quadrature --filt-quad-qtype=0 --filt-quad-qnum=10000
2017-06-18 16:52:25 [Start] Filtering conditionals
2017-06-18 16:52:25         Filtering conditionals - Step 0
2017-06-18 16:53:53         Filtering conditionals - Step 1
2017-06-18 16:55:24         Filtering conditionals - Step 2
2017-06-18 16:57:25         Filtering conditionals - Step 3
2017-06-18 16:59:42 [Stop]  Filtering conditionals
2017-06-18 16:59:42 [Start] Filtering marginals
2017-06-18 16:59:42         Filtering marginals - Step 0 - Sample generation
2017-06-18 16:59:42         Filtering marginals - Step 0 - Plotting
2017-06-18 16:59:46         Filtering marginals - Step 1 - Sample generation
2017-06-18 16:59:46         Filtering marginals - Step 1 - Plotting
2017-06-18 16:59:51         Filtering marginals - Step 2 - Sample generation
2017-06-18 16:59:51         Filtering marginals - Step 2 - Plotting
2017-06-18 16:59:56         Filtering marginals - Step 3 - Sample generation
2017-06-18 16:59:56         Filtering marginals - Step 3 - Plotting
2017-06-18 17:00:01 [Stop]  Filtering marginals
2017-06-18 17:00:01 [Start] Quadrature 0
2017-06-18 17:00:01         Quadrature 0- Step 0 - Sample generation
2017-06-18 17:00:01         Quadrature 0- Step 1 - Sample generation
2017-06-18 17:00:01         Quadrature 0- Step 2 - Sample generation
2017-06-18 17:00:01         Quadrature 0- Step 3 - Sample generation
2017-06-18 17:00:01 [Stop]  Quadrature
In [29]:
! ls
cli.ipynb
Direct-IntSq.dill
Distribution.dill
Laplace.dill
Sequential-IntSq-aligned-conditionals-approx-base.svg
Sequential-IntSq-aligned-marginals-approx-target.svg
Sequential-IntSq.dill
Sequential-IntSq-filtering-conditionals-0.svg
Sequential-IntSq-filtering-conditionals-1.svg
Sequential-IntSq-filtering-conditionals-2.svg
Sequential-IntSq-filtering-conditionals-3.svg
Sequential-IntSq-filtering-marginals-0.svg
Sequential-IntSq-filtering-marginals-1.svg
Sequential-IntSq-filtering-marginals-2.svg
Sequential-IntSq-filtering-marginals-3.svg
Sequential-IntSq-metropolis-ess.svg
Sequential-IntSq-Postprocess.dill
Sequential-IntSq-Postprocess.dill.hdf5
Sequential-IntSq-random-conditionals-approx-base.svg
xml

The command above procuced several outputs:

Aligned conditionals of filtering distributions

Step 0 Step 1
image0 image1
Step 2 Step 3
image2 image3

Aligned marginals of filtering distributions

Step 0 Step 1
image4 image5
Step 2 Step 3
image6 image7

Monte-Carlo samples of filtering distributions

In [30]:
import h5py
f = h5py.File('Sequential-IntSq-Postprocess.dill.hdf5','r')
x_list = []
for i in range(4):
    x_list.append( f['filtering']['step-%d' % i]['quadrature']['0'] )
f.close()