TransportMaps.KL.minimize_KL_divergence

Module Contents

Functions

- minimize_kl_divergence() – Solve \(\arg \min_{\bf a}\mathcal{D}_{KL}\left(\pi, (T^\sharp\pi_{\rm tar})_{\bf a}\right)\)
- minimize_kl_divergence_objective() – Objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- minimize_kl_divergence_grad_a_objective() – Gradient of the objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- minimize_kl_divergence_tuple_grad_a_objective() – Function evaluation and gradient of the objective \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- minimize_kl_divergence_hess_a_objective() – Hessian of the objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- minimize_kl_divergence_action_hess_a_objective() – Action of the Hessian of the objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) on a given direction.
- minimize_kl_divergence_action_storage_hess_a_objective() – Assemble the Hessian of \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) and compute its action on the vector \(v\).
- minimize_kl_divergence_component() – Compute \({\bf a}^\star = \arg\min_{\bf a} -\sum_{i=0}^m \left[\log\pi\circ T_k(x_i) + \log\partial_{x_k}T_k(x_i)\right] = \arg\min_{\bf a} -\sum_{i=0}^m f(x_i)\)
- minimize_kl_divergence_component_objective() – Objective function \(-\sum_{i=0}^m f(x_i) = -\sum_{i=0}^m \left[\log\pi\circ T_k(x_i) + \log\partial_{x_k}T_k(x_i)\right]\)
- minimize_kl_divergence_component_grad_a_objective() – Gradient of the objective function \(-\sum_{i=0}^m \nabla_{\bf a} f[{\bf a}](x_i) = -\sum_{i=0}^m \nabla_{\bf a} \left[ \log\pi\circ T_k[{\bf a}](x_i) + \log\partial_{x_k}T_k[{\bf a}](x_i)\right]\)
- minimize_kl_divergence_component_hess_a_objective() – Hessian of the objective function \(-\sum_{i=0}^m \nabla^2_{\bf a} f[{\bf a}](x_i) = -\sum_{i=0}^m \nabla^2_{\bf a} \left[ \log\pi\circ T_k[{\bf a}](x_i) + \log\partial_{x_k}T_k[{\bf a}](x_i)\right]\)
- minimize_kl_divergence_pointwise_monotone() – Compute \({\bf a}^\star = \arg\min_{\bf a}\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\)
- minimize_kl_divergence_pointwise_monotone_constraints()
- minimize_kl_divergence_pointwise_monotone_da_constraints()
- minimize_kl_divergence_pointwise_monotone_component() – Compute \({\bf a}^\star = \arg\min_{\bf a} -\sum_{i=0}^m \left[\log\pi\circ T_k(x_i) + \log\partial_{x_k}T_k(x_i)\right] = \arg\min_{\bf a} -\sum_{i=0}^m f(x_i)\)
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence(d1: TransportMaps.Distributions.Distribution, d2: TransportMaps.Distributions.ParametricTransportMapDistribution, qtype: int = None, qparams=None, x=None, w=None, params_d1=None, params_d2=None, x0=None, regularization=None, tol=0.0001, maxit=100, ders=2, fungrad=False, hessact=False, precomp_type='uni', batch_size=None, mpi_pool=None, grad_check=False, hess_check=False)
Solve \(\arg \min_{\bf a}\mathcal{D}_{KL}\left(\pi, (T^\sharp\pi_{\rm tar})_{\bf a}\right)\)
- Parameters:
  - d1 (Distribution) – sampling distribution \(\pi\)
  - d2 (ParametricTransportMapDistribution) – target distribution \(\pi_{\rm tar}\)
  - qtype (int) – quadrature type number provided by \(\pi\)
  - qparams (object) – inputs necessary to the generation of the selected quadrature
  - x (ndarray [\(m,d\)]) – quadrature points
  - w (ndarray [\(m\)]) – quadrature weights
  - params_d1 (dict) – parameters for the evaluation of \(\pi\)
  - params_d2 (dict) – parameters for the evaluation of \(\pi_{\rm tar}\)
  - x0 (ndarray [\(N\)]) – coefficients to be used as initial values for the optimization
  - regularization (dict) – defines the regularization to be used. If None, no regularization is applied. If key type=='L2', Tikhonov regularization is applied with the coefficient given in key alpha.
  - tol (float) – tolerance to be used to solve the KL-divergence problem
  - maxit (int) – maximum number of iterations
  - ders (int) – order of derivatives available for the solution of the optimization problem: 0 -> derivative free, 1 -> gradient, 2 -> Hessian
  - fungrad (bool) – whether the target distribution provides the method Distribution.tuple_grad_x_log_pdf(), computing the evaluation and the gradient in one step. Used only for ders==1.
  - hessact (bool) – use the action of the Hessian. The target distribution must implement Distribution.action_hess_x_log_pdf().
  - precomp_type (str) – whether to precompute univariate ('uni') or multivariate ('multi') Vandermonde matrices
  - batch_size (list [3 or 2] of int) – the list contains the size of the batch to be used for each iteration. A size 1 corresponds to a completely non-vectorized evaluation; a size None corresponds to a completely vectorized one.
  - mpi_pool (mpi_map.MPI_Pool) – pool of processes
  - grad_check (bool) – whether to use finite differences to check the correctness of the gradient
  - hess_check (bool) – whether to use finite differences to check the correctness of the Hessian
- Returns:
  log information from the solver
- Return type:
  log (dict)

Note
The parameters (qtype, qparams) and (x, w) are mutually exclusive, but one pair of them is necessary.
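A minimal usage sketch follows. Only the minimize_kl_divergence signature comes from this page; the map factory, the distribution constructors, and the quadrature convention (qtype=0 read as Monte Carlo with qparams samples) are assumptions about the surrounding TransportMaps API.

    # Sketch only: the constructors below are ASSUMED, not documented on this page.
    import TransportMaps as TM
    import TransportMaps.Distributions as DIST
    from TransportMaps.KL.minimize_KL_divergence import minimize_kl_divergence

    dim = 2
    pi = DIST.StandardNormalDistribution(dim)      # sampling distribution (assumed constructor)
    pi_tar = DIST.StandardNormalDistribution(dim)  # trivial stand-in target, for illustration only
    T = TM.Default_IsotropicIntegratedExponentialTriangularTransportMap(dim, 3)  # assumed factory
    d2 = DIST.PullBackParametricTransportMapDistribution(T, pi_tar)              # assumed wrapper

    log = minimize_kl_divergence(
        pi, d2,
        qtype=0, qparams=1000,                     # assumed: Monte Carlo with 1000 samples
        regularization={'type': 'L2', 'alpha': 1e-3},  # Tikhonov penalty, as documented above
        tol=1e-4, maxit=100, ders=2,               # use gradient and Hessian
    )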
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_objective(a, params)
Objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_grad_a_objective(a, params)
Gradient of the objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_tuple_grad_a_objective(a, params)
Function evaluation and gradient of the objective \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_hess_a_objective(a, params)
Hessian of the objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) for the KL-divergence minimization.
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_action_hess_a_objective(a, da, params)
Action of the Hessian of the objective function \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) on the direction \(da\).
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_action_storage_hess_a_objective(a, v, params)
Assemble the Hessian of \(\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\) and compute its action on the vector \(v\), for the KL-divergence minimization problem.
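These callbacks all share the (a, params) calling convention, which lines up with the fun/jac/hess interface of scipy.optimize.minimize via args=(params,). The sketch below shows one plausible wiring, not the module's actual solver loop; the contents of params are internal to the module and assumed to be prepared by minimize_kl_divergence.

    # Sketch: driving the (a, params) callbacks with SciPy (ders == 2 case).
    import scipy.optimize as sciopt
    from TransportMaps.KL import minimize_KL_divergence as mkl

    def solve(a0, params):
        # Newton-CG is a hypothetical choice; the module may use a different method.
        res = sciopt.minimize(
            mkl.minimize_kl_divergence_objective, a0, args=(params,),
            jac=mkl.minimize_kl_divergence_grad_a_objective,
            hess=mkl.minimize_kl_divergence_hess_a_objective,
            method='Newton-CG', tol=1e-4,
        )
        return res.x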
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_component(f: TransportMaps.Maps.Functionals.ProductDistributionParametricPullbackComponentFunction, x, w, x0=None, regularization=None, tol=0.0001, maxit=100, ders=2, fungrad=False, precomp_type='uni', batch_size=None, cache_level=1, mpi_pool=None)
Compute \({\bf a}^\star = \arg\min_{\bf a} -\sum_{i=0}^m \left[\log\pi\circ T_k(x_i) + \log\partial_{x_k}T_k(x_i)\right] = \arg\min_{\bf a} -\sum_{i=0}^m f(x_i)\)
- Parameters:
  - f (ProductDistributionParametricPullbackComponentFunction) – function \(f\)
  - x (ndarray [\(m,d\)]) – quadrature points
  - w (ndarray [\(m\)]) – quadrature weights
  - x0 (ndarray [\(N\)]) – coefficients to be used as initial values for the optimization
  - regularization (dict) – defines the regularization to be used. If None, no regularization is applied. If key type=='L2', Tikhonov regularization is applied with the coefficient given in key alpha.
  - tol (float) – tolerance to be used to solve the KL-divergence problem
  - maxit (int) – maximum number of iterations
  - ders (int) – order of derivatives available for the solution of the optimization problem: 0 -> derivative free, 1 -> gradient, 2 -> Hessian
  - fungrad (bool) – whether the distributions \(\pi_1,\pi_2\) provide the method Distribution.tuple_grad_x_log_pdf(), computing the evaluation and the gradient in one step. Used only for ders==1.
  - precomp_type (str) – whether to precompute univariate ('uni') or multivariate ('multi') Vandermonde matrices
  - batch_size (list [3 or 2] of int, or list of such lists) – the list contains the size of the batch to be used for each iteration. A size 1 corresponds to a completely non-vectorized evaluation; a size None corresponds to a completely vectorized one (see the option sketch after this parameter list). If the target distribution is a ProductDistribution, the optimization problem decouples and batch_size is a list of lists containing the batch sizes to be used for each component of the map.
  - cache_level (int) – use high-level caching during the optimization: 0 stores the function evaluations, 1 the gradient evaluations, -1 nothing
  - mpi_pool (mpi_map.MPI_Pool or list thereof) – pool of processes to be used; None stands for one process. If the target distribution is a ProductDistribution, the minimization problem decouples and mpi_pool is a list containing one pool per component of the map.
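For concreteness, the option values described above can be assembled as follows. The numeric values are arbitrary placeholders, and reading the batch list as one entry per derivative order (function, gradient, Hessian) is an interpretation of the "[3 or 2]" length, not something stated on this page.

    # Option values following the parameter descriptions above (placeholders).
    regularization = {'type': 'L2', 'alpha': 1e-3}  # Tikhonov penalty with coefficient alpha
    batch_size = [None, 512, 128]                   # assumed order: [function, gradient, Hessian]
    cache_level = 0                                 # cache function evaluations only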
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_component_objective(a, params)
Objective function \(-\sum_{i=0}^m f(x_i) = -\sum_{i=0}^m \left[\log\pi\circ T_k(x_i) + \log\partial_{x_k}T_k(x_i)\right]\)
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_component_grad_a_objective(a, params)
Gradient of the objective function \(-\sum_{i=0}^m \nabla_{\bf a} f[{\bf a}](x_i) = -\sum_{i=0}^m \nabla_{\bf a} \left[ \log\pi\circ T_k[{\bf a}](x_i) + \log\partial_{x_k}T_k[{\bf a}](x_i)\right]\)
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_component_hess_a_objective(a, params)
Hessian of the objective function \(-\sum_{i=0}^m \nabla^2_{\bf a} f[{\bf a}](x_i) = -\sum_{i=0}^m \nabla^2_{\bf a} \left[ \log\pi\circ T_k[{\bf a}](x_i) + \log\partial_{x_k}T_k[{\bf a}](x_i)\right]\)
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_pointwise_monotone(d1: TransportMaps.Distributions.Distribution, d2: TransportMaps.Distributions.ParametricTransportMapDistribution, x=None, w=None, params_d1=None, params_d2=None, x0=None, regularization=None, tol=0.0001, maxit=100, ders=1, fungrad=False, hessact=False, precomp_type='uni', batch_size=None, mpi_pool=None, grad_check=False, hess_check=False)
Compute \({\bf a}^\star = \arg\min_{\bf a}\mathcal{D}_{KL}\left(\pi_1, \pi_{2,{\bf a}}\right)\)
- Parameters:
  - d1 (Distribution) – distribution \(\pi_1\)
  - d2 (Distribution) – distribution \(\pi_2\)
  - x (ndarray [\(m,d\)]) – quadrature points
  - w (ndarray [\(m\)]) – quadrature weights
  - params_d1 (dict) – parameters for distribution \(\pi_1\)
  - params_d2 (dict) – parameters for distribution \(\pi_2\)
  - x0 (ndarray [\(N\)]) – coefficients to be used as initial values for the optimization
  - regularization (dict) – defines the regularization to be used. If None, no regularization is applied. If key type=='L2', Tikhonov regularization is applied with the coefficient given in key alpha.
  - tol (float) – tolerance to be used to solve the KL-divergence problem
  - maxit (int) – maximum number of iterations
  - ders (int) – order of derivatives available for the solution of the optimization problem: 0 -> derivative free (SLSQP), 1 -> gradient (SLSQP)
  - fungrad (bool) – whether the target distribution provides the method Distribution.tuple_grad_x_log_pdf(), computing the evaluation and the gradient in one step. Used only for ders==1.
  - hessact (bool) – this option is disabled for linear span maps (no Hessian used)
  - precomp_type (str) – whether to precompute univariate ('uni') or multivariate ('multi') Vandermonde matrices
  - batch_size (list [2] of int) – the list contains the size of the batch to be used for each iteration. A size 1 corresponds to a completely non-vectorized evaluation; a size None corresponds to a completely vectorized one. If the target distribution is a ProductDistribution, the optimization problem decouples and batch_size is a list of lists containing the batch sizes to be used for each component of the map.
  - mpi_pool (mpi_map.MPI_Pool or list thereof) – pool of processes to be used; None stands for one process. If the target distribution is a ProductDistribution, the minimization problem decouples and mpi_pool is a list containing one pool per component of the map.
  - grad_check (bool) – whether to use finite differences to check the correctness of the gradient
  - hess_check (bool) – whether to use finite differences to check the correctness of the Hessian
- Returns:
  log information from the solver
- Return type:
  log (dict)
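This variant takes explicit quadrature nodes and weights (x, w) rather than (qtype, qparams). A minimal sketch follows, reusing the pi and d2 objects from the first sketch; the sampler call and the Monte Carlo weighting are assumptions, not part of this page.

    # Sketch: pointwise-monotone minimization with explicit Monte Carlo quadrature.
    import numpy as np
    from TransportMaps.KL.minimize_KL_divergence import (
        minimize_kl_divergence_pointwise_monotone,
    )

    m = 1000
    x = pi.rvs(m)                  # assumed sampler on the distribution object
    w = np.ones(m) / m             # equal Monte Carlo weights

    log = minimize_kl_divergence_pointwise_monotone(
        pi, d2, x=x, w=w,
        ders=1,                    # SLSQP with gradients; no Hessian for linear span maps
        tol=1e-4, maxit=100,
    )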
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_pointwise_monotone_constraints(a, params)
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_pointwise_monotone_da_constraints(a, params)
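These two helpers are undocumented, but their names and the shared (a, params) convention suggest they evaluate the pointwise monotonicity constraints and their Jacobian with respect to the coefficients. One plausible wiring into SciPy's SLSQP constraint interface, offered purely as a sketch:

    # Hypothetical wiring of the constraint helpers into SLSQP (ders == 1 case).
    import scipy.optimize as sciopt
    from TransportMaps.KL import minimize_KL_divergence as mkl

    def solve_constrained(a0, params):
        cons = [{
            'type': 'ineq',  # SLSQP enforces fun(a, *args) >= 0 pointwise
            'fun': mkl.minimize_kl_divergence_pointwise_monotone_constraints,
            'jac': mkl.minimize_kl_divergence_pointwise_monotone_da_constraints,
            'args': (params,),
        }]
        res = sciopt.minimize(
            mkl.minimize_kl_divergence_objective, a0, args=(params,),
            jac=mkl.minimize_kl_divergence_grad_a_objective,
            method='SLSQP', constraints=cons, tol=1e-4,
        )
        return res.x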
- TransportMaps.KL.minimize_KL_divergence.minimize_kl_divergence_pointwise_monotone_component(f, x, w, x0=None, regularization=None, tol=0.0001, maxit=100, ders=2, fungrad=False, precomp_type='uni', batch_size=None, cache_level=1, mpi_pool=None)
Compute \({\bf a}^\star = \arg\min_{\bf a} -\sum_{i=0}^m \left[\log\pi\circ T_k(x_i) + \log\partial_{x_k}T_k(x_i)\right] = \arg\min_{\bf a} -\sum_{i=0}^m f(x_i)\)
- Parameters:
  - f (ProductDistributionParametricPullbackComponentFunction) – function \(f\)
  - x (ndarray [\(m,d\)]) – quadrature points
  - w (ndarray [\(m\)]) – quadrature weights
  - x0 (ndarray [\(N\)]) – coefficients to be used as initial values for the optimization
  - regularization (dict) – defines the regularization to be used. If None, no regularization is applied. If key type=='L2', Tikhonov regularization is applied with the coefficient given in key alpha.
  - tol (float) – tolerance to be used to solve the KL-divergence problem
  - maxit (int) – maximum number of iterations
  - ders (int) – order of derivatives available for the solution of the optimization problem: 0 -> derivative free, 1 -> gradient, 2 -> Hessian
  - fungrad (bool) – whether the distributions \(\pi_1,\pi_2\) provide the method Distribution.tuple_grad_x_log_pdf(), computing the evaluation and the gradient in one step. Used only for ders==1.
  - precomp_type (str) – whether to precompute univariate ('uni') or multivariate ('multi') Vandermonde matrices
  - batch_size (list [3 or 2] of int, or list of such lists) – the list contains the size of the batch to be used for each iteration. A size 1 corresponds to a completely non-vectorized evaluation; a size None corresponds to a completely vectorized one. If the target distribution is a ProductDistribution, the optimization problem decouples and batch_size is a list of lists containing the batch sizes to be used for each component of the map.
  - cache_level (int) – use high-level caching during the optimization: 0 stores the function evaluations, 1 the gradient evaluations, -1 nothing
  - mpi_pool (mpi_map.MPI_Pool or list thereof) – pool of processes to be used; None stands for one process. If the target distribution is a ProductDistribution, the minimization problem decouples and mpi_pool is a list containing one pool per component of the map (see the layout sketch after this list).
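When the target is a ProductDistribution, the descriptions above say the problem decouples across the map components, and both batch_size and mpi_pool become per-component lists. A sketch of that layout; pool construction via mpi_map is an assumption, and None entries simply mean serial execution:

    # Per-component option layout for a ProductDistribution target with `dim` components.
    dim = 3
    batch_size = [[None, 256, 64]] * dim   # one assumed [function, gradient, Hessian] triple per component
    mpi_pool = [None] * dim                # one pool (or None = serial) per component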