The TransportMap support team will be happy to help you out with questions ranging from theory on transport maps, to the installation and usage of the software TransportMaps.

I get an MPI error for Inverse Transport.

0 votes

Hi there,

I am very much interested in using the parallel setup for the inverse transport computation. I have the most up-to-date installation of TM, mpi4py and mpi_map. I can also successfully run the Example 0 (direct transport) in the MPI tutorial:

# Example 0: Minimization of the KL-divergence and sampling
nprocs = 2
# Define target distribution
mu = 3.
beta = 4.
target_density = DIST.GumbelDistribution(mu,beta)
# Define base density
base_density = DIST.StandardNormalDistribution(1)
# Define approximating transport map
order = 5
tm_approx = TM.Default_IsotropicIntegratedExponentialTriangularTransportMap(1, order, 'full')
# Define approximating density
tm_density = DIST.PushForwardTransportMapDistribution(tm_approx, base_density)
# Start pool of processes
mpi_pool = TM.get_mpi_pool()
mpi_pool.start(nprocs)

# Solve and sample
try:
    qtype = 0      # Monte Carlo quadrature
    qparams = 1000 # Number of Monte Carlo points
    reg = None     # No regularization
    tol = 1e-8     # Optimization tolerance
    ders = 1       # Use gradient only
    log_entry_solve = tm_density.minimize_kl_divergence(
        target_density, qtype=qtype, qparams=qparams,
        regularization=reg, tol=tol, ders=ders,
        mpi_pool=mpi_pool) 
finally:
    mpi_pool.stop()
log_entry_solve
2019-03-20 13:21:46 WARNING:mpi_map: MPI_Pool_v2.alloc_dmem DEPRECATED since v>2.4. Use MPI_Pool_v2.bcast_dmem instead.
2019-03-20 13:21:46 WARNING:mpi_map: MPI_Pool_v2.alloc_dmem DEPRECATED since v>2.4. Use MPI_Pool_v2.bcast_dmem instead.
Out[16]:
{'success': True,
 'message': 'Optimization terminated successfully.',
 'fval': 1.4265539022688227,
 'nit': 40,
 'n_fun_ev': 41,
 'n_jac_ev': 41,
.......

Now when I try to replicate the same process for the inverse map estimation using the tutorial example on Gumbel distribution, I get an error:

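import numpy as np
from scipy import stats
import TransportMaps as TM
import TransportMaps.Maps as MAPS
import TransportMaps.Distributions as DIST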


class GumbelDistribution(DIST.Distribution):
    def __init__(self, mu, beta):
        super(GumbelDistribution,self).__init__(1)
        self.mu = mu
        self.beta = beta
        self.dist = stats.gumbel_r(loc=mu, scale=beta)
    def pdf(self, x, params=None):
        return self.dist.pdf(x).flatten()
    def quadrature(self, qtype, qparams, *args, **kwargs):
        if qtype == 0: # Monte-Carlo
            x = self.dist.rvs(qparams)[:,np.newaxis]
            w = np.ones(qparams)/float(qparams)
        else: raise ValueError("Quadrature not defined")
        return (x, w)

mu = 3.
beta = 4.
pi = GumbelDistribution(mu,beta)
x, w = pi.quadrature(0, 5000)

# linear adjustment
xmax = np.max(x)
xmin = np.min(x)
a = np.array([ 4*(xmin+xmax)/(xmin-xmax) ])
b = np.array([ 8./(xmax-xmin) ])
L = MAPS.FrozenLinearDiagonalTransportMap(a,b)


S = TM.Default_IsotropicIntegratedSquaredTriangularTransportMap(
    1, 3, 'total')
rho = DIST.StandardNormalDistribution(1)
push_L_pi = DIST.PushForwardTransportMapDistribution(L, pi)
push_SL_pi = DIST.PushForwardTransportMapDistribution(
    S, push_L_pi)

# Start pool of processes
nprocs = 2
mpi_pool = TM.get_mpi_pool()
mpi_pool.start(nprocs)

# Solve and sample
try:
    qtype = 0      # Monte-Carlo quadratures from pi
    qparams = 500  # Number of MC points
    reg = None     # No regularization
    tol = 1e-3     # Optimization tolerance
    ders = 2       # Use gradient and Hessian
    log = push_SL_pi.minimize_kl_divergence(
        rho, qtype=qtype, qparams=qparams, regularization=reg,
        tol=tol, ders=ders,mpi_pool=mpi_pool)

finally:
    mpi_pool.stop()
    
SL = MAPS.CompositeMap(S,L)
pull_SL_rho = DIST.PullBackTransportMapDistribution(SL, rho)
log
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-15-d1561f88f5fc> in <module>
     54     log = push_SL_pi.minimize_kl_divergence(
     55         rho, qtype=qtype, qparams=qparams, regularization=reg,
---> 56         tol=tol, ders=ders,mpi_pool=mpi_pool)
     57 
     58 finally:

~/anaconda3/lib/python3.7/site-packages/TransportMaps/Distributions/TransportMapDistributions.py in minimize_kl_divergence(self, tar, qtype, qparams, parbase, partar, x0, regularization, tol, maxit, ders, fungrad, hessact, batch_size, mpi_pool, grad_check, hess_check)
    525             tol=tol, maxit=maxit, ders=ders, fungrad=fungrad, hessact=hessact,
    526             batch_size=batch_size,
--> 527             mpi_pool=mpi_pool, grad_check=grad_check, hess_check=hess_check)
    528         return log
    529 

~/anaconda3/lib/python3.7/site-packages/TransportMaps/Maps/TriangularTransportMapBase.py in minimize_kl_divergence(self, d1, d2, qtype, qparams, x, w, params_d1, params_d2, x0, regularization, tol, maxit, ders, fungrad, hessact, precomp_type, batch_size, mpi_pool, grad_check, hess_check)
   1018             print('mpi_pool_list is '+str(mpi_pool_list)) # added by Hassan
   1019             for i, (a, avars, batch_size, mpi_pool) in enumerate(zip(
-> 1020                     self.approx_list, self.active_vars, batch_size_list, mpi_pool_list)):
   1021                 f = ProductDistributionParametricPullbackComponentFunction(
   1022                     a, d2.base_distribution.get_component([i]) )

TypeError: zip argument #4 must support iteration

I print out the argument #4 and get

mpi_pool_list is <mpi_map.misc.MPI_Pool_v2 object at 0x7fa08444b8d0>.
I have tried this with my own data in 2D and 3D and get the same error.
I have studied the tutorial on MPI, but I still cannot see where this comes from.
I would appreciate any suggestions or thoughts.
 
 
 
asked Mar 20, 2019 in usage by HassanAr (13 points)
edited Mar 25, 2019 by HassanAr

1 Answer

+1 vote
 
Best answer

Hi Hassan, I think that the documentation is lacking some explanation here.

When solving the problem \(\min_T \mathcal{D}_{\text{KL}}\left( T_\sharp \pi \Vert \rho \right) \), where \( \pi \) is your target from which you have samples and \( \rho \) is the Standard Normal, the code will:

  1. Flip the problem to solve \( \arg\min_T \mathcal{D}_{\text{KL}}\left( \pi \Vert T^\sharp \rho \right) = \arg\min_T \mathbb{E}_{\pi} \left[ -\log T^\sharp \rho \right] \approx \arg\min_T \sum_{i=1}^n -\log T^\sharp \rho({\bf x}_i) \). This is done in PushForwardTransportMapDistribution.minimize_kl_divergence.
  2. Then it will call MonotonicTriangularTransportMap.minimize_kl_divergence.
  3. If the distribution \(\rho\) is a product distribution (as the Standard Normal is), the map will be learned componentwise (see Section 4.2 of Marzouk et al.).
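To make step 1 concrete, here is a minimal pure-Python sketch of the sample-average objective for the simplest possible map, \(T(x) = a + b x\) with \(b > 0\) and \(\rho\) the Standard Normal, for which \(-\log T^\sharp \rho(x) = \tfrac{1}{2}(a + b x)^2 + \tfrac{1}{2}\log 2\pi - \log b\). It does not use TransportMaps at all; the function names are hypothetical and only illustrate what is being minimized. For this linear map the minimizer is known in closed form (it standardizes the samples), which the sketch uses as a sanity check.

```python
import math
import random
import statistics


def neg_log_pullback(a, b, xs):
    """Sample average of -log T^# rho(x) for T(x) = a + b*x and rho = N(0, 1)."""
    if b <= 0:
        raise ValueError("b must be positive so that the map is monotone")
    const = 0.5 * math.log(2.0 * math.pi)
    return sum(0.5 * (a + b * x) ** 2 + const - math.log(b) for x in xs) / len(xs)


def best_linear_map(xs):
    """Closed-form minimizer of the objective above: standardize the samples."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)  # population standard deviation (divides by n)
    return -m / s, 1.0 / s


# Gumbel(mu, beta) samples drawn by inverting the CDF
rng = random.Random(0)
mu, beta = 3.0, 4.0
xs = [mu - beta * math.log(-math.log(max(rng.random(), 1e-12))) for _ in range(500)]

a_opt, b_opt = best_linear_map(xs)
obj_opt = neg_log_pullback(a_opt, b_opt, xs)
print("a =", a_opt, "b =", b_opt, "objective =", obj_opt)
```

A linear map cannot turn a Gumbel into a Gaussian, of course; the higher-order maps in the library minimize the same kind of objective over a richer family.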
Because the map is learned componentwise, the mpi_pool argument is expected to be a list of \(d\) elements (\(d\) being the number of components of the map, i.e. the dimension of the problem), where each element is either None or an mpi_pool. This is why passing a single pool object triggers the zip error in your traceback. The list lets you handle a tradeoff: multiple processes give faster function/gradient evaluations, at the expense of a fixed communication cost. For the first components, which usually have few parameters and are cheap to evaluate, this communication cost is sometimes higher than the cost of the function/gradient evaluations themselves, so it does not make sense to use parallelism for them.
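As a rough illustration of the list-of-pools idea, the sketch below builds such a list and shows why a bare pool object fails when zipped against the components. The helper make_pool_list and the FakePool class are hypothetical stand-ins (FakePool replaces the object returned by TM.get_mpi_pool() so that the snippet runs without MPI); they are not part of TransportMaps or mpi_map.

```python
def make_pool_list(dim, pool, n_serial=1):
    """Return a list of length dim: None (serial) for the first n_serial
    components, and the shared pool for the remaining ones."""
    if not 0 <= n_serial <= dim:
        raise ValueError("n_serial must be between 0 and dim")
    return [None] * n_serial + [pool] * (dim - n_serial)


class FakePool:
    """Hypothetical stand-in for an MPI pool; it is deliberately not iterable."""


pool = FakePool()
components = ["S1", "S2", "S3"]  # placeholders for the d map components

# Passing the bare pool fails in zip, as in the traceback of the question
bare_pool_error = None
try:
    list(zip(components, pool))
except TypeError as err:
    bare_pool_error = err

# One entry per component: serial for the cheap ones, parallel for the last
pool_list = make_pool_list(len(components), pool, n_serial=2)
pairs = list(zip(components, pool_list))
print(pairs)
```

By the same logic, the 1D Gumbel example would presumably need a one-element list, e.g. mpi_pool=[mpi_pool], rather than the pool itself.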
 
I hope this clarifies it a little bit.
Daniele
answered Mar 21, 2019 by dabi (307 points)
selected Apr 1, 2019 by HassanAr

From your explanation I guessed that I have to pass 

mpi_pool=[None,None, ..., mpi_pool_i,mpi_pool_j]

to minimize_kl_divergence. Here is a 2D example:

BanDist = DIST.BananaDistribution(1, 1, np.zeros(2), np.array([[1., 0.9],[0.9, 1.]]))
x_,w_=BanDist.quadrature(0,500)

class DistributionFromSamples(DIST.Distribution):
    def __init__(self, x):
        super(DistributionFromSamples,self).__init__(np.size(x,1))
        self.x = x
    def quadrature(self, qtype, qparams, *args, **kwargs):
        if qtype == 0: # Monte-Carlo
            nmax = self.x.shape[0]
            if qparams> nmax:
                raise ValueError("Maximum sample size (%d) exceeded" % nmax)
            p = self.x[0:qparams,:]
            w = np.ones(qparams)/float(qparams) 
        else: raise ValueError("Quadrature not defined")
        return (p, w)
# create the instance for my sample    
pi2 = DistributionFromSamples(x_)
rho = DIST.StandardNormalDistribution(2)

order_= 4
S = TM.Default_IsotropicIntegratedSquaredTriangularTransportMap(2, order_, 'total')

mpi_pool2 = TM.get_mpi_pool()
mpi_pool2.start(2)

try:
    push_S_pi = DIST.PushForwardTransportMapDistribution(S,pi2)
    qtype = 0      # Monte-Carlo quadratures from pi
    qparams = 500  # Number of MC points = all available points
    reg = None     # No regularization
    tol = 1e-4     # Optimization tolerance
    ders = 2       # Use gradient and Hessian

    log = push_S_pi.minimize_kl_divergence(rho, qtype=qtype, qparams=qparams, 
                                            regularization=reg, tol=tol, ders=ders,maxit=300,mpi_pool=[None,mpi_pool2])
finally:
    mpi_pool2.stop()

and here is the output:

2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation: Optimization terminated successfully
2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation:   Function value:          1.411531
2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation:   Norm of the Jacobian:    0.000002
2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation:   Number of iterations:         4
2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation:   N. function evaluations:      5
2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation:   N. Jacobian evaluations:      8
2019-03-28 15:17:24 INFO: TM.MonotonicIntegratedSquaredApproximation:   N. Hessian evaluations:       4
2019-03-28 15:17:24 WARNING:mpi_map: MPI_Pool_v2.alloc_dmem DEPRECATED since v>2.4. Use MPI_Pool_v2.bcast_dmem instead.

The first component computation goes through, but the second one, which uses the mpi_pool, gets stuck and doesn't go beyond the warning (I terminated the longest attempt after 1 day). So I suspect that I have not understood the MPI syntax for inverse transport correctly. Is there a simple multidimensional example that I can use to compare against and build on?

Hi Hassan, I just tried to run your code on a fresh installation of TransportMaps v2.0b2 and everything seems to work. I get the warning but the optimization converges.

Can you try to activate the debugging mode and print me the output?

To do so you should lower the log level of both TransportMaps and mpi_map to 10. The following changes should do:

mpi_pool2 = TM.get_mpi_pool() 
mpi_pool2.start(2) 
mpi_pool2.set_log_level(10) 
TM.setLogLevel(10) 

It should output a lot of stuff.
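As a side note, 10 is the numeric value of DEBUG in Python's standard logging module, so the named constant can be used in place of the bare number. The sketch below only uses the standard library; since logging.DEBUG is the integer 10, passing it to set_log_level and TM.setLogLevel should be equivalent to the calls above.

```python
import logging

# Numeric values of the standard logging levels
levels = {name: getattr(logging, name)
          for name in ("DEBUG", "INFO", "WARNING", "ERROR")}
print(levels)

# A logger set to DEBUG emits everything from DEBUG upwards
logger = logging.getLogger("demo")
logger.setLevel(logging.DEBUG)
print(logger.isEnabledFor(logging.DEBUG))
```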

Thanks,

 Daniele

Hey Dabi,

Here is the output. It was stuck on the last line for 2 hours before I reported it here:

2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: set_log_level: broadcast [before]
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: bcast #1
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: bcast #2
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: bcast #3
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: bcast #4
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: bcast #5
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: set_log_level: broadcast [after]
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: set_lot_level: gather [before]
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: gather #1
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: gather #2
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: gather #3
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: gather #4
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: gather #5
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: gather #6
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: set_log_level: gather [after]
ha: mpi_pool_list is [None, <mpi_map.misc.MPI_Pool_v2 object at 0x7fc3f433d0b8>]
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: minimize_kl_divergence_component(): Precomputation started
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: minimize_kl_divergence(): Precomputation ended
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Obj. Eval. 1 - KL-divergence = 1.4239964923e+00
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 1 - ||grad_a KLdiv|| = 6.1173653638e-02
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Hess_a Obj. Eval. 1 
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Obj. Eval. 2 - KL-divergence = 1.4199478754e+00
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 2 - ||grad_a KLdiv|| = 2.5119115192e-02
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: Iteration 1
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 3 - ||grad_a KLdiv|| = 2.5119115192e-02
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Hess_a Obj. Eval. 2 
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Obj. Eval. 3 - KL-divergence = 1.4193971463e+00
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 4 - ||grad_a KLdiv|| = 2.8941057344e-03
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: Iteration 2
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 5 - ||grad_a KLdiv|| = 2.8941057344e-03
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Hess_a Obj. Eval. 3 
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Obj. Eval. 4 - KL-divergence = 1.4193770145e+00
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 6 - ||grad_a KLdiv|| = 4.4277864446e-06
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: Iteration 3
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 7 - ||grad_a KLdiv|| = 4.4277864446e-06
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Hess_a Obj. Eval. 4 
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Obj. Eval. 5 - KL-divergence = 1.4193770144e+00
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: KL Grad_a Obj. Eval. 8 - ||grad_a KLdiv|| = 1.2549354879e-11
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: Iteration 4
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation: Optimization terminated successfully
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation:   Function value:          1.419377
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation:   Norm of the Jacobian:    0.000004
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation:   Number of iterations:         4
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation:   N. function evaluations:      5
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation:   N. Jacobian evaluations:      8
2019-03-29 14:00:38 INFO: TM.MonotonicIntegratedSquaredApproximation:   N. Hessian evaluations:       4
2019-03-29 14:00:38 DEBUG: TM.MonotonicIntegratedSquaredApproximation: minimize_kl_divergence_component(): Precomputation started
2019-03-29 14:00:38 WARNING:mpi_map: MPI_Pool_v2.alloc_dmem DEPRECATED since v>2.4. Use MPI_Pool_v2.bcast_dmem instead.
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: alloc_dmem: barrier
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: alloc_dmem: bcast [before]
2019-03-29 14:00:38 DEBUG: mpi_map.MPI_Pool_v2: stop: barrier

 

Ok, I think there must be an exception that is not being caught. The program then jumps to the "finally" statement and calls for the mpi_pool to stop, but this call arrives in the middle of another MPI command that is being sent (alloc_dmem), so things deadlock. This of course should not happen, and I will look into it later.

Instead of the simple try/finally statement, can you try to do:

try:
    ...
except Exception as e:
    print(e)
finally:
    mpi_pool2.stop()

This will show the exception being raised. My bet is that something is not being serialized, because the "alloc_dmem: bcast" log is placed just before the serialization of whatever needs to be allocated in the distributed memory.
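If serialization is the suspect, one way to narrow it down is to try pickling the attributes of the objects involved one by one. The helper below is a hypothetical sketch using only the standard library, not something provided by TransportMaps or mpi_map; FakeDistribution and its threading.Lock attribute are stand-ins for whatever object ends up holding a non-picklable handle.

```python
import pickle
import threading


def find_unpicklable(obj):
    """Return {attribute name: error message} for the attributes of obj
    that pickle refuses to serialize."""
    failures = {}
    for name, value in vars(obj).items():
        try:
            pickle.dumps(value)
        except Exception as err:
            failures[name] = f"{type(err).__name__}: {err}"
    return failures


class FakeDistribution:
    """Stand-in object: one picklable attribute and one that is not."""

    def __init__(self):
        self.samples = [0.1, 0.2, 0.3]
        self.handle = threading.Lock()  # locks cannot be pickled


report = find_unpicklable(FakeDistribution())
for name, message in report.items():
    print(name, "->", message)
```

The check is shallow on purpose: it reports which top-level attribute fails, and it can be applied again to that attribute to dig deeper.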

It is strange that things seem to work on the direct map case though.

If you are able to print the exception, maybe we'll be able to figure out what is going on.
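Since print(e) only shows the message, printing the full traceback may help locate where the exception originates. The sketch below uses the standard traceback module; FakePool and failing_solve are hypothetical stand-ins for the MPI pool and the call to minimize_kl_divergence, so the snippet runs without MPI.

```python
import traceback


class FakePool:
    """Hypothetical stand-in for the MPI pool."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True


def failing_solve():
    # Stand-in for the solver call that raises inside the try block
    raise TypeError("can't pickle demo objects")


pool = FakePool()
report = None
try:
    failing_solve()
except Exception:
    # format_exc() returns the full stack trace as a string
    report = traceback.format_exc()
    print(report)
finally:
    pool.stop()
```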

Thanks.

Hey Dabi, here are the last lines of the output with the except clause added, including the exception:

.....
2019-03-30 11:13:45 DEBUG: TM.MonotonicIntegratedSquaredApproximation: minimize_kl_divergence_component(): Precomputation started
2019-03-30 11:13:45 WARNING:mpi_map: MPI_Pool_v2.alloc_dmem DEPRECATED since v>2.4. Use MPI_Pool_v2.bcast_dmem instead.
2019-03-30 11:13:45 DEBUG: mpi_map.MPI_Pool_v2: alloc_dmem: barrier
2019-03-30 11:13:45 DEBUG: mpi_map.MPI_Pool_v2: alloc_dmem: bcast [before]
can't pickle mpi4py.MPI.Intercomm objects
2019-03-30 11:13:45 DEBUG: mpi_map.MPI_Pool_v2: stop: barrier

PS: I get the same output when I use a fresh installation on Mac.

Yes, it is trying to serialize something that shouldn't be serialized. It is strange that this error does not pop up on my machine. Can I ask which versions of TM, mpi_map, Python and OS you are using?
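A quick way to gather this information is sketched below. It assumes Python 3.8 or newer (for importlib.metadata; older interpreters would need the importlib_metadata backport) and assumes the installed distribution names are TransportMaps, mpi_map and mpi4py. Packages that are not found are simply reported as such.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Return the Python, OS and package versions as a dictionary."""
    info = {"python": platform.python_version(), "os": platform.platform()}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


# Distribution names are assumptions; adjust them to your installation
versions = collect_versions(["TransportMaps", "mpi_map", "mpi4py"])
for key, value in versions.items():
    print(f"{key}: {value}")
```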

Meanwhile I will roll out a fix to this issue.

Hi Hassan, I just rolled out a fix in v2.0b3

Try to

pip install --upgrade TransportMaps

Hopefully it fixes the issue (and does not introduce any new bugs). The issue has also been addressed in v3.0, which is to be released soon. Thanks for pointing it out.

Yay! It works now, thank you!

Here are the versions prior to upgrading TM:

TransportMaps 2.0b2, mpi4py 3.0.1, mpi_map 2.5, Python 3.7.1 on Ubuntu 16.04
...