Parallelization of the inverse fast multipole method with an application to boundary element method

Chao Chen; Eric Darve; Toru Takahashi


We present an algorithm to parallelize the inverse fast multipole method (IFMM), which is an approximate direct solver for dense linear systems. The parallel scheme is based on a greedy coloring algorithm, where two nodes in the hierarchy with the same color are separated by at least $\sigma$ nodes. We proved that when $\sigma \ge 6$, the workload associated with one color is embarrassingly parallel. However, the number of nodes in a group (color) may be small when $\sigma = 6$. Therefore, we also explored $\sigma = 3$, for which a small fraction of the algorithm must be serialized but the overall parallel efficiency improved. We implemented the parallel IFMM using OpenMP for shared-memory machines. We then applied it as a preconditioner to a fast-multipole accelerated boundary element method (FMBEM), and compared its efficiency with (a) the original IFMM parallelized by linking a multi-threaded linear algebra library and (b) the commonly used parallel block-diagonal preconditioner. Our results showed that our parallel IFMM achieved speedups of up to $4\times$ and $11\times$ over reference methods (a) and (b), respectively, in realistic examples involving more than one million variables.
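The coloring idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the nodes of a single level form a uniform $n \times n$ grid, measures separation with the Chebyshev distance between grid indices, and reads "separated by at least $\sigma$ nodes" as "index distance greater than $\sigma$". The function names `greedy_coloring` and `groups_by_color` are hypothetical.

```python
from itertools import product


def greedy_coloring(n, sigma):
    """Greedily color an n-by-n grid of cells (an assumed stand-in for one
    level of the hierarchy) so that two cells sharing a color are more than
    `sigma` apart in Chebyshev distance."""
    colors = {}
    for i, j in product(range(n), repeat=2):
        # Collect colors already used by cells within distance sigma.
        taken = set()
        for di, dj in product(range(-sigma, sigma + 1), repeat=2):
            neighbor = (i + di, j + dj)
            if neighbor in colors:
                taken.add(colors[neighbor])
        # Pick the smallest color not used nearby.
        c = 0
        while c in taken:
            c += 1
        colors[(i, j)] = c
    return colors


def groups_by_color(colors):
    """Group cells by color; each group is a candidate batch of work
    that could be processed concurrently."""
    groups = {}
    for cell, c in colors.items():
        groups.setdefault(c, []).append(cell)
    return groups


if __name__ == "__main__":
    groups = groups_by_color(greedy_coloring(n=8, sigma=3))
    for c in sorted(groups):
        print(c, len(groups[c]))
```

In this toy setting, a larger `sigma` yields more colors and fewer cells per color, which mirrors the trade-off described in the abstract between $\sigma = 6$ and $\sigma = 3$.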

