pith. sign in

arxiv: 1406.1974 · v1 · pith:6A6XOEWXnew · submitted 2014-06-08 · 💻 cs.DC · cs.NA

Communication Complexity of the Fast Multipole Method and its Algebraic Variants

classification 💻 cs.DC cs.NA
keywords fastmultipolehierarchicalmethodscommunicationcomplexitydataoperators
0
0 comments X
read the original abstract

A combination of hierarchical tree-like data structures and data access patterns from fast multipole methods and hierarchical low-rank approximation of linear operators from H-matrix methods appears to form an algorithmic path forward for efficient implementation of many linear algebraic operations of scientific computing at the exascale. The combination provides asymptotically optimal computational and communication complexity and applicability to large classes of operators that commonly arise in scientific computing applications. A convergence of the mathematical theories of the fast multipole and H-matrix methods has been underway for over a decade. We recap this mathematical unification and describe implementation aspects of a hybrid of these two compelling hierarchical algorithms on hierarchical distributed-shared memory architectures, which are likely to be the first to reach the exascale. We present a new communication complexity estimate for fast multipole methods on such architectures. We also show how the data structures and access patterns of H-matrices for low-rank operators map onto those of fast multipole, leading to an algebraically generalized form of fast multipole that compromises none of its architecturally ideal properties.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A Simple Communication Scheme for Distributed Fast Multipole Methods

    cs.DC 2026-04 unverdicted novelty 4.0

    A simple MPI-based scheme for distributed uniform-tree FMMs achieves weak scaling to 3.2e10 points on 512 nodes while preserving shared-memory optimizations.