pith. sign in

arxiv: 1906.10613 · v1 · pith:B6V3REUKnew · submitted 2019-06-25 · 🧮 math.NA · cs.DC· cs.NA

Advances in Implementation, Theoretical Motivation, and Numerical Results for the Nested Iteration with Range Decomposition Algorithm

Pith reviewed 2026-05-25 16:30 UTC · model grok-4.3

classification 🧮 math.NA cs.DCcs.NA
keywords nested iterationrange decompositionelliptic PDEparallel computinglow communicationconvergence analysisnumerical methodsadaptive preprocessing
0
0 comments X

The pith

The nested iteration with range decomposition algorithm converges elliptic PDEs to high accuracy in one or two iterations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper advances the nested iteration with range decomposition algorithm for solving elliptic partial differential equations on high-performance parallel computers. It introduces improvements such as adaptive preprocessing, additional partitioning functions, and modified error measurement to handle more difficult problems. An updated convergence analysis rests on heuristic assumptions backed by numerical evidence, and a new performance model shows larger gains when traditional solvers are expensive. Testing across varied elliptic PDEs shows the method reaches target accuracy in a fixed small number of iterations.

Core claim

NIRD achieves excellent convergence on a wide class of elliptic PDEs within a small fixed number of iterations, usually one or two, after algorithmic enhancements including adaptivity and broader partitioning choices, with an updated proof based on heuristic assumptions supported by numerical evidence and a performance model that predicts greater benefits for complex problems.

What carries the argument

The nested iteration with range decomposition algorithm (NIRD), which decomposes the solution space to allow low-communication iterations that converge rapidly.

If this is right

  • NIRD requires only one or two iterations for high accuracy on many elliptic problems after the described updates.
  • The new performance model predicts increasing advantages over traditional methods as problems grow more expensive to solve.
  • The algorithm's reduced communication needs support efficient use on large parallel computers.
  • Adaptivity in preprocessing and wider partitioning choices extend reliable accuracy to more difficult elliptic PDEs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the heuristics continue to hold, the method could be applied to time-dependent or nonlinear problems where repeated solves are needed.
  • The low-communication property suggests testing NIRD on distributed systems with thousands of cores to measure actual speedup.
  • The performance model implies NIRD may reduce total cost in applications that solve elliptic PDEs many times, such as optimization loops.

Load-bearing premise

The fixed small number of iterations for convergence depends on heuristic assumptions about the range decomposition that are supported by numerics but not rigorously proven for every case.

What would settle it

Numerical tests on an elliptic PDE where the updated NIRD requires more than two iterations to reach the target accuracy would show the fixed-iteration property does not hold.

Figures

Figures reproduced from arXiv: 1906.10613 by Tom Manteuffel, Wayne Mitchell.

Figure 1
Figure 1. Figure 1: Comparison of NIRD convergence with and without an adaptive preprocessing step using 1,024 [PITH_FULL_IMAGE:figures/full_fig_p007_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Visualization of example characteristic functions (top) and the resulting subproblem refinement [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Convergence of the LSF for the subproblems on each of 16 processors (note that many of the curves [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Plot of model predictions of number of communications vs. number of processors for NI vs. [PITH_FULL_IMAGE:figures/full_fig_p017_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: NIRD convergence for Poisson with a smooth right-hand side using a discontinuous PoU. [PITH_FULL_IMAGE:figures/full_fig_p018_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Compare NIRD convergence using different PoU’s for Poisson with a smooth right-hand side and [PITH_FULL_IMAGE:figures/full_fig_p019_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Solutions and associated adaptive meshes for the advection-diffusion problem with positive [PITH_FULL_IMAGE:figures/full_fig_p020_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: NIRD convergence for advection-diffusion with a smooth right-hand side and a boundary layer [PITH_FULL_IMAGE:figures/full_fig_p020_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: An example subproblem mesh (left) and side-view of the solution (right) for a processor far from [PITH_FULL_IMAGE:figures/full_fig_p021_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: NIRD convergence for jump-coefficient Poisson with a smooth right-hand side using a discon [PITH_FULL_IMAGE:figures/full_fig_p022_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Compare NIRD convergence using different PoU’s for anisotropic Poisson with a smooth right [PITH_FULL_IMAGE:figures/full_fig_p022_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Solution to the wave front problem used for testing higher-order elements. [PITH_FULL_IMAGE:figures/full_fig_p023_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: NIRD convergence using a discontinuous PoU, 256 processors, and varying the polynomial order [PITH_FULL_IMAGE:figures/full_fig_p024_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: NIRD subproblem convergence using a discontinuous PoU, 16 processors, and varying the [PITH_FULL_IMAGE:figures/full_fig_p025_14.png] view at source ↗
read the original abstract

This paper studies a low-communication algorithm for solving elliptic partial differential equations (PDE's) on high-performance machines, the nested iteration with range decomposition algorithm (NIRD). Previous work has shown that NIRD converges to a high level of accuracy within a small, fixed number of iterations (usually one or two) when applied to simple elliptic problems. This paper makes some improvements to the NIRD algorithm (including the addition of adaptivity during preprocessing, wider choice of partitioning functions, and modified error measurement) that enhance the method's accuracy and scalability, especially on more difficult problems. In addition, an updated convergence proof is presented based on heuristic assumptions that are supported by numerical evidence. Furthermore, a new performance model is developed that shows increased performance benefits for NIRD when problems are more expensive to solve using traditional methods. Finally, extensive testing on a variety of elliptic problems provides additional insight into the behavior of NIRD and additional evidence that NIRD achieves excellent convergence on a wide class of elliptic PDE's and, as such, should be a very competitive method for solving PDE's on large parallel computers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript describes improvements to the Nested Iteration with Range Decomposition (NIRD) algorithm for solving elliptic PDEs on parallel computers. These include adding adaptivity in preprocessing, expanding the choice of partitioning functions, modifying error measurement, presenting an updated convergence proof based on heuristic assumptions supported by numerical evidence, developing a new performance model, and providing extensive numerical tests on a variety of elliptic problems.

Significance. If the heuristic assumptions hold, the work would indicate that NIRD can achieve high accuracy in a fixed small number of iterations (typically 1-2) across a wide class of elliptic PDEs while reducing communication costs, making it potentially competitive for large-scale parallel computations; the performance model and numerical results offer practical insights into scalability advantages over traditional methods for more expensive problems.

major comments (1)
  1. [Abstract and convergence analysis] Abstract and convergence section: The central claim that NIRD converges to high accuracy in one or two iterations for a wide class of elliptic PDEs rests on an updated convergence proof that explicitly relies on heuristic assumptions verified only by the paper's own numerical tests, without providing rigorous derivation or error bounds independent of the examples. This is load-bearing because failure of any heuristic (e.g., on range decomposition or error propagation) would invalidate the fixed-iteration guarantee.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback on our manuscript. We address the single major comment below.

read point-by-point responses
  1. Referee: [Abstract and convergence analysis] Abstract and convergence section: The central claim that NIRD converges to high accuracy in one or two iterations for a wide class of elliptic PDEs rests on an updated convergence proof that explicitly relies on heuristic assumptions verified only by the paper's own numerical tests, without providing rigorous derivation or error bounds independent of the examples. This is load-bearing because failure of any heuristic (e.g., on range decomposition or error propagation) would invalidate the fixed-iteration guarantee.

    Authors: We agree that the convergence analysis is heuristic rather than fully rigorous. The updated section explicitly frames the argument in terms of assumptions on range decomposition and error propagation, motivated by the algorithm structure, and validates them through numerical evidence on a variety of elliptic PDEs. A rigorous proof with example-independent bounds for general elliptic operators is an open problem and lies outside the scope of the present work. To address the concern about potential overstatement, we will revise the abstract and convergence section to state more explicitly that the fixed-iteration guarantee is supported by the heuristic analysis and extensive numerical validation, rather than a complete proof, and to discuss the tested scope of the assumptions. revision: yes

Circularity Check

1 steps flagged

Convergence proof rests on heuristic assumptions supported by the paper's own numerical evidence.

specific steps
  1. other [Abstract]
    "an updated convergence proof is presented based on heuristic assumptions that are supported by numerical evidence"

    The claim that NIRD converges to high accuracy in a fixed small number of iterations (usually one or two) for a wide class of elliptic PDEs is justified via a proof whose key assumptions are validated only by the paper's numerical tests, so the convergence 'prediction' reduces to evidence from the same runs rather than independent derivation.

full rationale

The abstract explicitly states that the updated convergence proof relies on heuristic assumptions supported by numerical evidence rather than rigorous bounds. This creates moderate circularity because the load-bearing claim (fixed small number of iterations for wide class of PDEs) depends on assumptions whose support comes from the same numerical tests used to demonstrate the method's behavior. No self-definitional, self-citation load-bearing, or other enumerated circular patterns are exhibited in the provided text; the algorithm improvements and performance model appear independent. This matches the expected moderate score for heuristic-supported claims without reducing the derivation to a fit by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on heuristic assumptions in the convergence proof and on the numerical evidence generated by the authors' own test suite; no free parameters are explicitly named in the abstract, but the performance model implicitly depends on cost ratios that are problem-dependent.

axioms (1)
  • domain assumption Heuristic assumptions underlying the convergence proof are valid for the tested elliptic problems
    Abstract states the proof is based on heuristic assumptions supported by numerical evidence

pith-pipeline@v0.9.0 · 5730 in / 1371 out tokens · 32800 ms · 2026-05-25T16:30:55.114248+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · 1 internal anchor

  1. [1]

    A low-communication, parallel algorithm for solving PDEs based on range decomposition

    D J Appelhans and Thomas A Manteuffel. A low-communication, parallel algorithm for solving PDEs based on range decomposition. Numerical Linear Algebra with Applications , 2016

  2. [2]

    Modeling the performance of an algebraic multigrid cycle on HPC platforms

    Hormozd Gahvari, Allison H Baker, Martin Schulz, Ulrike Meier Yang, Kirk E Jordan, and William Gropp. Modeling the performance of an algebraic multigrid cycle on HPC platforms . ACM, New York, New York, USA, May 2011

  3. [3]

    A Survey of Parallelization Techniques for Multigrid Solvers

    E Chow, Robert D Falgout, Jonathan Joseph Hu, Raymond S Tuminaro, and U M Yang. A Survey of Parallelization Techniques for Multigrid Solvers. pages 1–23. August 2010. 26

  4. [4]

    Node Aware Sparse Matrix-Vector Multiplication

    Amanda Bienz, William D. Gropp, and Luke N. Olson. Tapspmv: Topology-aware parallel sparse matrix vector multiplication. CoRR, abs/1612.08060, 2016

  5. [5]

    An assumed partition algorithm for determining processor inter-communication

    Allison H Baker, Robert D Falgout, and U M Yang. An assumed partition algorithm for determining processor inter-communication. Parallel Computing, 32(5-6):394–414, 2006

  6. [6]

    Reducing Complexity in Parallel Algebraic Multigrid Preconditioners

    Hans De Sterck, Ulrike Meier Yang, and Jeffrey J Heys. Reducing Complexity in Parallel Algebraic Multigrid Preconditioners. SIAM Journal on Matrix Analysis and Applications , 27(4):1019–1039, Jan- uary 2006

  7. [7]

    Distance-two interpo- lation for parallel algebraic multigrid

    Hans De Sterck, Robert D Falgout, Joshua W Nolting, and Ulrike Meier Yang. Distance-two interpo- lation for parallel algebraic multigrid. Numerical Linear Algebra with Applications , 15(2-3):115–139, March 2008

  8. [8]

    Parallel adaptive multilevel methods with full domain partitions

    William F Mitchell. Parallel adaptive multilevel methods with full domain partitions. Applied Numer- ical Analysis & Computational . . . , 2004

  9. [9]

    A parallel multigrid method using the full domain partition

    William F Mitchell. A parallel multigrid method using the full domain partition. pages 1–10, November 2016

  10. [10]

    The full domain partition approach to distributing adaptive grids

    William F Mitchell. The full domain partition approach to distributing adaptive grids. Applied Nu- merical Mathematics, 26(1-2):265–275, January 1998

  11. [11]

    A new paradigm for parallel adaptive meshing algorithms

    Randolph E Bank and M Holst. A new paradigm for parallel adaptive meshing algorithms. SIAM Journal on Scientific Computing , 1999

  12. [12]

    A Domain Decomposition Solver for a Parallel Adaptive Meshing Paradigm

    Randolph E Bank and Shaoying Lu. A Domain Decomposition Solver for a Parallel Adaptive Meshing Paradigm. SIAM Journal on Scientific Computing , 26(1):105–127, July 2006

  13. [13]

    First-order system least squares for second-order partial differential equations: Part I

    Zhiqiang Cai, R Lazarov, Thomas A Manteuffel, and Stephen F McCormick. First-order system least squares for second-order partial differential equations: Part I. SIAM Journal on Numerical Analysis , 31(6):1785–1799, 1994

  14. [14]

    First-Order System Least Squares for Second-Order Partial Differential Equations: Part II

    Zhiqiang Cai, Thomas A Manteuffel, and Stephen F McCormick. First-Order System Least Squares for Second-Order Partial Differential Equations: Part II. SIAM Journal on Numerical Analysis, 34(2):425– 454, April 1997

  15. [15]

    Efficiency-based h- and hp-refinement strategies for finite element methods

    Hans De Sterck, Thomas A Manteuffel, Stephen F McCormick, J Nolting, J W Ruge, and L Tang. Efficiency-based h- and hp-refinement strategies for finite element methods. Numerical Linear Algebra with Applications, 15(2-3):89–114, 2008

  16. [16]

    Efficiency Based Adaptive Local Refinement for First-Order System Least-Squares Formulations.SIAM Journal on Scientific Computing , 33(1):1–24, January 2011

    J H Adler, Thomas A Manteuffel, Stephen F McCormick, J W Nolting, J W Ruge, and L Tang. Efficiency Based Adaptive Local Refinement for First-Order System Least-Squares Formulations.SIAM Journal on Scientific Computing , 33(1):1–24, January 2011

  17. [17]

    A Multigrid Tutorial, 2nd

    W F Briggs, V E Henson, and Stephen F McCormick. A Multigrid Tutorial, 2nd. Edition. SIAM, 2000

  18. [18]

    A two-dimensional interpolation function for irregularly-spaced data

    Donald Shepard. A two-dimensional interpolation function for irregularly-spaced data. In the 1968 23rd ACM national conference, pages 517–524, New York, New York, USA, 1968. ACM Press

  19. [19]

    Error Estimates for Adaptive Finite Element Computations

    I Babuvˇ ska and W C Rheinboldt. Error Estimates for Adaptive Finite Element Computations. SIAM Journal on Numerical Analysis , 15(4):736–754, 1978

  20. [20]

    Adaptive techniques in the finite element method

    J Z Zhu and O C Zienkiewicz. Adaptive techniques in the finite element method. International Journal for Numerical Methods in Biomedical Engineering , 4(2):197–204, March 1988

  21. [21]

    A collection of 2D elliptic problems for testing adaptive grid refinement algorithms

    William F Mitchell. A collection of 2D elliptic problems for testing adaptive grid refinement algorithms. Applied mathematics and computation , 220:350–364, September 2013. 27 P Cs ˜Cs Cρ ˆQ Cb ˜Cb 64 6.89 1.41 29.90 1.83E-04 2.78E+03 1.82 256 49.01 1.93 64.75 1.39E-06 1.18E+05 1.52 1024 258.10 2.05 73.41 2.87E-07 2.25E+06 2.09 Table 1: Measured constants ...