Scalable Domain-decomposed Monte Carlo Neutral Transport for Nuclear Fusion
Pith reviewed 2026-05-18 00:08 UTC · model grok-4.3
The pith
A domain-decomposed Monte Carlo algorithm enables scalable neutral transport simulations on grids exceeding single-node memory limits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The DDMC algorithm in Eiron outperforms the two EIRENE parallelization methods in nearly all strong scaling tests, scales superlinearly on Mahti for grids that do not fit into a 4 MiB L3 cache slice, and reaches weak scaling efficiencies of 45% (high-collisional case) and 26% (low-collisional case) at 16384 cores.
What carries the argument
Domain-decomposed Monte Carlo (DDMC) algorithm that partitions the simulation domain across multiple compute nodes to distribute memory and computation for neutral particle transport.
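The partitioning idea can be sketched in a few lines. The example below is a minimal, serial emulation under stated assumptions, not Eiron's implementation: the 1D grid, the random-walk "physics", the function name `run_ddmc_sketch`, and all parameters are hypothetical. It only illustrates that each domain stores its own slice of the tally array and that particles leaving a slice are handed to the neighbouring domain's inbox.

```python
import random
from collections import deque


def run_ddmc_sketch(n_cells=64, n_domains=4, n_particles=2000,
                    absorb_prob=0.1, seed=1234):
    """Serially emulate domain-decomposed particle tracking on a 1D grid.

    Hypothetical toy model: particles random-walk between neighbouring
    cells and are absorbed with probability absorb_prob per step.
    """
    assert n_cells % n_domains == 0
    width = n_cells // n_domains
    rng = random.Random(seed)

    # Per-domain storage: no domain ever holds the full grid.
    tallies = [[0] * width for _ in range(n_domains)]
    inboxes = [deque() for _ in range(n_domains)]

    # Source: each particle is born in a uniformly chosen cell and is
    # queued on the domain that owns that cell.
    for _ in range(n_particles):
        cell = rng.randrange(n_cells)
        inboxes[cell // width].append(cell)

    handoffs = 0
    # Sweep the domains until every inbox is empty. A real code would
    # run domains concurrently and detect termination by messages.
    while any(inboxes):
        for d in range(n_domains):
            lo = d * width
            while inboxes[d]:
                cell = inboxes[d].popleft()
                while True:
                    tallies[d][cell - lo] += 1      # score a cell visit
                    if rng.random() < absorb_prob:
                        break                       # absorbed
                    cell += rng.choice((-1, 1))
                    if cell < 0 or cell >= n_cells:
                        break                       # left the grid
                    if not lo <= cell < lo + width:
                        # Crossed a domain boundary: hand off to the owner.
                        inboxes[cell // width].append(cell)
                        handoffs += 1
                        break
    return tallies, handoffs


if __name__ == "__main__":
    tallies, handoffs = run_ddmc_sketch()
    print("visits per domain:", [sum(t) for t in tallies])
    print("boundary handoffs:", handoffs)
```

The handoff count is a rough proxy for the communication volume that a distributed version would incur, which is one reason the collisionality of the test case matters for scaling.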
If this is right
- DDMC enables neutral transport simulations on larger grids than single-node codes allow.
- Strong scaling is superlinear on Mahti when the grid data do not fit into a 4 MiB L3 cache slice.
- Weak scaling to 16384 cores achieves 45% efficiency in the high-collisional case and 26% in the low-collisional case.
- Implementing DDMC in EIRENE would improve performance and remove memory limits for fusion simulations.
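For readers less familiar with the scaling vocabulary used above, the following sketch spells out the usual textbook definitions. The function names are hypothetical, and the timings in the usage section are made-up illustrative values, not measurements from the paper.

```python
def strong_scaling_speedup(t_ref, t_p):
    """Speedup for a fixed total problem size: reference time / time on p cores."""
    return t_ref / t_p


def strong_scaling_efficiency(t_ref, p_ref, t_p, p):
    """Speedup divided by the ideal speedup p / p_ref.

    A value above 1.0 means superlinear scaling, which can happen when
    the per-core working set shrinks enough to fit into cache.
    """
    return strong_scaling_speedup(t_ref, t_p) / (p / p_ref)


def weak_scaling_efficiency(t_ref, t_p):
    """Work per core is held fixed, so the ideal runtime stays constant."""
    return t_ref / t_p


if __name__ == "__main__":
    # Illustrative numbers only (hypothetical, not from the paper).
    print(strong_scaling_efficiency(t_ref=100.0, p_ref=1, t_p=10.0, p=8))
    print(weak_scaling_efficiency(t_ref=9.0, t_p=20.0))
```

With these definitions, a weak scaling efficiency of 45% means the largest run took a little more than twice as long as the reference run with the same work per core.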
Where Pith is reading between the lines
- This method could be adapted to other Monte Carlo solvers in plasma physics for similar scalability gains.
- Load balancing strategies might further improve the weak scaling efficiencies observed.
- Accuracy checks on small grids would confirm no numerical artifacts from domain decomposition.
Load-bearing premise
That performance gains from domain decomposition persist in full production runs without major load imbalance or accuracy loss.
What would settle it
Running DDMC and a single-domain method on an identical small grid and checking whether neutral densities or other transport tallies differ by more than their combined statistical uncertainties.
Original abstract
EIRENE [1] is a Monte Carlo neutral transport solver heavily used in the fusion community. EIRENE does not implement domain decomposition, making it impossible to use for simulations where the grid data does not fit on one compute node (see e.g. [2]). This paper presents a domain-decomposed Monte Carlo (DDMC) algorithm implemented in a new open source Monte Carlo code, Eiron. Two parallel algorithms currently used in EIRENE are also implemented in Eiron, and the three algorithms are compared by running strong scaling tests, with DDMC performing better than the other two algorithms in nearly all cases. On the supercomputer Mahti [3], DDMC strong scaling is superlinear for grids that do not fit into an L3 cache slice (4 MiB). The DDMC algorithm is also scaled up to 16384 cores in weak scaling tests, with a weak scaling efficiency of 45% in a high-collisional (heavier compute load) case, and 26% in a low-collisional (lighter compute load) case. We conclude that implementing this domain decomposition algorithm in EIRENE would improve performance and enable simulations that are currently impossible due to memory constraints.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a domain-decomposed Monte Carlo (DDMC) algorithm for neutral transport implemented in the new open-source code Eiron. Two parallel algorithms currently used in EIRENE are also implemented in Eiron for comparison. The paper reports strong-scaling results on the Mahti supercomputer, where DDMC outperforms the alternatives in nearly all cases and scales superlinearly for grids that do not fit into a 4 MiB L3 cache slice, plus weak-scaling tests up to 16384 cores with 45% efficiency in a high-collisional regime and 26% in a low-collisional regime. The authors conclude that porting the DDMC approach to EIRENE would improve performance and enable larger simulations currently blocked by per-node memory limits.
Significance. If the DDMC implementation preserves statistical equivalence to existing EIRENE algorithms, the concrete wall-clock scaling data obtained on real hardware up to 16384 cores (including documented superlinear cache-driven behavior) would constitute a useful advance for the fusion modeling community by removing single-node memory barriers. The direct performance measurements rather than fitted models are a positive aspect of the work.
major comments (2)
- [Numerical results] Numerical results section: The strong- and weak-scaling figures and tables report only wall-clock times and efficiencies; no tally comparisons, relative differences in physical quantities (neutral densities, reaction rates, etc.), or statistical error budgets versus a serial reference or the EIRENE-derived implementations are provided. This is load-bearing for the central recommendation to port DDMC to EIRENE, because any systematic bias or correlation introduced by domain-decomposed particle tracking or random-number handling would make the observed speedups unusable in production.
- [Test cases / Methods] Test-case and validation description: The manuscript does not specify how the high- and low-collisional test cases were chosen or whether they are representative of production fusion grids, nor does it report convergence tests or error-bar analyses that would confirm DDMC produces statistically equivalent results. Without these checks the performance claims cannot be fully assessed for practical adoption.
minor comments (2)
- [Abstract] Abstract: Adding one sentence on the presence or absence of physical-result validation would give readers immediate context for the performance numbers.
- [Figures and tables] Figure captions and table headings should explicitly state the number of independent Monte Carlo runs and the statistical uncertainty measure used for each timing datum.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments highlight important aspects of validation that strengthen the manuscript's contribution. We address each major comment below and have revised the manuscript to incorporate additional comparisons, clarifications, and analyses as detailed in the point-by-point responses.
Point-by-point responses
-
Referee: [Numerical results] Numerical results section: The strong- and weak-scaling figures and tables report only wall-clock times and efficiencies; no tally comparisons, relative differences in physical quantities (neutral densities, reaction rates, etc.), or statistical error budgets versus a serial reference or the EIRENE-derived implementations are provided. This is load-bearing for the central recommendation to port DDMC to EIRENE, because any systematic bias or correlation introduced by domain-decomposed particle tracking or random-number handling would make the observed speedups unusable in production.
Authors: We agree that explicit verification of statistical equivalence is essential to support the recommendation for porting DDMC to EIRENE. The original manuscript prioritized computational scaling metrics, but we acknowledge this leaves a gap. In the revised manuscript we have added a dedicated subsection under Numerical results that presents direct comparisons of neutral density tallies and selected reaction rates between the DDMC implementation, a serial reference run, and the two EIRENE-derived parallel algorithms. Relative differences are reported and shown to lie within combined statistical uncertainties for the test problems. We have also included a brief discussion of the random-number strategy (independent streams per domain with proper seeding) that precludes the introduction of systematic bias or artificial correlations. These additions directly address the concern while preserving the paper's focus on performance. revision: yes
-
Referee: [Test cases / Methods] Test-case and validation description: The manuscript does not specify how the high- and low-collisional test cases were chosen or whether they are representative of production fusion grids, nor does it report convergence tests or error-bar analyses that would confirm DDMC produces statistically equivalent results. Without these checks the performance claims cannot be fully assessed for practical adoption.
Authors: We have expanded the Test cases subsection to explain the rationale for the chosen regimes. The high-collisional case corresponds to parameters typical of dense scrape-off-layer regions in existing EIRENE tokamak simulations, while the low-collisional case reflects more tenuous divertor conditions; both are drawn from publicly documented EIRENE input decks in the fusion literature. In the revised manuscript we now report particle-number convergence studies for all three algorithms and include error-bar analyses demonstrating that DDMC tallies converge to the same mean values as the reference implementations within statistical fluctuations. These clarifications and additional results allow readers to assess representativeness and equivalence for production use. revision: yes
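The rebuttal mentions independent random-number streams per domain without giving details. The sketch below shows one generic way to derive reproducible, distinct per-domain generators using only the Python standard library. It is an assumption for illustration, not Eiron's scheme: the helper `domain_rng` and the hash-based seeding are hypothetical, and hash-derived seeds make stream overlap unlikely rather than provably impossible.

```python
import hashlib
import random


def domain_rng(base_seed, domain_id):
    """Return a generator whose seed is derived from (base_seed, domain_id).

    Hypothetical helper: hashing the pair spreads nearby domain ids over
    the seed space, so neighbouring domains do not get neighbouring seeds.
    """
    digest = hashlib.sha256(f"{base_seed}:{domain_id}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


if __name__ == "__main__":
    # One stream per domain; the same (seed, id) pair always reproduces
    # the same stream, regardless of how many domains there are.
    streams = [domain_rng(2026, d) for d in range(4)]
    print([round(s.random(), 6) for s in streams])
```

Production codes often rely on generators designed for parallel use, such as counter-based generators or library-provided stream splitting, instead of ad hoc seeding.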
Circularity Check
No circularity: all claims are direct empirical measurements of wall-clock performance
The paper implements three Monte Carlo algorithms (including DDMC) in Eiron and reports their relative runtimes via strong-scaling curves and weak-scaling efficiencies measured on Mahti. These are straightforward benchmark timings with no equations, fitted parameters, or first-principles derivations that could reduce to the reported inputs by construction. The performance claims do not rest on load-bearing self-citations, uniqueness theorems, ansatzes, or renamings of known results. The central assertion (DDMC outperforms the alternatives in nearly all tested cases) rests on independent, hardware-verifiable wall-clock data rather than any self-referential loop.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Monte Carlo sampling with a sufficient number of test particles yields statistically accurate estimates of neutral transport quantities.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, reality_from_one_distinction (tag: unclear)
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Passage: DDMC strong scaling is superlinear for grids that do not fit into an L3 cache slice (4 MiB)... weak scaling efficiency of 45% in a high-collisional case, and 26% in a low-collisional case.
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  unclear: Relation between the paper passage and the cited Recognition theorem.
  Passage: The DDMC algorithm uses a novel asynchronous termination control method... all communication in the algorithm asynchronous
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] D. Reiter, M. Baelmans, and P. Börner, “The EIRENE and B2-EIRENE codes,” Fusion Science and Technology, vol. 47, no. 2, pp. 172–186, 2005.
- [2] V. Quadri, P. Tamain, Y. Marandet, H. Bufferand, N. Rivals, G. Ciraolo, G. Falchetto, R. Düll, S. Sureshkumar, N. Varadarajan, H. Yang, H. Reimerdes, D. Oliveira, and D. Mancini, “Edge plasma turbulence simulations in detached regimes with the SOLEDGE3X code,” Nuclear Materials and Energy, vol. 41, p. 101756, 2024.
- [3] CSC, Technical details about Mahti, 2025.
- [4] S. Wurzel and S. Hsu, “Progress toward fusion energy breakeven and gain as measured against the Lawson criterion,” Physics of Plasmas, vol. 29, p. 062103, 2022.
- [5] M. Valentinuzzi, Y. Marandet, H. Bufferand, G. Ciraolo, and P. Tamain, “Two-phases hybrid model for neutrals,” Nuclear Materials and Energy, vol. 18, pp. 41–45, 2019.
- [6] W. Van Uytven, W. Dekeyser, M. Blommaert, N. Horsten, Y. Marandet, and M. Baelmans, “Advanced spatially hybrid fluid-kinetic modelling of plasma-edge neutrals and application to ITER case using SOLPS-ITER,” Contributions to Plasma Physics, vol. 62, no. 5-6, p. e202100191, 2022.
- [7] S. Wiesen, D. Reiter, V. Kotov, M. Baelmans, W. Dekeyser, A. Kukushkin, S. Lisgo, R. Pitts, V. Rozhansky, G. Saibene, I. Veselova, and S. Voskoboynikov, “The new SOLPS-ITER code package,” Journal of Nuclear Materials, vol. 463, pp. 480–484, 2015.
- [8] Y. Feng, H. Frerichs, M. Kobayashi, and D. Reiter, “Monte-Carlo fluid approaches to detached plasmas in non-axisymmetric divertor configurations,” Plasma Physics and Controlled Fusion, vol. 59, p. 034006, Feb. 2017.
- [9] H. Bufferand, G. Ciraolo, Y. Marandet, J. Bucalossi, P. Ghendrih, J. Gunn, N. Mellet, P. Tamain, R. Leybros, N. Fedorczak, F. Schwander, and E. Serre, “Numerical modelling for divertor design of the WEST device with a focus on plasma–wall interactions,” Nuclear Fusion, vol. 55, p. 053025, Apr. 2015.
- [10] D. Borodin, F. Schluck, S. Wiesen, D. Harting, P. Börner, S. Brezinsek, W. Dekeyser, S. Carli, M. Blommaert, W. Van Uytven, M. Baelmans, B. Mortier, G. Samaey, Y. Marandet, P. Genesio, H. Bufferand, E. Westerhof, J. Gonzalez, M. Groth, A. Holm, N. Horsten, and H. Leggate, “Fluid, kinetic and hybrid approaches for neutral and trace ion edge transport m... (2022)
- [11] D. Reiter et al., The EIRENE Code User Manual, 1.0.0 ed., 2023.
- [12] J. Spanier and E. M. Gelbard, Monte Carlo principles and neutron transport problems. Addison-Wesley series in computer science and information processing, Addison-Wesley, 1969.
- [13] D. Reiter, C. May, M. Baelmans, and P. Börner, “Non-linear effects on neutral gas transport in divertors,” Journal of Nuclear Materials, vol. 241-243, pp. 342–348, 1997.
- [14] B. Mortier, M. Baelmans, and G. Samaey, “A kinetic-diffusion asymptotic-preserving Monte Carlo algorithm for the Boltzmann-BGK model in the diffusive scaling,” SIAM Journal on Scientific Computing, vol. 44, no. 2, pp. A720–A744, 2022.
- [15] H. J. Alme, G. H. Rodrigue, and G. B. Zimmerman, “Parallel domain decomposition methods in fluid models with Monte Carlo transport,” in Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, PP 1997, Hyatt Regency Minneapolis on Nicollet Mall Hotel, Minneapolis, Minnesota, USA, March 14-17, 1997, SIAM, 1997.
- [16] H. J. Alme, G. H. Rodrigue, and G. B. Zimmerman, “Domain decomposition methods for parallel laser-tissue models with Monte Carlo transport,” in Monte-Carlo and Quasi-Monte Carlo Methods 1998 (H. Niederreiter and J. Spanier, eds.), (Berlin, Heidelberg), pp. 137–148, Springer Berlin Heidelberg, 2000.
- [17] H. J. Alme, G. H. Rodrigue, and G. B. Zimmerman, “Domain decomposition models for parallel Monte Carlo transport,” The Journal of Supercomputing, vol. 18, no. 1, pp. 5–23, 2001.
- [18] T. A. Brunner and P. S. Brantley, “An efficient, robust, domain-decomposition algorithm for particle Monte Carlo,” Journal of Computational Physics, vol. 228, no. 10, pp. 3882–3890, 2009.
- [19] N. Choi and H. G. Joo, “Domain decomposition for GPU-based continuous energy Monte Carlo power reactor calculation,” Nuclear Engineering and Technology, vol. 52, no. 11, pp. 2667–2677, 2020.
- [20] N. Horelik, A. Siegel, B. Forget, and K. Smith, “Monte Carlo domain decomposition for robust nuclear reactor analysis,” Parallel Computing, vol. 40, no. 10, pp. 646–660, 2014.
- [21] J. Liang, K. Wang, Y. Qiu, X. Chai, and S. Qiang, “Domain decomposition strategy for pin-wise full-core Monte Carlo depletion calculation with the Reactor Monte Carlo code,” Nuclear Engineering and Technology, vol. 48, no. 3, pp. 635–641, 2016.
- [22] M. García, J. Leppänen, and V. Sanchez-Espinoza, “A collision-based domain decomposition scheme for large-scale depletion with the Serpent 2 Monte Carlo code,” Annals of Nuclear Energy, vol. 152, p. 108026, 2021.
- [23]
- [24] G. M. Morton, “A computer oriented geodetic data base and a new technique in file sequencing,” tech. rep., International Business Machines Company New York, 1966.
- [25] M. Velten, R. Schöne, T. Ilsche, and D. Hackenberg, “Memory Performance of AMD EPYC Rome and Intel Cascade Lake SP Server Processors,” in Proceedings of the 2022 ACM/SPEC on International Conference on Performance Engineering, (Beijing China), pp. 165–175, ACM, Apr. 2022.
- [26] W. Saunders, J. Edgeley, S. Powell, J. Cook, M. Barton, C. MacMackin, and O. Parry, “NESO-Particles.” https://github.com/ExCALIBUR-NEPTUNE/NESO-Particles, 2025.
- [27] S. P. Hamilton, S. R. Slattery, and T. M. Evans, “Multigroup Monte Carlo on GPUs: Comparison of history- and event-based algorithms,” Annals of Nuclear Energy, vol. 113, pp. 506–518, 2018.