arXiv: 2606.17145 · v1 · pith:JT4366MYnew · submitted 2026-06-15 · 🌌 astro-ph.IM · astro-ph.CO · astro-ph.GA · cs.PF

OpenGadget3 GPU solver tests

Pith reviewed 2026-06-27 02:45 UTC · model grok-4.3

classification 🌌 astro-ph.IM · astro-ph.CO · astro-ph.GA · cs.PF
keywords GPU porting · cosmological simulations · N-body hydrodynamics · OpenGadget3 · accuracy validation · performance scaling · shock tube test · zoom-in simulations

The pith

The GPU port of OpenGadget3 matches CPU results across gravity-only, hydro, and full-physics cosmological tests.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests the accuracy of GPU versions of the short-range gravity integrator, hydrodynamic solver parts, and conjugate gradient conduction solver in OpenGadget3. It runs a sequence of simulations that add physical processes one at a time: a gravity-only cosmological box, a shock-tube hydro test, a non-radiative cluster zoom-in, and a full-physics galaxy zoom-in. In each case the GPU output is compared directly to the original CPU code. The two agree closely at all scales that matter for science, with only tiny differences appearing at the smallest resolved scales.

Core claim

Comparing the results obtained with the GPU implementation to those from the classical CPU version, we find excellent agreement across all tests, with small differences on very small scales.

What carries the argument

The GPU port of the short-range gravity integrator, hydrodynamic solver components, and conjugate gradient solver for thermal conduction.

If this is right

  • The GPU code can replace the CPU version for the tested physics without changing scientific conclusions.
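  • Individual ported modules run roughly 3-5 times faster chip-to-chip on a GPU than on a CPU.
  • Full cosmological setups, where many physical processes and overheads contribute to the workload, still achieve a total chip-to-chip speedup of roughly 2-3 with the same number of nodes and CPUs per node.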
  • The port is ready for use on the four tested supercomputer architectures.
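
One hedged way to relate the per-module and overall speedup ranges is Amdahl's law. This is an illustrative reading, not an analysis from the paper: the function names and the mid-range inputs (4x per module, 2.5x overall) are assumptions chosen for the sketch, and real runs also include communication and data-transfer overheads that this simple model ignores.

```python
def overall_speedup(accelerated_fraction, module_speedup):
    """Amdahl's law: total speedup when only part of the runtime is accelerated."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / module_speedup)


def implied_fraction(total_speedup, module_speedup):
    """Invert Amdahl's law for the fraction of the original runtime that was accelerated."""
    return (1.0 - 1.0 / total_speedup) / (1.0 - 1.0 / module_speedup)


# Hypothetical mid-range values picked from the quoted ranges (3-5x and 2-3x).
fraction = implied_fraction(total_speedup=2.5, module_speedup=4.0)
print(f"implied accelerated fraction: {fraction:.2f}")
print(f"round trip overall speedup:   {overall_speedup(fraction, 4.0):.2f}")
```

Under these assumed inputs, about four fifths of the original CPU runtime would have to sit in ported modules for the two ranges to be mutually consistent.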

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar GPU ports of other mesh or particle codes could be validated with the same staged test progression.
  • The small-scale differences are unlikely to affect statistics measured on halo or galaxy scales.
  • Future work could add radiation or magnetic modules and repeat the same comparison sequence.

Load-bearing premise

The chosen tests exercise every ported module under conditions that represent production cosmological runs.

What would settle it

A new test that includes all the ported modules and shows statistically significant differences between GPU and CPU results at resolved scales would falsify the agreement claim.

Figures

Figures reproduced from arXiv: 2606.17145 by A. Ragagnin, F. Groth, G. S. Karademir, K. Dolag, L. M. Böss, L. Tornatore, M. Aiello, N. Hariharan, T. Castro.

Figure 1: Flowchart summarising the GPU porting paradigm as implemented …
Figure 1: The CPU scheme for the drift of particles is adaptive: active particles are drifted at the beginning of the timestep, while the others are drifted when encountered during a tree walk. The latter type of drift requires a thread-locking operation (namely, #pragma omp critical OpenMP regions) to avoid a data race during the position drift. Since this approach would be problematic on GPU, …
Figure 2: Comparison between CPU and GPU runs of the DMO cosmological …
Figure 3: Comparison of shock tube solutions. Each row reports the following …
Figure 4: Non-radiative galaxy simulation density 2D map (top panel), and …
Figure 5: Non-radiative galaxy simulation. Left panel: gas matter density; middle panel: SPH density (namely, average SPH …
Figure 7: Time to solution versus scale factor for the Magneticum dark matter …
Figure 8: OpenGadget3 chip-to-chip scaling on MareNostrumV-ACC. We used $2 \times 256^3$, $2 \times 512^3$, and $1024^3$ particles, respectively. Both setups used 20 OpenMP threads per MPI rank. The bottom axis shows the number of cores used in both CPU and GPU runs, while the top axis shows the number of GPUs used in the GPU run. The simulations consist of 100 steps of gravity-only time integration from a very early and homogeneous…
Figure 9: Scaling relations for OpenGadget3 on SuperMUC-NG2 using only CPUs or adding GPU accelerators. The top panel shows the strong scaling relation of both implementations, while the lower panel shows the weak scaling results for different test cases. … GPUs, demonstrating consistent scaling behaviour, maintaining over 80% efficiency up to a 32× resource increase. The top panel of …
Figure 10: Nsight profiling of selected timesteps. Left panel: a large timebin; right panel: a smaller timebin. The upper part of the profiling shows the CUDA GPU …
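
The Figure 9 caption fragment quotes an efficiency of over 80% up to a 32× resource increase. As a reminder of how such efficiencies are commonly defined, here is a minimal sketch. The definitions are the standard textbook ones and may differ from what the paper uses; the function names and timings are hypothetical and are not taken from the paper.

```python
def strong_scaling_efficiency(t_base, t_scaled, resource_factor):
    """Fixed problem size: ideal runtime shrinks by resource_factor."""
    return t_base / (resource_factor * t_scaled)


def weak_scaling_efficiency(t_base, t_scaled):
    """Problem size grows with the resources: ideal runtime stays constant."""
    return t_base / t_scaled


# Hypothetical wall-clock times in seconds, invented for illustration only.
strong = strong_scaling_efficiency(t_base=320.0, t_scaled=12.5, resource_factor=32)
weak = weak_scaling_efficiency(t_base=100.0, t_scaled=125.0)
print(f"strong scaling efficiency at 32x resources: {strong:.0%}")
print(f"weak scaling efficiency: {weak:.0%}")
```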
Original abstract

We present an in-depth evaluation of the scalability and accuracy of the GPU porting of the N-body code for hydrodynamic cosmological simulations OpenGadget3. While technical details of our GPU porting were presented in Ragagnin et al. (2020), in this work we focus on assessing the accuracy of the ported modules: the short range gravity integrator, the different components of the hydrodynamic solver, and the conjugate gradient solver for thermal conduction. We ran several tests that gradually increase the number of physical modules included: a gravity-only cosmological simulation; a hydrodynamical shock tube test; a non-radiative zoom-in simulation of a galaxy cluster in a cosmological box; and a full-physics zoom-in simulation of a galaxy in a cosmological box. Comparing the results obtained with the GPU implementation to those from the classical CPU version, we find excellent agreement across all tests, with small differences on very small scales. For the individual physical modules, we find a GPU chip-to-chip speedup ranging from $\approx 3$-$5$. For more complex cosmological and hydrodynamical setups, where a large number of physical processes and overheads contribute to the total workload, the observed total chip-to-chip speedup (with the same number of nodes and CPUs per node) is $\approx 2$-$3$. We ran our tests on four different supercomputers: Leonardo Booster (CINECA), MareNostrum-V (BSC), SuperMUC-NG2 (LRZ), and the CIP cluster of the Faculty of Physics at the Ludwig-Maximilians-Universität (LMU).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper evaluates the GPU port of OpenGadget3 for accuracy in the short-range gravity integrator, hydrodynamic solver components, and conjugate gradient conduction solver. It performs side-by-side CPU/GPU comparisons on four tests of increasing complexity (gravity-only cosmological run, shock tube, non-radiative cluster zoom-in, full-physics galaxy zoom-in) and reports excellent agreement with only small differences on very small scales, plus module speedups of ~3-5x and overall ~2-3x on four supercomputers.

Significance. If the quantitative agreement holds, the work supplies needed validation for GPU-accelerated cosmological hydro codes, directly supporting production use on current supercomputers. The multi-machine testing and progressive inclusion of modules are positive features.

major comments (2)
  1. [Abstract] The central claim of 'excellent agreement' rests on a purely qualitative statement; no quantitative error metrics (relative L2 norms, power-spectrum deviations, or convergence tests) are supplied to demonstrate that the noted 'small differences on very small scales' lie inside expected round-off or truncation error.
  2. [Abstract] The test suite is described as 'gradually increas[ing] the number of physical modules,' yet no explicit confirmation is given that the conjugate-gradient conduction solver is exercised either in isolation or in combination with full physics over timescales representative of production runs.
minor comments (1)
  1. The abstract lists four supercomputers but supplies no node counts, CPU/GPU configurations, or wall-time tables that would allow readers to reproduce the reported speedups.
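
The quantitative metrics named in the first major comment can be sketched as follows. This is a minimal illustration and not the paper's analysis: the function names are hypothetical, the input numbers are made up to stand in for a binned CPU profile and its GPU counterpart, and a real comparison would read both from simulation snapshots.

```python
import math


def relative_l2(reference, candidate):
    """Relative L2 difference ||candidate - reference|| / ||reference||."""
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    num = math.sqrt(sum((c - r) ** 2 for r, c in zip(reference, candidate)))
    den = math.sqrt(sum(r ** 2 for r in reference))
    if den == 0.0:
        raise ValueError("reference norm is zero")
    return num / den


def max_fractional_deviation(reference, candidate):
    """Largest |candidate / reference - 1| over bins, e.g. for a binned power spectrum."""
    return max(abs(c / r - 1.0) for r, c in zip(reference, candidate))


# Made-up values: the "GPU" run differs from the "CPU" run only in the last bin.
cpu = [1.0, 2.0, 4.0, 8.0]
gpu = [1.0, 2.0, 4.0, 8.0008]
print(f"relative L2 difference:   {relative_l2(cpu, gpu):.2e}")
print(f"max fractional deviation: {max_fractional_deviation(cpu, gpu):.2e}")
```

Whether a given value counts as agreement depends on the expected round-off and truncation error for the quantity in question, which is exactly what the comment asks the authors to state.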

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive assessment of the work's significance and for the constructive comments. We address each major comment below.

  1. Referee: [Abstract] The central claim of 'excellent agreement' rests on a purely qualitative statement; no quantitative error metrics (relative L2 norms, power-spectrum deviations, or convergence tests) are supplied to demonstrate that the noted 'small differences on very small scales' lie inside expected round-off or truncation error.

    Authors: We agree that the abstract's claim would be strengthened by quantitative metrics. The full manuscript presents detailed visual and profile comparisons (density fields, velocity profiles, power spectra) across the test suite, but does not include explicit L2 norms or similar error measures. We will revise the abstract and add a short quantitative summary in the results sections, reporting e.g. relative L2 differences at the $10^{-4}$ level or below for integrated quantities, consistent with expected floating-point and truncation errors.

    Revision: yes

  2. Referee: [Abstract] The test suite is described as 'gradually increas[ing] the number of physical modules,' yet no explicit confirmation is given that the conjugate-gradient conduction solver is exercised either in isolation or in combination with full physics over timescales representative of production runs.

    Authors: The full-physics galaxy zoom-in run incorporates thermal conduction (via the conjugate-gradient solver) as one of the active modules, and the simulation duration is representative of production galaxy-formation timescales. We did not include an isolated conduction test in this manuscript, as the emphasis was on end-to-end accuracy and performance; module-level validation appeared in the earlier porting paper. To address the comment we will add an explicit statement clarifying the inclusion of the conduction solver in the full-physics test.

    Revision: partial

Circularity Check

0 steps flagged

No circularity: direct CPU-GPU numerical comparison

The paper's core claim rests on running identical test suites (gravity-only cosmology, shock tube, non-radiative cluster zoom, full-physics galaxy zoom) on both the GPU-ported and classical CPU versions of OpenGadget3 and reporting side-by-side agreement. No derivation chain, fitted parameters renamed as predictions, or self-referential definitions appear; the single self-citation (Ragagnin et al. 2020) supplies only porting implementation details while the accuracy assessment is performed afresh in the present work. The result is therefore externally falsifiable by re-running the same tests and is not forced by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on the domain assumption that the CPU implementation is the reference truth and that the selected test problems are sufficient to expose any porting errors. No free parameters or invented entities are introduced.

axioms (1)
  • Domain assumption: the original CPU version of OpenGadget3 is treated as the ground-truth reference for accuracy.
    All comparisons are performed against CPU output.

pith-pipeline@v0.9.1-grok · 5860 in / 1048 out tokens · 48940 ms · 2026-06-27T02:45:19.925716+00:00 · methodology

Reference graph

Works this paper leans on

79 extracted references · 65 canonical work pages · 3 internal anchors

  1. [3]

    GAMER-2: a GPU-accelerated adaptive mesh refinement code -- accuracy, performance, and scalability

    Hsi-Yu Schive, John A ZuHone, Nathan J Goldbaum, Matthew J Turk, Massimo Gaspari, and Chin-Yu Cheng. GAMER-2: a GPU-accelerated adaptive mesh refinement code -- accuracy, performance, and scalability. 2018.

  2. [4]

    PKDGRAV3: Beyond Trillion Particle Cosmological Simulations for the Next Era of Galaxy Surveys

    PKDGRAV3: beyond trillion particle cosmological simulations for the next era of galaxy surveys. Computational Astrophysics and Cosmology. doi:10.1186/s40668-017-0021-1. arXiv:1609.08621

  3. [5]

    DISCO-DJ: differentiable particle-mesh simulation software

    Florian List, Oliver Hahn, Thomas Flöss, and Lukas Winkler. DISCO-DJ: differentiable particle-mesh simulation software. 2025

  4. [7]

    The First Star-by-star N-body/Hydrodynamics Simulation of Our Galaxy Coupling with a Surrogate Model. arXiv e-prints. doi:10.48550/arXiv.2510.23330. arXiv:2510.23330

  5. [26]

    Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in computational Mathematics, 1995.

  6. [30]

    Thomas Berlok. Monthly Notices of the Royal Astronomical Society, 2022. doi:10.1093/mnras/stac1882

  7. [32]

    A Survey of Several Finite Difference Methods for Systems of Nonlinear Hyperbolic Conservation Laws

    Review. A Survey of Several Finite Difference Methods for Systems of Nonlinear Hyperbolic Conservation Laws. Journal of Computational Physics. doi:10.1016/0021-9991(78)90023-2

  8. [33]

    OpenACC --- First Experiences with Real-World Applications

    Sandra Wienke, Paul Springer, Christian Terboven, and Dieter an Mey. OpenACC --- First Experiences with Real-World Applications. Euro-Par 2012 Parallel Processing. 2012

  9. [43]

    Parallel programming with OpenACC. 2016.

  10. [44]

    nIFTy galaxy cluster simulations I: dark matter & non-radiative models

    nIFTy galaxy cluster simulations - I. Dark matter and non-radiative models. doi:10.1093/mnras/stw250. arXiv:1503.06065

  11. [47]

    Cosmology dependence of galaxy cluster scaling relations. doi:10.1093/mnras/staa1004. arXiv:1911.05751

  12. [49]

    Cosmology dependence of halo masses and concentrations in hydrodynamic simulations. doi:10.1093/mnras/staa3523. arXiv:2011.05345

  13. [50]

    Pylians: Python libraries for the analysis of numerical simulations

  14. [51]

    Seven-Year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Cosmological Interpretation

    Seven-year Wilkinson Microwave Anisotropy Probe (WMAP) Observations: Cosmological Interpretation. doi:10.1088/0067-0049/192/2/18. arXiv:1001.4538

  15. [60]

    Supermassive black hole spin evolution in cosmological simulations with OPENGADGET3. doi:10.1051/0004-6361/202348925. arXiv:2312.07657

  16. [63]

    Matteo Frigo and Steven G. Johnson. Proceedings of the IEEE, 93(2), 2005.

  17. [65]

    Simulating cosmic structure formation with the GADGET-4 code. doi:10.1093/mnras/stab1855. arXiv:2010.03567

  18. [67]

    Euclid: modelling massive neutrinos in cosmology - a code comparison. JCAP. doi:10.1088/1475-7516/2023/06/035. arXiv:2211.12457

  19. [69]

    The AREPO Public Code Release. ApJS, 248, 32, 2020. doi:10.3847/1538-4365/ab908c. arXiv:1909.04667

  20. [75]

    Josh Barnes and Piet Hut. A hierarchical O(N log N) force-calculation algorithm. Nature, 324(6096): 446--449, December 1986. doi:10.1038/324446a0

  21. [76]

    A. M. Beck, G. Murante, A. Arth, R.-S. Remus, A. F. Teklu, J. M. F. Donnert, S. Planelles, M. C. Beck, P. Förster, M. Imgrund, K. Dolag, and S. Borgani. An improved SPH scheme for cosmological simulations. 455(2): 2110--2130, January 2016. doi:10.1093/mnras/stv2443

  22. [77]

    A sparse octree gravitational N-body code that runs entirely on the GPU processor

    Jeroen Bédorf, Evghenii Gaburov, and Simon Portegies Zwart. A sparse octree gravitational N-body code that runs entirely on the GPU processor. Journal of Computational Physics, 231(7): 2825--2839, April 2012. doi:10.1016/j.jcp.2011.12.024

  23. [78]

    A. Bonafede, K. Dolag, F. Stasyszyn, G. Murante, and S. Borgani. A non-ideal magnetohydrodynamic GADGET: simulating massive galaxy clusters. 418(4): 2234--2250, December 2011. doi:10.1111/j.1365-2966.2011.19523.x

  24. [79]

    Ludwig M. Böss, Klaus Dolag, Ulrich P. Steinwandel, Elena Hernández-Martínez, Ildar Khabibullin, Benjamin Seidel, and Jenny G. Sorce. Simulating the LOcal Web (SLOW): III. Synchrotron emission from the local cosmic web. 692: A232, December 2024. doi:10.1051/0004-6361/202348339

  25. [80]

    Aurélien Cavelan, Rubén M. Cabezón, Michal Grabarczyk, and Florina M. Ciorba. A smoothed particle hydrodynamics mini-app for exascale. In Proceedings of the Platform for Advanced Scientific Computing Conference, PASC '20, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450379939. doi:10.1145/3394277.3401855. URL https://...

  26. [81]

    Variability in cosmological hydrodynamical simulations: How stochastic processes, numerical effects, and reproducibility limits impact predictability

    Chaitra, Antonio Ragagnin, Milena Valentini, Giuseppe Murante, Stefano Borgani, and Giuliano Taffoni. Variability in cosmological hydrodynamical simulations: How stochastic processes, numerical effects, and reproducibility limits impact predictability. Astronomy and Computing, page 101127, 2026. ISSN 2213-1337. doi:https://doi.org/10.1016/j.ascom.2026.101...

  27. [82]

    Dynamical friction and the evolution of black holes in cosmological simulations: A new implementation in OpenGadget3

    Alice Damiano, Milena Valentini, Stefano Borgani, Luca Tornatore, Giuseppe Murante, Antonio Ragagnin, Cinthia Ragone-Figueroa, and Klaus Dolag. Dynamical friction and the evolution of black holes in cosmological simulations: A new implementation in OpenGadget3. 692: A81, December 2024. doi:10.1051/0004-6361/202450021

  28. [83]

    Numerical solutions for black hole feeding and feedback in cosmological simulations with opengadget3

    Alice Damiano, Stefano Borgani, Giuseppe Murante, Milena Valentini, Luca Tornatore, and Giuliano Taffoni. Numerical solutions for black hole feeding and feedback in cosmological simulations with opengadget3. Astronomy and Computing, 56: 101117, 2026. ISSN 2213-1337. doi:https://doi.org/10.1016/j.ascom.2026.101117. URL https://www.sciencedirect.com/scien...

  29. [84]

    M. Davis, G. Efstathiou, C. S. Frenk, and S. D. M. White. The evolution of large-scale structure in a universe dominated by cold dark matter. 292: 371--394, May 1985. doi:10.1086/163168

  30. [85]

    Walter Dehnen and Hossam Aly. Improving convergence in smoothed particle hydrodynamics simulations without pairing instability. Monthly Notices of the Royal Astronomical Society, 425(2): 1068--1082, 09 2012. ISSN 0035-8711. doi:10.1111/j.1365-2966.2012.21439.x. URL https://doi.org/10.1111/j.1365-2966.2012.21439.x

  31. [86]

    K. Dolag and F. Stasyszyn. An MHD GADGET for cosmological simulations. 398(4): 1678--1697, October 2009. doi:10.1111/j.1365-2966.2009.15181.x

  32. [87]

    K. Dolag, M. Jubelgas, V. Springel, S. Borgani, and E. Rasia. Thermal Conduction in Simulated Galaxy Clusters. 606(2): L97--L100, May 2004. doi:10.1086/420966

  33. [88]

    K. Dolag, S. Borgani, G. Murante, and V. Springel. Substructures in hydrodynamical cluster simulations. 399(2): 497--514, October 2009. doi:10.1111/j.1365-2966.2009.15034.x

  34. [89]

    Klaus Dolag, Ludwig M. Böss, Bärbel S. Koribalski, Ulrich P. Steinwandel, and Milena Valentini. Insights on the Origin of Odd Radio Circles from Cosmological Simulations. 945(1): 74, March 2023. doi:10.3847/1538-4357/acb5f5

  35. [90]

    K. Dolag, M. Bartelmann, and H. Lesch. Evolution and structure of magnetic fields in simulated galaxy clusters. A&A, 387(2): 383--395, 2002. doi:10.1051/0004-6361:20020241. URL https://doi.org/10.1051/0004-6361:20020241

  36. [91]

    J. Donnert, K. Dolag, G. Brunetti, and R. Cassano. Rise and fall of radio haloes in simulated merging galaxy clusters. 429(4): 3564--3569, March 2013. doi:10.1093/mnras/sts628

  37. [92]

    D. Fabjan, S. Borgani, L. Tornatore, A. Saro, G. Murante, and K. Dolag. Simulating the effect of active galactic nuclei feedback on the metal enrichment of galaxy clusters. 401(3): 1670--1690, January 2010. doi:10.1111/j.1365-2966.2009.15794.x

  38. [93]

    The Formation of Milky Way-mass Disk Galaxies in the First 500 Million Years of a Cold Dark Matter Universe

    Yu Feng, Tiziana Di Matteo, Rupert Croft, Ananth Tenneti, Simeon Bird, Nicholas Battaglia, and Stephen Wilkins. The Formation of Milky Way-mass Disk Galaxies in the First 500 Million Years of a Cold Dark Matter Universe. 808(1): L17, July 2015. doi:10.1088/2041-8205/808/1/L17

  39. [94]

    Mp-gadget/mp-gadget: A tag for getting a doi, October 2018

    Yu Feng, Simeon Bird, Lauren Anderson, Andreu Font-Ribera, and Chris Pedersen. Mp-gadget/mp-gadget: A tag for getting a doi, October 2018. URL https://doi.org/10.5281/zenodo.1451799

  40. [95]

    Cosmological and idealized simulations of dark matter haloes with velocity-dependent, rare and frequent self-interactions, 04 2024

    Moritz S Fischer, Lenard Kasselmann, Marcus Brüggen, Klaus Dolag, Felix Kahlhoefer, Antonio Ragagnin, Andrew Robertson, and Kai Schmidt-Hoberg. Cosmological and idealized simulations of dark matter haloes with velocity-dependent, rare and frequent self-interactions, 04 2024. ISSN 0035-8711. URL https://doi.org/10.1093/mnras/stae699

  41. [96]

    Matteo Frigo and Steven G. Johnson. The design and implementation of FFTW3. Proceedings of the IEEE, 93(2): 216--231, 2005. Special issue on "Program Generation, Optimization, and Platform Adaptation"

  42. [97]

    Nicholas Frontiere, J. D. Emberson, Michael Buehlmann, Joseph Adamo, Salman Habib, Katrin Heitmann, and Claude-André Faucher-Giguère. Simulating Hydrodynamics in Cosmology with CRK-HACC. 264(2): 34, February 2023. doi:10.3847/1538-4365/aca58d

  43. [98]

    R. A. Gingold and J. J. Monaghan. Smoothed particle hydrodynamics: theory and application to non-spherical stars. 181: 375--389, November 1977. doi:10.1093/mnras/181.3.375

  44. [99]

    Frederick Groth, Ulrich P. Steinwandel, Milena Valentini, and Klaus Dolag. The cosmological simulation code OPENGADGET3 - implementation of meshless finite mass. Monthly Notices of the Royal Astronomical Society, 526(1): 616--644, November 2023. doi:10.1093/mnras/stad2717

  45. [100]

    Nicolay Hammer, Ferdinand Jamitzky, Helmut Satzger, Momme Allalen, Alexander Block, Anupam Karmakar, Matthias Brehm, Reinhold Bader, Luigi Iapichino, Antonio Ragagnin, Vasilios Karakasis, Dieter Kranzlmueller, Arndt Bode, Herbert Huber, Martin Kuehn, Rui Machado, Daniel Gruenewald, Philipp V. F. Edelmann, Friedrich K. Roepke, Markus Wittmann, Thomas Zeise...

  46. [101]

    Cosmological simulations of black hole growth: AGN luminosities and downsizing

    Michaela Hirschmann, Klaus Dolag, Alexandro Saro, Lisa Bachmann, Stefano Borgani, and Andreas Burkert. Cosmological simulations of black hole growth: AGN luminosities and downsizing. 442(3): 2304--2324, August 2014. doi:10.1093/mnras/stu1023

  47. [102]

    Martin Jubelgas, Volker Springel, and Klaus Dolag. Thermal conduction in cosmological SPH simulations. Monthly Notices of the Royal Astronomical Society, 351(2): 423--435, June 2004. doi:10.1111/j.1365-2966.2004.07801.x

  48. [103]

    Space-timers — a stack-based hierarchical timing system for c++

    Geray Karademir and Klaus Dolag. Space-timers — a stack-based hierarchical timing system for c++. March 2026. doi:10.2139/ssrn.6321185. URL http://dx.doi.org/10.2139/ssrn.6321185

  49. [104]

    Cornerstone: Octree construction algorithms for scalable particle simulations

    Sebastian Keller, Aurélien Cavelan, Rubén Cabezon, Lucio Mayer, and Florina Ciorba. Cornerstone: Octree construction algorithms for scalable particle simulations. In Proceedings of the Platform for Advanced Scientific Computing Conference, PASC '23, New York, NY, USA, 2023. Association for Computing Machinery. ISBN 9798400701900. doi:10.1145/35929...

  50. [105]

    Disco-dj: differentiable particle-mesh simulation software, October 2025

    Florian List, Oliver Hahn, Thomas Flöss, and Lukas Winkler. Disco-dj: differentiable particle-mesh simulation software, October 2025

  51. [106]

    Tirso Marin-Gilabert, Milena Valentini, Ulrich P. Steinwandel, and Klaus Dolag. The role of physical and numerical viscosity in hydrodynamical instabilities. 517(4): 5971--5991, December 2022. doi:10.1093/mnras/stac3042

  52. [107]

    V. Marra, T. Castro, D. Camarena, S. Borgani, and A. Ragagnin. The BEHOMO project: Lemaître-Tolman-Bondi N-body simulations. 664: A179, August 2022. doi:10.1051/0004-6361/202243539

  53. [108]

    Smoothed Particle Hydrodynamics in pkdgrav3 for Shock Physics Simulations

    Thomas Meier, Douglas Potter, Christian Reinhardt, and Joachim Stadel. Smoothed Particle Hydrodynamics in pkdgrav3 for Shock Physics Simulations. I. Hydrodynamics. 1000(2): 266, April 2026. doi:10.3847/1538-4357/ae4e29

  54. [109]

    A. Mignone, C. Zanni, P. Tzeferacos, B. van Straalen, P. Colella, and G. Bodo. The PLUTO Code for Adaptive Mesh Computations in Astrophysical Fluid Dynamics. 198(1): 7, January 2012. doi:10.1088/0067-0049/198/1/7

  55. [110]

    OpenMP application program interface version 3.0, May 2008

    OpenMP Architecture Review Board. OpenMP application program interface version 3.0, May 2008. URL http://www.openmp.org/mp-documents/spec30.pdf

  56. [111]

    R. Pakmor, P. Edelmann, F. K. Röpke, and W. Hillebrandt. Stellar GADGET: a smoothed particle hydrodynamics code for stellar astrophysics and its application to Type Ia supernovae from white dwarf mergers. 424(3): 2222--2231, August 2012. doi:10.1111/j.1365-2966.2012.21383.x

  57. [112]

    Margarita Petkova and Volker Springel. An implementation of radiative transfer in the cosmological simulation code GADGET. 396(3): 1383--1403, July 2009. doi:10.1111/j.1365-2966.2009.14843.x

  58. [113]

    Daniel J. Price. Resolving high Reynolds numbers in smoothed particle hydrodynamics simulations of subsonic turbulence. 420(1): L33--L37, February 2012. doi:10.1111/j.1745-3933.2011.01187.x

  59. [114]

    Antonio Ragagnin, Nikola Tchipev, Michael Bader, Klaus Dolag, and Nicolay J. Hammer. Exploiting the Space Filling Curve Ordering of Particles in the Neighbour Search of Gadget3. In Advances in Parallel Computing, pages 411--420, May 2016. doi:10.3233/978-1-61499-621-7-411

  60. [115]

    Gadget3 on gpus with openacc, 2020

    Antonio Ragagnin, Klaus Dolag, Mathias Wagner, Claudio Gheller, Conradin Roffler, David Goz, David Hubber, and Alexander Arth. Gadget3 on gpus with openacc, 2020. URL https://ebooks.iospress.nl/doi/10.3233/APC200043

  61. [116]

    Dongsu Ryu and T. W. Jones. Numerical Magnetohydrodynamics in Astrophysics: Algorithm and Tests for One-dimensional Flow. 442: 228, March 1995. doi:10.1086/175437

  62. [117]

    GAMER-2: a GPU-accelerated adaptive mesh refinement code -- accuracy, performance, and scalability

    Hsi-Yu Schive, John A ZuHone, Nathan J Goldbaum, Matthew J Turk, Massimo Gaspari, and Chin-Yu Cheng. GAMER-2: a GPU-accelerated adaptive mesh refinement code -- accuracy, performance, and scalability. Monthly Notices of the Royal Astronomical Society, 481(4): 4815--4840, 2018

  63. [119]

    Nitin Shukla, Alessandro Romeo, Caterina Caravita, Lubomir Riha, Ondrej Vysocky, Petr Strakos, Milan Jaros, João Barbosa, Radim Vavrik, Andrea Mignone, Marco Rossazza, Stefano Truzzi, Vittoria Berta, Iacopo Colonnelli, Doriana Medić, Elisabetta Boella, Daniele Gregori, Eva Sciacca, Luca Tornatore, Giuliano Taffoni, Pranab J. Deka, Fabio Bacchini, R...

  64. [120]

    Volker Springel. The cosmological simulation code GADGET-2. Monthly Notices of the Royal Astronomical Society, 364(4): 1105--1134, December 2005. doi:10.1111/j.1365-2966.2005.09655.x

  65. [121]

    Volker Springel. E pur si muove: Galilean-invariant cosmological hydrodynamical simulations on a moving mesh. 401(2): 791--851, January 2010. doi:10.1111/j.1365-2966.2009.15715.x

  66. [122]

    Volker Springel and Lars Hernquist. Cosmological smoothed particle hydrodynamics simulations: a hybrid multiphase model for star formation. 339(2): 289--311, February 2003. doi:10.1046/j.1365-8711.2003.06206.x

  67. [123]

    Volker Springel, Naoki Yoshida, and Simon D. M. White. GADGET: a code for collisionless and gasdynamical cosmological simulations. 6(2): 79--117, April 2001. doi:10.1016/S1384-1076(01)00042-2

  68. [124]

    Black Holes in Galaxy Mergers: The Formation of Red Elliptical Galaxies

    Volker Springel, Tiziana Di Matteo, and Lars Hernquist. Black Holes in Galaxy Mergers: The Formation of Red Elliptical Galaxies. 620(2): L79--L82, February 2005. doi:10.1086/428772

  69. [125]

    Lisa K. Steinborn, Klaus Dolag, Michaela Hirschmann, M. Almudena Prieto, and Rhea-Silvia Remus. A refined sub-grid model for black hole accretion and AGN feedback in large cosmological simulations. 448(2): 1504--1525, April 2015. doi:10.1093/mnras/stv072

  70. [126]

    James M. Stone, Patrick D. Mullen, Drummond Fielding, Philipp Grete, Minghao Guo, Philipp Kempski, Elias R. Most, Christopher J. White, and George N. Wong. Athenak: A performance-portable version of the athena++ adaptive mesh refinement framework, feb 2026. URL https://doi.org/10.3847/1538-4365/ae3717

  71. [127]

    Hierarchical Data Format, version 5

    The HDF Group. Hierarchical Data Format, version 5. URL https://github.com/HDFGroup/hdf5

  72. [128]

    L. Tornatore, S. Borgani, F. Matteucci, S. Recchi, and P. Tozzi. Simulating the metal enrichment of the intracluster medium. Monthly Notices of the Royal Astronomical Society, 349(1): L19--L24, March 2004. doi:10.1111/j.1365-2966.2004.07689.x

  73. [129]

    L. Tornatore, S. Borgani, K. Dolag, and F. Matteucci. Chemical enrichment of galaxy clusters from hydrodynamical simulations. 382(3): 1050--1072, December 2007. doi:10.1111/j.1365-2966.2007.12070.x

  74. [130]

    L. Tornatore, S. Borgani, M. Viel, and V. Springel. The impact of feedback on the low-redshift intergalactic medium. 402(3): 1911--1926, March 2010. doi:10.1111/j.1365-2966.2009.16025.x

  75. [131]

    Terrence S. Tricco, Daniel J. Price, and Matthew R. Bate. Constrained hyperbolic divergence cleaning in smoothed particle magnetohydrodynamics with variable cleaning speeds. Journal of Computational Physics, 322: 326--344, 2016. ISSN 0021-9991. doi:https://doi.org/10.1016/j.jcp.2016.06.053. URL https://www.sciencedirect.com/science/article/pii/S0021999116302789

  76. [132]

    Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree

    Holger Wendland. Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in computational Mathematics, 4(1): 389--396, 1995

  77. [133]

    Openacc --- first experiences with real-world applications

    Sandra Wienke, Paul Springer, Christian Terboven, and Dieter an Mey. Openacc --- first experiences with real-world applications. In Christos Kaklamanis, Theodore Papatheodorou, and Paul G. Spirakis, editors, Euro-Par 2012 Parallel Processing, pages 859--870, Berlin, Heidelberg, 2012. Springer Berlin Heidelberg. ISBN 978-3-642-32820-6

  78. [134]

    Robert P. C. Wiersma, Joop Schaye, Tom Theuns, Claudio Dalla Vecchia, and Luca Tornatore. Chemical enrichment in cosmological, smoothed particle hydrodynamics simulations. 399(2): 574--600, October 2009. doi:10.1111/j.1365-2966.2009.15331.x

  79. [135]

    Adapting AREPO-RT for exascale computing: GPU acceleration and efficient communication

    Oliver Zier, Rahul Kannan, Aaron Smith, Mark Vogelsberger, and Erkin Verbeek. Adapting AREPO-RT for exascale computing: GPU acceleration and efficient communication. 533(1): 268--286, September 2024. doi:10.1093/mnras/stae1837