arxiv: 2602.13198 · v2 · submitted 2026-02-13 · 🌌 astro-ph.HE


GPUmonty: A GPU-accelerated relativistic Monte Carlo radiative transfer code

Pedro Naethe Motta, Rodrigo Nemmen, Abhishek V. Joshi


Pith reviewed 2026-05-15 22:00 UTC · model grok-4.3

classification 🌌 astro-ph.HE

keywords: GPU acceleration, Monte Carlo, radiative transfer, relativistic, black holes, CUDA, synchrotron, GRMHD


The pith

GPUmonty achieves a roughly 12-fold speedup in relativistic Monte Carlo radiative transfer by offloading superphoton calculations to the GPU.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GPUmonty, a CUDA implementation that accelerates the grmonty code for Monte Carlo radiative transfer in general relativistic settings. It processes large batches of superphotons in parallel on the GPU rather than sequentially on the CPU. This yields a speedup of approximately 12 times on a single GPU, with performance constrained by register pressure. Validation against analytic solutions and existing codes shows relative errors below one percent and the expected Monte Carlo convergence rate. The faster computation supports more extensive modeling of emission from supermassive black holes.
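
To make "register pressure" concrete, here is a rough occupancy estimate. The hardware limits used as defaults (65,536 32-bit registers per streaming multiprocessor, 64 resident warps, 32 threads per warp) are assumptions typical of recent NVIDIA data-center GPUs, not values taken from the paper, and the function name is hypothetical. The estimate ignores register-allocation granularity and other limits such as shared memory.

```python
def occupancy_from_registers(regs_per_thread, regs_per_sm=65536,
                             max_warps_per_sm=64, warp_size=32):
    """Rough upper bound on occupancy when registers are the only limit.

    All defaults are assumed, illustrative hardware limits; real GPUs also
    round register allocations up to a granularity that is ignored here.
    """
    regs_per_warp = regs_per_thread * warp_size
    resident_warps = min(max_warps_per_sm, regs_per_sm // regs_per_warp)
    return resident_warps / max_warps_per_sm

# Occupancy bound for a few per-thread register counts.
for regs in (32, 64, 128, 255):
    print(regs, occupancy_from_registers(regs))
```

Under these assumed limits, a kernel that needs 128 registers per thread could keep at most a quarter of the warp slots resident. That is the sense in which heavy per-thread state can cap throughput before compute or memory bandwidth saturates.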

Core claim

By porting photon generation, sampling, tracking, and scattering to the GPU using the SIMT model, GPUmonty delivers roughly 12 times faster execution than the CPU version while preserving statistical accuracy, as confirmed by tests on optically thin synchrotron spheres and GRMHD simulation data where errors stay under one percent.

What carries the argument

The concurrent processing of superphotons on GPU threads via CUDA, replacing the sequential loop of the original grmonty code.

Load-bearing premise

The parallel execution on the GPU maintains the same statistical properties and convergence as the original sequential Monte Carlo algorithm without introducing any biases or artifacts.
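
One way this premise can hold is if every superphoton draws from its own random stream, so the result does not depend on which thread handles it or in what order. The sketch below is illustrative only: `sample_superphoton`, the seed-per-index scheme, and the toy energy and weight distributions are assumptions, not GPUmonty's actual design.

```python
import random

def sample_superphoton(index, base_seed=1234):
    """Toy superphoton: returns (energy, weight) from a per-index stream.

    Seeding by index is an assumed scheme chosen so that each superphoton
    is reproducible regardless of processing order.
    """
    rng = random.Random(base_seed * 1_000_003 + index)
    energy = rng.expovariate(1.0)   # toy energy distribution
    weight = rng.uniform(0.5, 1.5)  # toy statistical weight
    return energy, weight

def spectrum(indices, n_bins=8, e_max=4.0):
    """Accumulate weights into energy bins for the given superphotons."""
    bins = [0.0] * n_bins
    for i in indices:
        energy, weight = sample_superphoton(i)
        b = min(int(energy / e_max * n_bins), n_bins - 1)
        bins[b] += weight
    return bins

n = 10_000
sequential = spectrum(range(n))

# Mimic an arbitrary thread schedule by visiting the same indices shuffled.
order = list(range(n))
random.Random(0).shuffle(order)
shuffled = spectrum(order)

max_diff = max(abs(a - b) for a, b in zip(sequential, shuffled))
print(max_diff)
```

In this toy setup the two spectra agree up to floating-point rounding, since only the order of summation changes. A real GPU port would additionally need care with concurrent accumulation into shared bins.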

What would settle it

Running the same input data through both codes and finding that the GPU version produces spectra differing by more than one percent, or fails to show the $N_{\rm s}^{-1/2}$ scaling with the number of superphotons $N_{\rm s}$.
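
The scaling half of this check can be sketched with a toy estimator. The code below is a stand-in built on stated assumptions, not the paper's convergence metric: it estimates a known mean by Monte Carlo, measures the RMS error over repeated trials at several sample counts, and fits the log-log slope, which should come out near -1/2.

```python
import math
import random

def rms_error(n_samples, n_trials, rng):
    """RMS error of the sample mean of U(0,1) against the true mean 0.5."""
    total = 0.0
    for _ in range(n_trials):
        mean = sum(rng.random() for _ in range(n_samples)) / n_samples
        total += (mean - 0.5) ** 2
    return math.sqrt(total / n_trials)

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

rng = random.Random(42)
counts = [100, 1_000, 10_000]
errors = [rms_error(n, 200, rng) for n in counts]
slope = loglog_slope(counts, errors)
print(slope)
```

A fitted slope that departs clearly from -1/2 would suggest correlated samples or a bias that does not average away with more superphotons.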

Figures

Figures reproduced from arXiv: 2602.13198 by Abhishek V. Joshi, Pedro Naethe Motta, Rodrigo Nemmen.

**Figure 1.** Comparison of the fiducial analytical and simulated spectra for $N_{\rm s} = 10^8$. There is excellent agreement between the numerical and analytical spectra, with the maximum difference between the two remaining below 1% in all energy bins.

**Figure 2.** Normalized integrated error (eq. 28) as the number of superphotons is varied over $N_{\rm s} = [10^4, 10^5, 10^6, 10^7, 10^8]$. The convergence scales in proportion to $1/\sqrt{N_{\rm s}}$, as expected for the statistical error of a Monte Carlo estimator approaching the true underlying distribution. This result matches igrmonty's. We highlight that, at $10^8$ superphotons, small deviations from the ideal $1/\sqrt{N_{\rm s}}$ …

**Figure 3.** Results from both igrmonty and GPUmonty for $N_{\rm s} = 10^8$, where each bump corresponds to a different number of scatterings. There is excellent agreement between the two methods. The tails of each bump exhibit increased noise, which becomes more pronounced with higher $n_{\rm sc}$. This elevated noise originates from the tail of the preceding bump, whose photons generally carry smaller weights, making it difficult to …

**Figure 4.** Convergence parameter defined in Equation 29. For this test, the integration is carried out up to $\nu = 10^{15}$ Hz, since the large errors observed in the higher-scattering bumps are also present in igrmonty and would otherwise distort the convergence rate. Eliminating these errors would require running igrmonty with a significantly larger number of superphotons.

**Figure 5.** Comparison of the spectra computed using the two radiative transfer codes for the selected GRMHD snapshot. The spectra agree closely over the full frequency range, with slight deviations only at the high-frequency tail, where the small number of scattered superphotons leads to Poisson noise. The error remains at the level of $\sim 10^{-2}$ across most frequencies and rises substantially only in the high-frequency tail…
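
The noise in sparsely populated bins can be quantified with the standard estimate for a weighted sum: for a bin holding weights $w_i$, the relative uncertainty is roughly $\sqrt{\sum_i w_i^2} / \sum_i w_i$, which reduces to $1/\sqrt{n}$ for $n$ equal weights. The helper below is a hypothetical illustration with made-up bin contents, not a routine or result from either code.

```python
import math

def relative_uncertainty(weights):
    """Approximate relative 1-sigma error of a bin that sums `weights`."""
    total = sum(weights)
    if total == 0:
        return float("inf")
    return math.sqrt(sum(w * w for w in weights)) / total

well_sampled = [1.0] * 10_000  # many equal-weight superphotons in a bin
sparse_tail = [1.0] * 25       # few superphotons in a high-frequency bin

print(relative_uncertainty(well_sampled))
print(relative_uncertainty(sparse_tail))
```

With these illustrative counts, the well-sampled bin sits at the one-percent level while the 25-photon bin is uncertain at the twenty-percent level, which is the qualitative behavior the caption attributes to the high-frequency tail.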

**Figure 6.** Panel (a): Performance comparison between GPUmonty and igrmonty (CPU-based) as a function of superphoton number $N_{\rm s}$, showing wall-clock time in seconds; GPUmonty in blue circles, igrmonty in red squares. Panel (b): Resulting speedup factor achieved by GPUmonty relative to igrmonty. The speedup peaks at a factor of $\sim 12$ as the workload increases.

Original abstract

We introduce $\texttt{GPUmonty}$, a CUDA/C-based Monte Carlo radiative transfer code accelerated using graphics processing units (GPUs). $\texttt{GPUmonty}$ derives from the CPU-based code $\texttt{grmonty}$ and offloads the most computationally expensive stages of the calculation -- superphoton generation, sampling, tracking, and scattering -- to the GPU. Whereas $\texttt{grmonty}$ handles photons sequentially, $\texttt{GPUmonty}$ processes large numbers of superphotons concurrently, leveraging the single-instruction, multiple-thread (SIMT) execution model of modern GPUs. Benchmarks demonstrate a speedup of about $12\times$ relative to the original CPU implementation on a single GPU, with runtime limited primarily by register pressure rather than compute or memory bandwidth saturation. We validate the implementation through analytic tests for a optically thin synchrotron sphere, as well as comparisons with $\texttt{igrmonty}$ for scattering synchrotron sphere and GRMHD simulation data. Relative errors remain below a percent level and convergence is consistent with the expected $N_{\rm s}^{-1/2}$ Monte Carlo scaling. By significantly reducing computational costs, GPUmonty enables the extensive parameter space surveys and faster spectra modeling required to interpret horizon-scale observations of supermassive black holes. $\texttt{GPUmonty}$ is publicly available under the GNU General Public License.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The manuscript introduces GPUmonty, a CUDA/C-based GPU-accelerated Monte Carlo radiative transfer code derived from the CPU code grmonty. It offloads superphoton generation, sampling, tracking, and scattering to the GPU for concurrent SIMT processing. Benchmarks report a 12× speedup on a single GPU limited by register pressure. Validation consists of analytic tests on an optically thin synchrotron sphere plus direct comparisons to igrmonty on scattering spheres and GRMHD data, with relative errors below 1% and explicit confirmation of N_s^{-1/2} Monte Carlo convergence.

Significance. If the reported speedup and accuracy hold, the work enables substantially faster spectra modeling and broader parameter-space surveys required for interpreting horizon-scale observations of supermassive black holes. Direct wall-clock benchmarks without fitted parameters, preservation of statistical convergence properties, and public release under the GNU GPL are clear strengths that enhance reproducibility and community utility.

minor comments (1)

[Abstract] The phrase 'a optically thin synchrotron sphere' is grammatically incorrect and should read 'an optically thin synchrotron sphere'.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their supportive review, clear summary of the work, and recommendation to accept the manuscript. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity


The paper presents an implementation and benchmark of a GPU port of the existing grmonty Monte Carlo code. All central claims (12x speedup, <1% relative error, N_s^{-1/2} convergence) are established by direct wall-clock timing on hardware and by explicit numerical comparisons to analytic solutions plus the original CPU code. No equations, fitted parameters, or self-citations are used to derive the reported performance or correctness; the results are empirical measurements outside any closed derivation loop.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper is an engineering implementation of existing Monte Carlo radiative transfer algorithms; it introduces no new physical free parameters, axioms beyond standard GPU programming assumptions, or invented entities.

axioms (1)

Domain assumption: The single-instruction, multiple-thread (SIMT) execution model of modern GPUs is suitable for concurrent superphoton tracking without altering Monte Carlo statistics.
Invoked when describing the parallel processing of large numbers of superphotons.



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Sensitivities of Black Hole Images from GRMHD Simulations
astro-ph.HE 2026-04 unverdicted novelty 6.0

Differentiable GRMHD image sensitivities create a structured error landscape that supports gradient-based parameter recovery for black hole imaging under idealized and noisy conditions.
