Neural Posterior Estimation for UHECR source inference from 3D propagation simulations
Pith reviewed 2026-05-09 18:23 UTC · model grok-4.3
The pith
A neural model trained on 3D propagation simulations infers source energy, distance, direction, and composition for individual ultra-high energy cosmic ray events with calibrated posteriors and no systematic bias.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors train a model combining a Deep Set encoder and a normalizing flow on roughly 5 million events simulated with CRPropa 3 across many extragalactic magnetic field setups. For each individual ultra-high energy cosmic ray event, the model produces calibrated posterior distributions for the source's energy, distance, direction, and primary composition. On held-out simulations, the parameters are recovered without systematic bias, with direction best constrained and distance least certain, while composition classification accuracy is at least 98.2 percent for all mass groups.
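For orientation, neural posterior estimation is usually trained by maximizing the conditional log-density that the flow assigns to the true simulation parameters. The form below is the generic textbook objective, not an equation taken from the paper. The symbols are notation assumed here: $\theta$ for the continuous source parameters, $x$ for the set of detected particles, $h_\psi$ for the Deep Set encoder, $f_\phi$ for the flow transform, and $\pi$ for its base density. How the paper couples the discrete composition label to this objective is not stated in this summary.

```latex
\mathcal{L}(\phi,\psi)
  = -\,\mathbb{E}_{\theta \sim p(\theta),\; x \sim p(x \mid \theta)}
    \left[ \log q_\phi\!\left(\theta \mid h_\psi(x)\right) \right],
\qquad
\log q_\phi(\theta \mid s)
  = \log \pi\!\left(f_\phi(\theta; s)\right)
  + \log \left| \det \frac{\partial f_\phi(\theta; s)}{\partial \theta} \right| .
```

The second expression is the change-of-variables rule that makes the flow density tractable, with $s = h_\psi(x)$ the fixed-size event summary.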
What carries the argument
A Deep Set encoder that processes the variable number of detected secondary particles, paired with a normalizing flow for density estimation, with both trained end-to-end on three-dimensional CRPropa 3 propagation simulations.
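A minimal sketch of the Deep Set idea, assuming mean pooling and tiny random-weight layers in place of the trained network (this summary does not give the paper's layer count, pooling operation, or embedding size). The names `make_layer`, `dense_tanh`, and `deep_set_encode` are hypothetical, as is the choice of three features per particle. The sketch shows the two properties the argument relies on: events with different particle counts map to a summary of the same size, and the summary does not depend on particle order.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Random dense-layer weights; a stand-in for trained parameters."""
    scale = 1.0 / math.sqrt(n_in)
    return [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]


def dense_tanh(weights, x):
    """Apply one dense layer (no bias) followed by tanh."""
    return [math.tanh(math.fsum(w * xi for w, xi in zip(row, x))) for row in weights]


def deep_set_encode(particles, phi_weights, rho_weights):
    """Encode a non-empty, variable-length set of particle feature vectors.

    phi embeds each particle independently, the embeddings are mean-pooled
    (an assumed pooling choice), and rho maps the pooled vector to a summary.
    """
    embedded = [dense_tanh(phi_weights, p) for p in particles]
    n_embed = len(phi_weights)
    # math.fsum is correctly rounded, so the pooled value is order-independent.
    pooled = [math.fsum(e[j] for e in embedded) / len(embedded) for j in range(n_embed)]
    return dense_tanh(rho_weights, pooled)


rng = random.Random(0)
phi_weights = make_layer(3, 8, rng)  # 3 hypothetical features per particle
rho_weights = make_layer(8, 4, rng)  # 4-dimensional event summary

event = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(5)]
shuffled = event[:]
rng.shuffle(shuffled)

summary = deep_set_encode(event, phi_weights, rho_weights)
summary_shuffled = deep_set_encode(shuffled, phi_weights, rho_weights)
summary_short = deep_set_encode(event[:2], phi_weights, rho_weights)
```

In a full model the summary vector would condition the normalizing flow; here it only demonstrates the set-encoding step.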
If this is right
- Source parameters for single events can be inferred directly instead of through statistical population studies.
- The calibrated posteriors allow reliable quantification of uncertainties in source properties.
- Primary composition can be determined per event with high accuracy across all mass groups.
- The framework scales to the large event samples expected from current and future observatories.
- It serves as an interface that makes detailed propagation physics usable in Bayesian source inference.
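The composition claim is stated per mass group, which is a stricter statement than overall accuracy: the quoted figure bounds the worst group. The sketch below illustrates the difference on made-up labels. The group names `light` and `heavy` and the function `per_group_accuracy` are hypothetical and are not the paper's mass groups or code.

```python
from collections import Counter


def per_group_accuracy(true_labels, predicted_labels):
    """Return the classification accuracy separately for each true group."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {group: correct[group] / totals[group] for group in totals}


# Illustrative labels only, not results from the paper.
true_labels = ["light", "light", "light", "light", "heavy", "heavy", "heavy", "heavy"]
predicted = ["light", "light", "light", "heavy", "heavy", "heavy", "heavy", "heavy"]

by_group = per_group_accuracy(true_labels, predicted)
overall = sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)
worst_group = min(by_group.values())
```

Reporting `worst_group` alongside `overall` prevents a well-classified majority group from hiding a poorly classified one.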
Where Pith is reading between the lines
- If applied to real data, this could allow matching individual events to specific candidate sources like nearby galaxies or active galactic nuclei.
- The method might be extended to infer properties of the extragalactic magnetic fields in addition to source parameters.
- Combining the inferred posteriors with multi-messenger data from neutrinos or gamma rays could improve source identification.
- Robustness can be tested by retraining or validating on simulations with alternative models of cosmic ray interactions.
Load-bearing premise
The CRPropa 3 simulations with the selected extragalactic magnetic field configurations and particle interaction models are representative enough of real physics that the learned posteriors will be accurate for actual observed events.
What would settle it
Apply the trained model to a set of real ultra-high energy cosmic ray events and check whether the resulting source posterior distributions align with independent catalogs of candidate sources, or test whether the posteriors remain calibrated on new simulations that use different magnetic field strengths or interaction models.
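The second test above can be illustrated with a toy coverage check, sketched here under strong simplifying assumptions. A one-dimensional Gaussian model with an exact conjugate posterior stands in for the trained network, and the noise width `true_noise_sigma` stands in for an unmodelled change in the simulation physics. Nothing in this block comes from the paper; `posterior` and `coverage` are hypothetical names.

```python
import random
from statistics import NormalDist


def posterior(x, noise_sigma=1.0):
    """Exact posterior for theta given x, with prior theta ~ N(0, 1)
    and assumed likelihood x ~ N(theta, noise_sigma**2)."""
    precision = 1.0 + 1.0 / noise_sigma**2
    mean = (x / noise_sigma**2) / precision
    return NormalDist(mean, precision ** -0.5)


def coverage(true_noise_sigma, n_events=4000, level=0.68, seed=1):
    """Fraction of events whose true theta lies in the central credible interval."""
    rng = random.Random(seed)
    tail = (1.0 - level) / 2.0
    hits = 0
    for _ in range(n_events):
        theta = rng.gauss(0.0, 1.0)
        x = theta + rng.gauss(0.0, true_noise_sigma)
        post = posterior(x)  # the "model" always assumes noise_sigma = 1
        lower, upper = post.inv_cdf(tail), post.inv_cdf(1.0 - tail)
        hits += lower <= theta <= upper
    return hits / n_events


in_distribution = coverage(true_noise_sigma=1.0)  # data match the assumed model
shifted = coverage(true_noise_sigma=2.0)          # data generated with wider noise
```

When the data match the assumed model, the empirical coverage should sit near the nominal 68%. Under the shifted noise it falls well below, which is the signature of miscalibration that such a robustness test would look for.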
Original abstract
The identification of ultra-high energy cosmic ray sources is one of the open challenges of high-energy astrophysics. As charged particles travel through the Universe, they are deflected by extragalactic magnetic fields and lose energy through interactions with background radiation, making source inference highly non-trivial. Existing approaches either rely on simplified propagation models or on computationally prohibitive Monte Carlo methods. Here we present a simulation-based inference framework trained on three-dimensional CRPropa 3 propagation simulations that produces calibrated posterior distributions over source energy, distance, direction, and primary composition for individual UHECR events. The model combines a Deep Set encoder, handling the variable number of detected secondary particles, with a normalizing flow, and is trained on approximately 5 million simulated events covering a broad range of extragalactic magnetic field configurations. Validated on held-out simulations, all source parameters are recovered without systematic bias, with directional parameters best constrained and source distance most uncertain, consistent with the underlying propagation physics. Primary composition classification achieves ≥98.2% accuracy across all mass groups. This framework provides a scalable and physically interpretable interface between detailed propagation simulations and Bayesian source inference relevant for current UHECR data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a simulation-based inference framework that uses a Deep Set encoder combined with a normalizing flow, trained on approximately 5 million events from three-dimensional CRPropa 3 propagation simulations, to produce posterior distributions over source energy, distance, direction, and primary composition for individual ultra-high energy cosmic ray events. It reports unbiased parameter recovery and primary composition classification accuracy of at least 98.2% on held-out simulations drawn from the same distribution.
Significance. If the learned posteriors remain calibrated when applied outside the training distribution, the method would offer a computationally scalable route to Bayesian source inference that incorporates detailed 3D propagation physics, addressing a key limitation of both simplified analytic models and per-event Monte Carlo approaches in UHECR astrophysics.
major comments (2)
- [Abstract / validation] Abstract and validation section: all reported performance (unbiased recovery of source parameters and ≥98.2% composition accuracy) is demonstrated exclusively on held-out draws from the identical CRPropa 3 simulation ensemble; no sensitivity tests to changes in extragalactic magnetic field turbulence spectra, source evolution assumptions, or hadronic interaction models are presented, which directly bears on whether the posteriors will remain calibrated for real detector data.
- [Methods] Methods section: quantitative details on the Deep Set architecture (number of layers, pooling operation, embedding dimension), the normalizing flow implementation, training procedure (optimizer, learning rate schedule, batch size), and the exact sampling ranges for EGMF parameters are not provided, preventing assessment of model capacity, reproducibility, and potential sensitivity to hyperparameter choices.
minor comments (2)
- [Abstract] The abstract states 'approximately 5 million simulated events' but the main text should give the precise count and the joint distribution over source parameters and EGMF configurations used for training.
- [Results / figures] Figure captions and text should explicitly state the metrics used to quantify 'no systematic bias' (e.g., posterior mean offset, coverage probability) and how directional constraints are measured (e.g., 68% credible interval solid angle).
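One way the directional metric requested above could be defined is sketched below, as an assumption rather than the paper's definition. The idea is to take posterior direction samples, find the angular radius around their mean direction that contains 68% of them, and report the solid angle of that spherical cap, 2π(1 − cos ψ). The function names are hypothetical. The cap is only a reasonable summary when the posterior is roughly unimodal and symmetric about its mean direction.

```python
import math


def normalize(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def angular_separation(u, v):
    """Angle in radians between two unit vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.acos(dot)


def credible_cap(samples, level=0.68):
    """Mean direction, cap radius containing `level` of samples, and cap solid angle."""
    units = [normalize(s) for s in samples]
    mean = normalize(tuple(sum(u[i] for u in units) for i in range(3)))
    separations = sorted(angular_separation(mean, u) for u in units)
    index = max(0, math.ceil(level * len(separations)) - 1)
    psi = separations[index]
    return mean, psi, 2.0 * math.pi * (1.0 - math.cos(psi))


# Four illustrative samples on a ring 0.1 rad from the z-axis.
s, c = math.sin(0.1), math.cos(0.1)
ring = [(s, 0.0, c), (-s, 0.0, c), (0.0, s, c), (0.0, -s, c)]
mean_direction, psi68, omega68 = credible_cap(ring)
```

For these samples the mean direction is the z-axis and the cap radius equals the ring's polar angle, so `omega68` is the solid angle of a 0.1 rad cap in steradians.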
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below and indicate the revisions planned for the manuscript.
Point-by-point responses
- Referee: [Abstract / validation] Abstract and validation section: all reported performance (unbiased recovery of source parameters and ≥98.2% composition accuracy) is demonstrated exclusively on held-out draws from the identical CRPropa 3 simulation ensemble; no sensitivity tests to changes in extragalactic magnetic field turbulence spectra, source evolution assumptions, or hadronic interaction models are presented, which directly bears on whether the posteriors will remain calibrated for real detector data.
  Authors: We agree that all quantitative validation results are obtained on held-out events drawn from the same CRPropa 3 ensemble used for training, and that no explicit sensitivity tests to variations in EGMF turbulence spectra, source evolution, or hadronic interaction models are included. This is a genuine limitation when considering application to real detector data, where the true underlying physics may differ. The current work focuses on establishing the feasibility and calibration properties of the inference framework within a fixed, well-specified simulation model. In the revised manuscript we will add a new subsection to the Discussion that explicitly states this scope limitation, discusses the implications for posterior calibration on real events, and outlines the additional robustness studies required before deployment on observational data.
  Revision: partial
- Referee: [Methods] Methods section: quantitative details on the Deep Set architecture (number of layers, pooling operation, embedding dimension), the normalizing flow implementation, training procedure (optimizer, learning rate schedule, batch size), and the exact sampling ranges for EGMF parameters are not provided, preventing assessment of model capacity, reproducibility, and potential sensitivity to hyperparameter choices.
  Authors: We acknowledge the omission of these implementation specifics. In the revised Methods section we will supply the missing quantitative information, including the exact Deep Set architecture (number of layers, pooling operation, embedding dimension), the normalizing flow configuration, the complete training protocol (optimizer, learning-rate schedule, batch size), and the precise sampling ranges employed for the EGMF parameters. These additions will enable readers to assess model capacity, reproduce the results, and evaluate sensitivity to hyperparameter choices.
  Revision: yes
Circularity Check
No circularity: posteriors learned and validated on independent simulation draws
The paper trains a Deep Set + normalizing flow model on ~5M CRPropa 3 events and evaluates recovery, bias, and composition accuracy exclusively on held-out draws from the identical simulation distribution. This is standard supervised validation within the training measure; the reported metrics (unbiased parameter recovery, ≥98.2% composition accuracy) are empirical test-set statistics, not quantities that reduce by construction to the training inputs or to any self-citation. No load-bearing step equates a claimed result to a fitted parameter or prior-work ansatz. The framework is self-contained against its stated simulation benchmark.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: CRPropa 3 simulations with the chosen magnetic-field realizations accurately represent the dominant propagation effects for UHECRs.