Deep Learning for Energy Estimation and Particle Identification in Gamma-ray Astronomy

Evgeny Postnikov (1), Alexander Kryukov (1), Stanislav Polyakov (1), Dmitry Zhurov (2, 3) ((1) Lomonosov Moscow State University Skobeltsyn Institute of Nuclear Physics (MSU SINP), Moscow, Russia, (2) Applied Physics Institute of Irkutsk State University (API ISU), Irkutsk, (3) Irkutsk National Research Technical University, Russia)

arxiv: 1907.10480 · v1 · pith:MWIF4LW2new · submitted 2019-07-23 · 🌌 astro-ph.IM · cs.DC · cs.LG · stat.ML

Pith reviewed 2026-05-24 17:08 UTC · model grok-4.3

classification 🌌 astro-ph.IM · cs.DC · cs.LG · stat.ML

keywords convolutional neural networks · gamma-ray astronomy · particle identification · energy estimation · Monte Carlo simulation · deep learning · Cherenkov telescopes

The pith

Convolutional neural networks select gamma-ray events and estimate energies as well as or better than the Hillas method while running faster on GPUs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies convolutional neural networks to gamma-ray event selection and energy estimation using images from Cherenkov telescopes. On simulated data the networks reach selection quality comparable to the standard Hillas analysis and deliver some improvement in energy estimates. Rewriting the code for graphics processing units produces substantially faster run times for both tasks. All performance figures come from Monte Carlo simulations, with checks against real telescope data listed as the next step.

Core claim

Convolutional neural networks adapted to telescope images perform gamma-ray event selection at quality levels matching the Hillas approach and yield improved energy estimates, with both tasks executing significantly faster after complete redevelopment for graphics processing units, as shown on Monte Carlo simulations of the experiment.

What carries the argument

Convolutional neural networks that take telescope camera images as input and output either a classification label for gamma-ray events or a numerical energy estimate.

If this is right

Gamma-ray event selection reaches quality comparable to the Hillas method.
Gamma-ray energy estimation improves over the conventional Hillas-based method.
Both tasks complete in significantly less time after the code is moved to graphics processing units.
All reported results rest on Monte Carlo simulated data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the gains persist on real data, analysis pipelines for other imaging atmospheric Cherenkov telescopes could adopt the same networks.
The GPU version implies the method can scale to the data rates expected from next-generation arrays.
The same image-to-label and image-to-number pattern may transfer to other detector types that record shower images.

Load-bearing premise

Performance measured on Monte Carlo simulations will translate to real data recorded by the telescopes.

What would settle it

A side-by-side comparison of neural-network and Hillas results on actual telescope observations would show whether the reported quality and speed gains hold outside simulation.

Figures

Figures reproduced from arXiv: 1907.10480 by Evgeny Postnikov, Alexander Kryukov, Stanislav Polyakov, Dmitry Zhurov.

**Figure 1.** Simulated image examples generated by a high-energy gamma ray (left) and a proton (right). … to fit the square one. For that purpose, a transformation to an oblique coordinate system was applied to each image, so that each hexagonal image with 560 pixels was transformed to the 31x30 square grid. These square grid images were fed to the input layer of the CNN. For the background suppression, test datasets of gamma-…
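The hexagonal-to-square step described in the Figure 1 text can be sketched as follows. This is only an illustration under stated assumptions: the page does not give the TAIGA camera geometry, so the code uses a toy 19-pixel hexagonal patch rather than the 560-pixel camera, and the function names (`hex_to_oblique`, `to_square_grid`, `hex_patch`) are hypothetical. The idea is that in an oblique (axial) coordinate system every hexagonal pixel has integer indices, which can be used directly as column and row of a square array; cells with no pixel are padded.

```python
import math

def hex_to_oblique(x, y, pitch):
    """Map the Cartesian centre of a hexagonal pixel to integer oblique (q, r) indices.

    Assumes rows are offset by half a pixel pitch and separated by pitch*sqrt(3)/2.
    """
    r = round(y / (pitch * math.sqrt(3) / 2))
    q = round(x / pitch - r / 2)
    return q, r

def to_square_grid(pixels, pitch, fill=0.0):
    """pixels: list of (x, y, amplitude). Returns a row-major list of lists."""
    indexed = [(hex_to_oblique(x, y, pitch), a) for x, y, a in pixels]
    q_min = min(q for (q, _), _ in indexed)
    r_min = min(r for (_, r), _ in indexed)
    n_cols = max(q for (q, _), _ in indexed) - q_min + 1
    n_rows = max(r for (_, r), _ in indexed) - r_min + 1
    grid = [[fill] * n_cols for _ in range(n_rows)]
    for (q, r), a in indexed:
        grid[r - r_min][q - q_min] = a
    return grid

def hex_patch(radius, pitch):
    """Toy camera (hypothetical): a hexagon of pixels, each with amplitude 1.0."""
    pixels = []
    for q in range(-radius, radius + 1):
        for r in range(max(-radius, -q - radius), min(radius, -q + radius) + 1):
            x = pitch * (q + r / 2)
            y = pitch * math.sqrt(3) / 2 * r
            pixels.append((x, y, 1.0))
    return pixels

patch = hex_patch(radius=2, pitch=1.0)   # 19 hexagonal pixels
grid = to_square_grid(patch, pitch=1.0)  # 5x5 square grid, 6 padded cells
```

The square grid is larger than the number of real pixels because the corners of the oblique bounding box fall outside the hexagonal camera; how the original analysis treats those padded cells is not stated on this page.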

**Figure 2.** Convolutional neural network for classification/regression. The network accepts square grid images in the oblique coordinate system at the input of the convolutional layers. The output of the convolutional layers (extracted features) is fed to the classifier/regressor (fully connected layers), which evaluates the output value. Output of the convolutional layer is fed to the fully connected layers of classifier or regresso…

**Figure 3.** Quality factor vs CNN output parameter (a scalar parameter between 0 and 1 characterizing image similarity to gamma-ray or proton). Though for the nearest gamma rays there is no improvement over the simplest conventional technique of a linear proportionality to the image size (section 4.2), for distances above 1° the CNN gives significantly better results, and especially does the CNN predicting the ratio o…
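A curve like the one in Figure 3 can be produced by scanning a cut on the CNN output and computing a quality factor at each cut value. The sketch below assumes the definition commonly used in imaging Cherenkov analyses, Q = ε_γ / √ε_p (fraction of gamma rays kept divided by the square root of the fraction of protons kept); the page does not spell out the paper's exact definition. It also assumes that scores near 1 mean "gamma-like". The function names are hypothetical.

```python
import math

def quality_factor(gamma_scores, proton_scores, threshold):
    """Q = eff_gamma / sqrt(eff_proton) for events with score >= threshold.

    Assumption: a score close to 1 means gamma-like. Returns None when no
    proton survives the cut, because Q is then undefined.
    """
    eff_g = sum(s >= threshold for s in gamma_scores) / len(gamma_scores)
    eff_p = sum(s >= threshold for s in proton_scores) / len(proton_scores)
    if eff_p == 0.0:
        return None
    return eff_g / math.sqrt(eff_p)

def scan_thresholds(gamma_scores, proton_scores, thresholds):
    """Return (threshold, Q) pairs, i.e. the points of a Q-vs-cut curve."""
    return [(t, quality_factor(gamma_scores, proton_scores, t)) for t in thresholds]
```

With no cut at all both efficiencies are 1 and Q = 1, so values above 1 indicate that the cut improves the signal-to-background significance. With few surviving protons the estimate becomes noisy, which is one reason uncertainties on Q matter.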

**Figure 4.** Predicted energy vs true energy: top panel TensorFlow, bottom panel PyTorch (dashed lines are the ‘ideal case’ y=x).

**Figure 5.** Absolute energy error distributions.

**Figure 6.** Relative energy error vs angular distance of the image: TensorFlow (‘TF’, left), PyTorch (‘PT’, right).

Original abstract

Deep learning techniques, namely convolutional neural networks (CNN), have previously been adapted to select gamma-ray events in the TAIGA experiment, having achieved a good quality of selection as compared with the conventional Hillas approach. Another important task for the TAIGA data analysis was also solved with CNN: gamma-ray energy estimation showed some improvement in comparison with the conventional method based on the Hillas analysis. Furthermore, our software was completely redeveloped for the graphics processing unit (GPU), which led to significantly faster calculations in both of these tasks. All the results have been obtained with the simulated data of TAIGA Monte Carlo software; their experimental confirmation is envisaged for the near future.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper applies convolutional neural networks to gamma-ray event selection and energy estimation for the TAIGA experiment, reporting improved selection quality and modest gains in energy reconstruction relative to the conventional Hillas-parameter method. All quantitative results are obtained from TAIGA Monte Carlo simulations; the software was re-implemented for GPUs, yielding substantial speed-ups. The authors explicitly state that experimental confirmation on real data is planned for future work.

Significance. If the reported gains hold on real observations, the CNN approach could improve both analysis quality and throughput for TAIGA and similar IACT arrays. The GPU porting is a practical strength that addresses a common bottleneck in Cherenkov-telescope data processing. However, because every performance number rests on simulated events whose fidelity to real noise, calibration, and atmospheric conditions remains untested, the immediate scientific impact is limited to a proof-of-concept demonstration on Monte Carlo.

major comments (3)

[Abstract, Results] Abstract and Results: no CNN architecture, training procedure, hyper-parameters, loss function, or regularization details are supplied, so the quantitative claims (selection quality, energy resolution) cannot be assessed or reproduced.
[Results] Results: all performance metrics are presented without error bars, bootstrap uncertainties, or statistical significance tests against the Hillas baseline, making it impossible to judge whether the reported improvements exceed statistical fluctuations.
[Results, Discussion] The central comparison is performed exclusively on Monte Carlo events; no section quantifies how well the simulation reproduces the dominant real-data systematics (night-sky background, telescope calibration, atmospheric transmission) that the CNN may exploit differently from Hillas parameters.
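The second comment asks for bootstrap uncertainties on the comparison with the Hillas baseline. A minimal sketch of what that could look like for the energy task is given below, assuming per-event true energies and two sets of predictions on the same simulated events. The RMSE metric, the function names, and the percentile interval are illustrative choices, not the authors' procedure.

```python
import math
import random

def rmse(pred, true):
    """Root-mean-square error between predicted and true values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))

def paired_bootstrap_rmse_diff(true, pred_a, pred_b, n_boot=2000, seed=0):
    """Percentile 95% interval for RMSE(a) - RMSE(b).

    Events are resampled with replacement, and the same resampled events are
    used for both methods so that the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(true)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(rmse(a, t) - rmse(b, t))
    diffs.sort()
    lo = diffs[int(0.025 * n_boot)]
    hi = diffs[int(0.975 * n_boot) - 1]
    return lo, hi
```

If `pred_a` holds the CNN estimates and `pred_b` the Hillas-based ones, an interval that lies entirely below zero would suggest that the CNN's lower RMSE is not just a resampling fluctuation on that simulated sample.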

minor comments (1)

[Figures] Figure captions and axis labels should explicitly state whether the plotted distributions are for simulated gamma rays, protons, or a mixture.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and indicate where revisions will be made.

Referee: [Abstract, Results] Abstract and Results: no CNN architecture, training procedure, hyper-parameters, loss function, or regularization details are supplied, so the quantitative claims (selection quality, energy resolution) cannot be assessed or reproduced.

Authors: We agree that these implementation details are required for reproducibility. The revised manuscript will include a new subsection in the Methods describing the CNN architecture (number of layers, filter sizes, activation functions), training procedure (data splitting, optimizer, batch size), hyper-parameters, loss function, and regularization methods (e.g., dropout).
revision: yes

Referee: [Results] Results: all performance metrics are presented without error bars, bootstrap uncertainties, or statistical significance tests against the Hillas baseline, making it impossible to judge whether the reported improvements exceed statistical fluctuations.

Authors: We accept this criticism. The revised Results section will report all metrics with bootstrap-derived uncertainties and will include statistical comparisons (e.g., paired significance tests) between CNN and Hillas performance to demonstrate that observed gains are not due to statistical fluctuations.
revision: yes

Referee: [Results, Discussion] The central comparison is performed exclusively on Monte Carlo events; no section quantifies how well the simulation reproduces the dominant real-data systematics (night-sky background, telescope calibration, atmospheric transmission) that the CNN may exploit differently from Hillas parameters.

Authors: We agree this is a limitation. As already stated in the manuscript, experimental validation on real data is planned for future work and is outside the scope of the present proof-of-concept study. We will expand the Discussion to describe the Monte Carlo modeling of night-sky background, calibration, and atmospheric transmission in greater detail and to note possible differential sensitivities of CNN versus Hillas methods to any residual mismatches.
revision: partial

standing simulated objections not resolved

Quantitative assessment of Monte Carlo fidelity to real-data systematics (night-sky background, calibration, atmospheric transmission), which requires analysis of actual TAIGA observations planned for future work.

Circularity Check

0 steps flagged

No circularity: empirical ML comparison on Monte Carlo with no derivation chain

The manuscript reports an empirical application of convolutional neural networks to gamma-ray event selection and energy estimation in the TAIGA experiment, with direct performance comparisons against the Hillas method performed on the same Monte Carlo simulation sample. No mathematical derivation, functional form, or uniqueness theorem is claimed or presented; results consist of measured quality metrics and runtime improvements on simulated data, with explicit statement that experimental confirmation remains future work. The single reference to prior adaptation of CNNs does not serve as load-bearing justification for any claimed result, as the present claims rest on the new empirical measurements themselves.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, no training details, and no explicit modeling assumptions beyond the standard use of Monte Carlo simulations; therefore no free parameters, axioms, or invented entities can be identified.

pith-pipeline@v0.9.0 · 5724 in / 1120 out tokens · 20869 ms · 2026-05-24T17:08:28.022865+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

Relation between the paper passage and the cited Recognition theorem.

Convolutional neural networks ... for background suppression and gamma-ray energy estimation ... on TAIGA Monte Carlo ... Q-factor ... RMSE ... compared with Hillas

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

Relation between the paper passage and the cited Recognition theorem.

All results obtained with simulated data ... experimental confirmation envisaged

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

[1] Weekes, T.C. et al. [Whipple collaboration]: Observation of TeV gamma rays from the Crab Nebula using the atmospheric Cerenkov imaging technique. Astroph. J. 342, 379 (1989)

[2] Lorenz, E., Wagner, R.: Very-high energy gamma-ray astronomy. EPJ H 37, 459 (2012)

[3] Bychkov, I. et al.: Russian–German Astroparticle Data Life Cycle Initiative. Data 3(4), 56 (2018)

[4] Astroparticle.online Homepage, https://astroparticle.online. Last accessed 15 May 2019

[5] TAIGA Homepage, http://taiga-experiment.info. Last accessed 15 May 2019

[6] KASCADE Homepage, http://web.ikp.kit.edu/KASCADE. Last accessed 15 May 2019

[7] Postnikov, E.B., et al.: Hybrid method for identifying mass groups of primary cosmic rays in the joint operation of IACTs and wide angle Cherenkov timing arrays. J. Phys.: Conf. Series 798, 012030 (2017)

[8] Nieto, D. et al. for the CTA Consortium: Exploring deep learning as an event classification method for the Cherenkov Telescope Array. Proceedings of Science 301, PoS(ICRC2017)809 (2017)

[9] Shilon, I. et al.: Application of deep learning methods to analysis of imaging atmospheric Cherenkov telescopes data. Astroparticle Physics 105, 44–53 (2019)

[10] Feng, Q., Jarvis, J. on behalf of the VERITAS Collaboration: A citizen-science approach to muon events in imaging atmospheric Cherenkov telescope data: the Muon Hunter. Proceedings of Science 301, PoS(ICRC2017)826 (2017)

[11] Hillas, A.M.: Cerenkov light images of EAS produced by primary gamma rays and by nuclei. In: Proc. 19th Int. Cosmic Ray Conf., La Jolla, 1985, p. 445. NASA, Washington, D.C. (1985)

[12] PyTorch Homepage, http://pytorch.org. Last accessed 15 May 2019

[13] TensorFlow Homepage, http://www.tensorflow.org. Last accessed 15 May 2019

[14] Heck, D. et al.: CORSIKA: A Monte Carlo Code to Simulate Extensive Air Showers. Report FZKA 6019. Forschungszentrum Karlsruhe (1998)

[15] Hofmann, W. et al.: Improved energy resolution for VHE gamma ray astronomy with systems of Cherenkov telescopes. Astroparticle Physics 12, 207–216 (2000)

[16] Hanley, J.A., McNeil, B.J.: The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143, 29–36 (1982)

[17] Postnikov, E.B., et al.: Gamma/Hadron Separation in Imaging Air Cherenkov Telescopes Using Deep Learning Libraries TensorFlow and PyTorch. J. Phys.: Conf. Series 1181, 012048 (2019)

[18] Dhar, V.K. et al.: ANN-based energy reconstruction procedure for TACTIC γ-ray telescope and its comparison with other conventional methods. Nucl. Instrum. Meth. A 606, 795–805 (2009)
