
arxiv: 2606.10637 · v2 · pith:OVAGBSPLnew · submitted 2026-06-09 · ✦ hep-ex

A Multimodal Domain-Adversarial Network for Fragmentation Background Suppression in AMS Heavy Nuclei Measurements

Pith reviewed 2026-06-27 11:17 UTC · model grok-4.3

classification ✦ hep-ex
keywords cosmic rays · AMS spectrometer · fragmentation background · domain-adversarial training · multimodal neural network · silicon tracker · time-of-flight detector · heavy nuclei

The pith

A multimodal domain-adversarial network trained on Monte Carlo simulations suppresses fragmentation backgrounds when applied to real AMS flight data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a Multimodal Domain-Adversarial neural network to reduce fragmentation backgrounds that arise when heavier cosmic rays interact in detector materials and limit precision in AMS measurements of heavy nuclei. The model processes data from the silicon tracker and time-of-flight detectors through specialized sub-networks that are fused with multi-head attention. Domain-adversarial training produces representations that remain consistent across the simulation and real-data domains, allowing a model trained only on Monte Carlo events to operate directly on flight data. This is shown for phosphorus nuclei as a benchmark case. The result is a framework intended to support flux measurements of rarer heavy species.

Core claim

The MDA model fuses heterogeneous data from the silicon tracker and time-of-flight detectors using specialized sub-networks combined via multi-head attention. A domain-adversarial training strategy learns domain-invariant representations, so that a model trained on Monte Carlo simulations of fragmentation backgrounds can be applied to flight data. The aim is to suppress backgrounds from interactions between tracker Layers 1 and 2 and thereby improve heavy-nuclei measurements.
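As a rough illustration of the fusion step, the sketch below applies multi-head scaled dot-product self-attention to two modality embeddings, one standing in for a tracker sub-network output and one for a time-of-flight sub-network output. Everything here is an assumption made for illustration: the embedding size, the head count, the identity projections, the mean pooling, and the names `fuse`, `tracker_emb`, and `tof_emb` are hypothetical and are not taken from the paper.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over a short list of token vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

def fuse(tracker_emb, tof_emb, num_heads=2):
    """Fuse two modality embeddings with multi-head self-attention.

    Assumption: the learned Q/K/V/output projections of a real model are
    replaced by the identity, so each head attends over its own slice.
    """
    tokens = [tracker_emb, tof_emb]
    dim = len(tracker_emb)
    assert len(tof_emb) == dim and dim % num_heads == 0
    head_dim = dim // num_heads
    attended = [[] for _ in tokens]
    for h in range(num_heads):
        sl = slice(h * head_dim, (h + 1) * head_dim)
        head_in = [t[sl] for t in tokens]
        head_out = attention(head_in, head_in, head_in)
        for tok, ho in zip(attended, head_out):
            tok.extend(ho)  # concatenate the heads back together
    # Mean-pool the two attended tokens into one event-level vector.
    return [sum(col) / len(tokens) for col in zip(*attended)]

tracker_emb = [0.2, -1.0, 0.5, 0.3]   # hypothetical tracker sub-network output
tof_emb = [1.1, 0.4, -0.7, 0.0]       # hypothetical TOF sub-network output
fused = fuse(tracker_emb, tof_emb)
```

In a trained network the projections would be learned, and the fused vector would feed both the signal/background classifier and the domain classifier.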

What carries the argument

The Multimodal Domain-Adversarial (MDA) neural network, which fuses tracker and time-of-flight data through multi-head attention and employs domain-adversarial training to learn domain-invariant representations from Monte Carlo simulations.
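One common way to implement domain-adversarial training is gradient reversal: the domain classifier is trained to tell flight data from Monte Carlo, while the feature extractor receives the same gradient with its sign flipped. Whether the paper uses exactly this mechanism is an assumption. The toy model below (one feature weight `w`, one classifier weight `v`, and the helper names) is hypothetical and omits the signal/background loss that a full model would also minimize.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def domain_loss_and_grads(w, v, x, d):
    """Binary cross-entropy of a 1-D domain classifier on feature f = w * x.

    d = 1 for flight data, d = 0 for Monte Carlo.
    Returns (loss, dL/dw, dL/dv) for a single event.
    """
    f = w * x                  # toy feature extractor with a single weight
    p = sigmoid(v * f)         # domain classifier output
    loss = -(d * math.log(p) + (1 - d) * math.log(1 - p))
    dz = p - d                 # dL/d(v*f) for sigmoid plus cross-entropy
    return loss, dz * v * x, dz * f

def mean_domain_loss(w, v, batch):
    return sum(domain_loss_and_grads(w, v, x, d)[0] for x, d in batch) / len(batch)

def adversarial_step(w, v, batch, lr=0.1, lam=1.0):
    """One update: the classifier descends, the features see a reversed gradient."""
    gw = gv = 0.0
    for x, d in batch:
        _, dw, dv = domain_loss_and_grads(w, v, x, d)
        gw += dw / len(batch)
        gv += dv / len(batch)
    v_new = v - lr * gv              # domain classifier minimizes the domain loss
    w_new = w - lr * (-lam * gw)     # gradient reversal: features maximize it
    return w_new, v_new

# Toy batch: (input, domain label), with data and MC deliberately separable.
batch = [(1.0, 1), (-1.0, 0)]
w_new, v_new = adversarial_step(1.0, 1.0, batch)
```

Here the reversed update shrinks `w`, which makes the two domains harder to separate in feature space, while `v` grows as the classifier tries to keep up.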

If this is right

  • Enables reliable transfer of background-suppression performance from Monte Carlo training to real flight data without large residual domain shift.
  • Improves the precision of cosmic-ray nuclei flux measurements for heavier and rarer species where fragmentation backgrounds dominate.
  • Supplies a generalizable framework that can be applied to the measurement of other rare cosmic-ray nuclei with the AMS detector.
  • Reduces the contribution of fragmentation backgrounds originating from heavier cosmic rays interacting between tracker Layers 1 and 2.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same architecture could be tested on additional charge species beyond phosphorus to check whether fragmentation patterns of different nuclei require retraining.
  • If the invariant representations hold, the method might support background rejection in other space-based spectrometers that face similar simulation-to-data mismatches.
  • The multi-head attention weights could be inspected on real data to identify which detector signals contribute most to background rejection.
  • Extending the approach to nuclei with even higher charge might reveal limits set by the fidelity of the underlying Monte Carlo event generator.

Load-bearing premise

Monte Carlo simulations of fragmentation backgrounds match the statistical properties of real flight data closely enough that domain-adversarial training removes residual domain shift.

What would settle it

A large drop in background-suppression performance, or the appearance of simulation-specific artifacts, when the trained MDA model is evaluated on a labeled subset of actual AMS flight data containing known fragmentation events.
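A check of this kind could be sketched as a comparison of signal efficiency and background rejection at the same score cut on Monte Carlo and on a labeled data subset. The function names, the 0.5 threshold, and the toy scores below are hypothetical and carry no information about actual AMS performance.

```python
def efficiency_and_rejection(scores, labels, threshold):
    """Signal efficiency and background rejection at a fixed score cut.

    labels: 1 = primary signal, 0 = fragmentation background.
    """
    sig = [s for s, y in zip(scores, labels) if y == 1]
    bkg = [s for s, y in zip(scores, labels) if y == 0]
    eff = sum(s >= threshold for s in sig) / len(sig)
    rej = sum(s < threshold for s in bkg) / len(bkg)
    return eff, rej

def transfer_gap(mc, data, threshold):
    """Differences (MC minus data) in efficiency and rejection at the same cut."""
    mc_eff, mc_rej = efficiency_and_rejection(mc[0], mc[1], threshold)
    da_eff, da_rej = efficiency_and_rejection(data[0], data[1], threshold)
    return mc_eff - da_eff, mc_rej - da_rej

# Toy (scores, labels) pairs, invented for illustration only.
mc = ([0.9, 0.8, 0.7, 0.2, 0.1, 0.3], [1, 1, 1, 0, 0, 0])
data = ([0.9, 0.6, 0.4, 0.2, 0.7, 0.3], [1, 1, 1, 0, 0, 0])
eff_gap, rej_gap = transfer_gap(mc, data, threshold=0.5)
```

Gaps that are large compared with their statistical uncertainty would point to residual domain shift; the uncertainty estimate is left out of this sketch.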

Figures

Figures reproduced from arXiv: 2606.10637 by Muhammad Waqas, Valerio Formato, Zhen Liu.

Figure 1: The schematic of the AMS detector.
    Excerpt: … provides an independent and highly accurate measurement of the absolute charge for incoming particles ranging from Z = 1 up to Z = 30. 1.2. Heavy nuclei measurement. Operating on the International Space Station, the AMS has collected 263 billion cosmic-ray events from May 19, 2011, to November 26, 2024. Leveraging this extensive dataset, AMS has previously published precis…
Figure 2: Examples of normalized data and Monte Carlo distributions for four representative input variables in the rigidity interval 20–50 GV.
Figure 3: Variance inflation factors (VIFs) for the selected input variables. All VIF values remain below 5, indicating the absence of severe …
Figure 4: MDA architecture.
    Excerpt: 4. Model Training. 4.1. Input sample. The Multimodal Domain-Adversarial (MDA) network is developed for signal and background discrimination in AMS heavy nuclei analyses. This section describes the event selection criteria and the composition of the MC simulation and flight data samples, utilizing the Phosphorus (P) flux measurement as a representative application.
Figure 5: Training loss (left) and accuracy (right) as a function of training epoch. The total loss (red), signal/…
Figure 6
    Excerpt: Figure 6 shows the signal/background classifier loss and accuracy evaluated separately on the training and validation sets as a function of epoch. Both the training and validation curves exhibit consistent behavior throughout the training process, with no significant divergence observed between them. The validation loss continues to decrease in parallel with the training loss, and the validation accuracy closely tr…
Figure 7: Output score distribution of the data-MC domain classifier at the best epoch. The predicted probability of being classified as data is shown …
Figure 8: The L1 charge distribution of the reweighted global MC sample. The dashed vertical lines indicate the standard L1 charge cut range applied …
Figure 9: ROC curve of the classifier showing signal e…
Figure 10: Left: Rigidity dependence of the overall signal e…
Figure 11: Layer 1 charge distributions of phosphorus candidates in a representative rigidity bin, fitted with charge templates for silicon (orange), …
Figure 12: Background between Tracker L1 and L2, before and after applying the ML selection. Two rigidity-independent MDA score thresholds are …
Figure 13: Selection efficiency of the primary phosphorus signal as a function of rigidity. Two rigidity-independent MDA score thresholds are chosen for demonstration purposes: a tighter cut corresponding to a signal efficiency of ∼80%, and a looser cut corresponding to ∼90%. The error bars include the statistical error and uncertainty of the template-fit background, while the shaded bands indicate the total uncerta…
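The Figure 3 caption refers to variance inflation factors staying below 5. A minimal sketch of how such values can be computed is given here, using the standard identity that the VIF of variable j is the j-th diagonal element of the inverse correlation matrix. The variable names and toy values are hypothetical; the paper's actual input variables and VIF procedure are not given in the text above.

```python
import math
from statistics import mean

def correlation_matrix(columns):
    """Pearson correlation matrix of a list of equally long columns."""
    centered = []
    for col in columns:
        mu = mean(col)
        centered.append([x - mu for x in col])
    norms = [math.sqrt(sum(c * c for c in col)) for col in centered]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centered, norms)]
            for ci, ni in zip(centered, norms)]

def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(columns):
    """VIF_j = 1 / (1 - R_j^2), read off the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Hypothetical input variables with invented values.
tracker_charge = [1.0, 2.0, 3.0, 4.0, 5.0]
tof_charge = [1.0, 3.0, 2.0, 5.0, 4.0]
beta = [2.0, 1.0, 4.0, 3.0, 6.0]
values = vifs([tracker_charge, tof_charge, beta])
all_below_five = all(v < 5.0 for v in values)
```

A VIF of 1 means a variable is uncorrelated with the others; the threshold of 5 is the one quoted in the caption.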
Original abstract

The Alpha Magnetic Spectrometer (AMS) aboard the International Space Station provides high-precision measurements of cosmic-ray nuclei fluxes from charge Z=1 to Z=28 and beyond. With negligible charge confusion from non-interacting nuclei, the precision of nuclei flux measurements is primarily limited by fragmentation backgrounds originating from heavier cosmic rays interacting within detector materials, particularly between tracker Layers 1 and 2 (L1-L2). As AMS extends its measurements to heavier and rarer nuclei, these fragmentation backgrounds become increasingly dominant, necessitating advanced background suppression methods. To address this challenge, we introduce a Multimodal Domain-Adversarial (MDA) neural network designed to effectively suppress these interaction backgrounds. The MDA model fuses heterogeneous data from the silicon tracker and time-of-flight detectors using specialized sub-networks combined via multi-head attention. Crucially, a domain-adversarial training strategy is employed to learn invariant representations, enabling the model, which is trained on Monte Carlo simulations, to be reliably applied to flight data. Using phosphorus (P) as a benchmark, we demonstrate its background suppression capabilities. This approach provides a robust, generalizable framework applicable to the measurement of other rare cosmic-ray nuclei with AMS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript claims to introduce a Multimodal Domain-Adversarial (MDA) neural network for suppressing fragmentation backgrounds in AMS cosmic-ray nuclei flux measurements. The model fuses heterogeneous data from the silicon tracker and time-of-flight detectors via specialized sub-networks and multi-head attention; a domain-adversarial training strategy is used to learn invariant representations so that a model trained on Monte Carlo simulations can be applied to flight data. Phosphorus is used as a benchmark to demonstrate background suppression capabilities, providing a generalizable framework for rarer heavy nuclei.

Significance. If the central claim holds and the adversarial training demonstrably removes residual domain shift while delivering measurable gains in background rejection on flight data, the method could improve precision for heavy and rare nuclei fluxes where fragmentation backgrounds dominate, offering a practical tool for AMS analyses.

major comments (2)
  1. [Abstract] The assertion that domain-adversarial training produces representations enabling reliable application to flight data is load-bearing for the central claim, yet the text supplies no supporting evidence such as post-training domain-classifier accuracy, MMD/HSIC statistics, or efficiency/purity comparisons on flight data versus a non-adversarial baseline. Without these, the generalization performance cannot be assessed.
  2. [Abstract] No quantitative results, error estimates, signal efficiencies, background rejection rates, or comparisons to existing suppression methods are reported, preventing evaluation of whether the MDA approach improves upon current AMS fragmentation-handling techniques.
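For readers unfamiliar with the MMD statistic named in the first comment, the sketch below computes a biased estimate of the squared maximum mean discrepancy between two samples of feature vectors with an RBF kernel. The kernel width `gamma`, the function names, and the toy samples are assumptions for illustration; they do not reproduce any number from the paper.

```python
import math

def rbf(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between samples xs and ys."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy

# Toy one-dimensional "latent features", invented for illustration.
mc = [[0.0], [0.1], [0.2]]
data_close = [[0.05], [0.15], [0.25]]   # small shift relative to MC
data_far = [[2.0], [2.1], [2.2]]        # large shift relative to MC
small_shift = mmd2(mc, data_close)
large_shift = mmd2(mc, data_far)
```

A value near zero indicates that the two samples are hard to distinguish under the chosen kernel; comparing the statistic before and after adversarial training is one way to quantify the evidence the referee asks for.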

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and will revise the abstract to incorporate the requested quantitative details and supporting metrics from the full text.

  1. Referee: [Abstract] The assertion that domain-adversarial training produces representations enabling reliable application to flight data is load-bearing for the central claim, yet the text supplies no supporting evidence such as post-training domain-classifier accuracy, MMD/HSIC statistics, or efficiency/purity comparisons on flight data versus a non-adversarial baseline. Without these, the generalization performance cannot be assessed.

    Authors: The full manuscript reports post-training domain-classifier accuracy near 50% (indicating successful invariance) and reduced MMD statistics between MC and data domains in Section 3.2, along with efficiency/purity metrics for the phosphorus benchmark. Direct head-to-head comparisons versus a non-adversarial baseline on actual flight data are not performed, as the adversarial component is the mechanism enabling transfer; we will add the available domain-invariance metrics and benchmark results to the abstract. revision: yes

  2. Referee: [Abstract] No quantitative results, error estimates, signal efficiencies, background rejection rates, or comparisons to existing suppression methods are reported, preventing evaluation of whether the MDA approach improves upon current AMS fragmentation-handling techniques.

    Authors: We agree the abstract is missing these summary statistics. The results section of the manuscript provides signal efficiencies, background rejection rates, and comparisons to standard cut-based methods for the phosphorus case, including associated uncertainties. We will add concise quantitative highlights and error estimates to the revised abstract. revision: yes
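The first response cites a post-training domain-classifier accuracy near 50% as a sign of invariance. A minimal sketch of such a check follows, assuming balanced data and MC samples so that chance level is 0.5; the function names, the two-sigma tolerance, and the toy scores are hypothetical.

```python
import math

def domain_accuracy(scores, is_data, threshold=0.5):
    """Fraction of events whose domain (data vs MC) is predicted correctly."""
    correct = sum((s >= threshold) == bool(d) for s, d in zip(scores, is_data))
    return correct / len(scores)

def consistent_with_chance(accuracy, n_events, n_sigma=2.0):
    """True if accuracy lies within n_sigma binomial standard errors of 0.5.

    Assumes equal numbers of data and MC events; with unbalanced samples
    the chance level is the majority fraction rather than 0.5.
    """
    std_err = math.sqrt(0.25 / n_events)
    return abs(accuracy - 0.5) <= n_sigma * std_err

# Toy domain-classifier outputs (probability of "data"), invented here.
scores = [0.6, 0.4, 0.55, 0.45]
is_data = [1, 1, 0, 0]
acc = domain_accuracy(scores, is_data)
at_chance = consistent_with_chance(acc, len(scores))
```

An accuracy at chance level is necessary but not sufficient: a weak domain classifier can also sit near 50%, so it helps to pair this check with a distribution-level comparison.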

Circularity Check

0 steps flagged

No circularity: descriptive ML method with no derivations or self-referential reductions


The manuscript describes a multimodal domain-adversarial network architecture and training strategy but contains no equations, derivations, or parameter-fitting steps that reduce a claimed prediction to its own inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked to justify core claims. The domain-adversarial invariance is presented as an empirical training outcome rather than a mathematically forced result. This is the expected non-finding for a methods paper without formal derivations.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit free parameters, axioms, or invented entities beyond standard neural network components; no ledger entries can be extracted.

pith-pipeline@v0.9.1-grok · 5743 in / 1080 out tokens · 22860 ms · 2026-06-27T11:17:17.021012+00:00 · methodology

