arXiv: 2604.11834 · v1 · submitted 2026-04-12 · physics.ins-det · hep-ex · hep-ph

An AI-based Detector Simulation and Reconstruction Model for the ALEPH Experiment at LEP

Ya-Feng Lo, Dmitrii Kobylianskii, Benjamin Nachman

Pith reviewed 2026-05-10 16:21 UTC · model grok-4.3

classification physics.ins-det · hep-ex · hep-ph

keywords detector simulation · generative models · ALEPH experiment · LEP collider · machine learning · event reconstruction · legacy data analysis

The pith

A generative model trained on simulations accurately reproduces the ALEPH detector response at event, jet, and particle levels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether Parnassus, a neural generative model, can stand in for the full detector simulation and reconstruction chain of the ALEPH experiment at LEP. When trained only on simulated electron-positron collisions that produce Z bosons decaying to quark pairs, the model matches the original simulation outputs across entire events, reconstructed jets, and individual particles. A sympathetic reader cares because resurrecting decades-old software for historical data is often impractical, so a learned replacement would let physicists reanalyze legacy LEP datasets with modern tools. The clean, low-pileup environment of LEP serves as a controlled test case that shows these methods transfer beyond the LHC detectors they were first built for.

Core claim

Parnassus, trained exclusively on simulated e+e- → Z → qq̄ events processed through the ALEPH detector simulation and reconstruction, faithfully reproduces the detector response at the event, jet, and particle levels. The clean e+e- environment without pileup provides a well-controlled benchmark. The results show that modern neural-network-based generative approaches generalize to historical collider experiments with different detector geometries and physics conditions, offering a practical tool for legacy data analysis where archival software is difficult to maintain.

What carries the argument

Parnassus, a generative neural network that learns to map particle-level inputs to full detector-level outputs including simulation and reconstruction effects.

If this is right

Analysts can generate large numbers of ALEPH-like events without running the original simulation software.
Legacy LEP datasets become accessible for new measurements using current computational methods.
The same training approach can be applied to other historical detectors with similar clean collision environments.
Detector response modeling no longer requires maintaining the full original reconstruction code base.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Combining Parnassus with real ALEPH data could allow the model to learn and correct residual simulation imperfections.
Similar generative models could serve as living archives that preserve detector knowledge long after the hardware and code are gone.
The technique might scale to more complex final states once training data from those topologies are included.
The approach creates a route to standardizing simulation across multiple old and new experiments under one learned framework.

Load-bearing premise

Training exclusively on simulated data from the full ALEPH simulation and reconstruction chain is sufficient to capture all relevant detector effects and response without real data or hardware-specific tuning.

What would settle it

A statistically significant mismatch in particle-level momentum spectra, jet energy scales, or event-shape variables between Parnassus outputs and the original ALEPH simulation chain on held-out test events would show the reproduction is incomplete.
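A check of this kind can be sketched with a two-sample Kolmogorov-Smirnov test on a single observable. The sketch below is illustrative only and is not the paper's validation procedure: the toy Gaussian samples stand in for a held-out reference-simulation sample and a generated sample, and the function names `ks_statistic` and `ks_critical_value` are hypothetical helpers introduced here.

```python
from bisect import bisect_right
import math
import random


def ks_statistic(reference, generated):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    ref = sorted(reference)
    gen = sorted(generated)
    d = 0.0
    # Empirical CDFs only change at sample values, so checking those suffices.
    for x in ref + gen:
        f_ref = bisect_right(ref, x) / len(ref)
        f_gen = bisect_right(gen, x) / len(gen)
        d = max(d, abs(f_ref - f_gen))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Asymptotic rejection threshold for the two-sample KS test at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


# Toy stand-ins (assumption): a "reference" observable and a "generated" one
# whose mean is deliberately shifted, so the test has something to detect.
random.seed(12345)
reference = [random.gauss(45.0, 5.0) for _ in range(2000)]
shifted = [random.gauss(46.5, 5.0) for _ in range(2000)]

d = ks_statistic(reference, shifted)
threshold = ks_critical_value(len(reference), len(shifted))
print(f"D = {d:.3f}, 5% threshold = {threshold:.3f}, mismatch flagged: {d > threshold}")
```

In practice the test would be run per observable on held-out events, and testing many observables at once raises the chance of a false alarm, so the significance level would need adjusting accordingly.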

Figures

Figures reproduced from arXiv: 2604.11834 by Benjamin Nachman, Dmitrii Kobylianskii, Ya-Feng Lo.

**Figure 2.** Event-level results. Comparison between the ALEPH particle-flow reference simulation (blue filled), Parnassus …

**Figure 3.** Event-level results. Comparison between the ALEPH particle-flow reference simulation (blue filled), Parnassus (red …

**Figure 4.** Jet-level results. Comparison between the reference simulation (blue filled) and Parnassus (red line), and Delphes …

**Figure 5.** Particle-level results. Comparison between the reference simulation (blue filled) and Parnassus (red line), and Delphes …

**Figure 6.** Particle-level results. Comparison between the reference simulation (blue filled) and Parnassus (red line), and Delphes …

Original abstract

We present the application of Parnassus, a generative model for full detector simulation and reconstruction, to the ALEPH detector at the Large Electron-Positron Collider (LEP). Training on simulated $e^+e^- \to Z \to q\bar{q}$ events processed through the ALEPH detector simulation and reconstruction, we demonstrate that Parnassus faithfully reproduces the detector response at the event, jet, and particle levels. The clean $e^+e^-$ environment, free of pileup and characterized by simple event topologies, provides a well-controlled benchmark for evaluating the generative model's fidelity. Our results demonstrate that modern neural-network-based generative simulation approaches, developed primarily for LHC experiments, generalize naturally to historical collider experiments with distinct detector geometries and physics environments. This work shows that Parnassus can be applied beyond the LHC context and serves as an important tool for legacy data analysis where archival software tools are challenging to resurrect.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Parnassus reproduces ALEPH simulation outputs at event, jet, and particle level but offers no comparison to real LEP data.

The core result is that an existing generative model trained only on Monte Carlo events passed through the full ALEPH simulation and reconstruction chain can match those same simulated distributions. The paper tests this on clean Z → qq̄ events, which is a reasonable choice for a controlled benchmark with no pileup and simple topologies. That part of the work is straightforward and shows the model architecture is not locked to LHC-style detectors or high-pileup conditions. The generalization check is the actual new piece here, even if it is an application rather than a new method. It gives a data point that these tools can handle older, simpler collider environments without major redesign. If the full paper includes side-by-side plots or quantitative metrics like Kolmogorov-Smirnov distances or efficiency ratios, those would be the concrete evidence worth looking at.

The main limitation is that all training and testing stays inside the simulation. The abstract and setup give no indication of any direct match to recorded LEP data, so any inaccuracies in the legacy ALEPH simulation (tracking resolution, calorimeter response, particle ID) would simply be learned and reproduced. For legacy data reanalysis, that sim-to-real step is usually the one that matters most, and it is not addressed.

This paper is useful for groups already working on detector simulation with generative models or for people trying to revive old e+e- datasets where the original software is hard to maintain. A reader who follows the Parnassus line of work or needs a practical example on historical detectors would get value from the demonstration. It is worth sending to peer review because the benchmark is well chosen and the results, if the quantitative checks hold, provide a clear extension that referees can evaluate on the details of the validation.
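The side-by-side comparison the letter asks for is often presented as a binned ratio panel plus a summary chi-square. The following is a minimal sketch under stated assumptions, not a description of what the paper does: the helper names (`histogram`, `ratio_with_errors`, `chi2_per_bin`) are hypothetical, the histograms are unweighted, and the chi-square form used assumes both samples contain the same number of events.

```python
import math


def histogram(values, edges):
    """Count values into bins [edges[i], edges[i+1]); out-of-range values are dropped."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts


def ratio_with_errors(ref_counts, gen_counts):
    """Per-bin generated/reference ratio with Poisson error propagation.

    Bins where either count is zero are reported as None.
    """
    out = []
    for r, g in zip(ref_counts, gen_counts):
        if r == 0 or g == 0:
            out.append(None)
            continue
        ratio = g / r
        err = ratio * math.sqrt(1.0 / g + 1.0 / r)
        out.append((ratio, err))
    return out


def chi2_per_bin(ref_counts, gen_counts):
    """Chi-square between two unweighted, equal-size histograms, per populated bin."""
    chi2, used = 0.0, 0
    for r, g in zip(ref_counts, gen_counts):
        if r + g == 0:
            continue  # empty in both samples: carries no information
        chi2 += (r - g) ** 2 / (r + g)
        used += 1
    return chi2 / used if used else 0.0


# Tiny illustrative inputs (assumption): two hand-written count vectors.
ref = [120, 340, 510, 330, 110]
gen = [115, 352, 498, 341, 104]
for item in ratio_with_errors(ref, gen):
    print(item)
print("chi2 per bin:", chi2_per_bin(ref, gen))
```

A ratio consistent with 1 within its uncertainty in every bin, together with a chi-square per bin of order 1, is the kind of quantitative statement the referee comments ask the authors to make explicit.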

Referee Report

2 major / 1 minor

Summary. The manuscript applies the Parnassus generative model for full detector simulation and reconstruction to the ALEPH experiment at LEP. It trains exclusively on Monte Carlo e+e- → Z → qq̄ events processed through the complete ALEPH simulation and reconstruction chain, claiming that the model faithfully reproduces detector response at the event, jet, and particle levels. The work presents the clean LEP environment as a controlled benchmark and argues that LHC-developed AI techniques generalize to historical detectors, offering a tool for legacy data analysis where archival software is difficult to maintain.

Significance. If the central claim holds with proper validation, the result would enable modern re-analyses of archival LEP data without resurrecting legacy code and would demonstrate transferability of generative simulation methods across collider eras and detector designs. The absence of quantitative fidelity metrics and real-data comparisons in the current text, however, limits the demonstrated impact to an internal consistency check within the simulation.

major comments (2)

[Abstract] the claim that Parnassus 'faithfully reproduces the detector response at the event, jet, and particle levels' is stated without any quantitative metrics, error bars, distribution comparisons, or numerical fidelity measures, leaving the central assertion unsupported in detail.
[Abstract] training is performed exclusively on events from the full ALEPH Monte Carlo simulation and reconstruction chain, yet the manuscript provides no comparison of generated distributions to real LEP data; this gap directly affects the claim of reproducing actual detector response rather than merely the simulation software.

minor comments (1)

[Abstract] The abstract would be strengthened by explicitly naming the observables or summary statistics used to assess fidelity at each level (event, jet, particle).

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback on our manuscript. We address each major comment point by point below, indicating where revisions have been made to the next version of the paper.

Referee: [Abstract] the claim that Parnassus 'faithfully reproduces the detector response at the event, jet, and particle levels' is stated without any quantitative metrics, error bars, distribution comparisons, or numerical fidelity measures, leaving the central assertion unsupported in detail.

Authors: We agree that the abstract would benefit from explicit quantitative support. In the revised manuscript we have updated the abstract to reference the fidelity metrics, statistical comparisons, error bars, and overlaid distribution plots that are presented in the main text (Sections 3 and 4). The central claim is now tied directly to these quantitative results rather than standing alone.
revision: yes

Referee: [Abstract] training is performed exclusively on events from the full ALEPH Monte Carlo simulation and reconstruction chain, yet the manuscript provides no comparison of generated distributions to real LEP data; this gap directly affects the claim of reproducing actual detector response rather than merely the simulation software.

Authors: This observation is correct. The present work validates the generative model against the established ALEPH Monte Carlo chain as a controlled benchmark; the original simulation itself was tuned and validated against real LEP data in the experiment's publications. We have revised the manuscript to clarify that Parnassus reproduces the simulated detector response and have added a brief discussion of the implications for legacy real-data re-analysis together with a note that direct real-data comparisons are left for future dedicated studies.
revision: partial

Circularity Check

0 steps flagged

No circularity in training-evaluation pipeline

The paper applies an existing generative model (Parnassus) to ALEPH by training exclusively on Monte Carlo events passed through the legacy full simulation and reconstruction chain, then evaluates fidelity via direct statistical comparisons to held-out samples from the same simulation at event, jet, and particle levels. This constitutes standard supervised distribution matching with no derivation chain, no fitted parameters renamed as predictions, no self-definitional equations, and no load-bearing self-citations that reduce the central claim to prior unverified work by the same authors. All reported results remain falsifiable against the external simulation benchmark and do not collapse to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The claim rests on the domain assumption that full simulation training data fully represents detector response; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: Simulated training events processed through the ALEPH detector simulation and reconstruction accurately represent the true detector response.
Note: the model is trained and evaluated entirely on this simulated data.
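One way to see why this assumption carries weight is sketched below; it is an editorial illustration, not a statement from the paper. The symbols are introduced here for illustration: $q_\theta$ is the distribution learned by the generative model, $p_{\mathrm{sim}}$ is the output distribution of the ALEPH simulation and reconstruction chain, $p_{\mathrm{det}}$ is the true detector response, and $d$ is any distance between distributions that satisfies the triangle inequality (total variation distance, for example).

```latex
d\bigl(q_\theta,\, p_{\mathrm{det}}\bigr)
\;\le\;
\underbrace{d\bigl(q_\theta,\, p_{\mathrm{sim}}\bigr)}_{\text{tested against held-out simulation}}
\;+\;
\underbrace{d\bigl(p_{\mathrm{sim}},\, p_{\mathrm{det}}\bigr)}_{\text{covered only by the axiom}}
```

Comparisons against held-out simulated events constrain only the first term. The second term, the mismatch between the legacy simulation and the real detector, is not probed by any simulation-only test and would have to be bounded separately with recorded data.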
