Differentiable Surrogate for Detector Simulation and Design with Diffusion Models
Pith reviewed 2026-05-16 15:20 UTC · model grok-4.3
The pith
A conditional diffusion model serves as a differentiable surrogate for electromagnetic calorimeter showers and reproduces gradient trends for detector design optimization.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A conditional denoising diffusion model, after GEANT4 pre-training and low-rank adaptation to new geometries, reproduces the qualitative structure and directional trends of the true utility landscape for a reconstruction-based utility function with respect to calorimeter design parameters, thereby providing usable gradient sensitivities while matching key physical observables to within 2 percent relative error.
What carries the argument
Conditional denoising diffusion model with DDIM sampling and low-rank adaptation for geometry transfer.
If this is right
- Detector design can shift from repeated full simulations to gradient-based optimization on the surrogate.
- New calorimeter geometries can be explored with only small additional training data after initial pre-training.
- The surrogate enables differentiable analysis of reconstruction performance as a function of detector parameters.
- Simulation-driven workflows gain access to gradient information without custom finite-difference implementations.
Where Pith is reading between the lines
- The same surrogate approach could support optimization loops that include both detector geometry and downstream reconstruction algorithms.
- Extending the conditioning to include more low-level shower features might further improve gradient fidelity beyond the current high-level observables.
- The method opens a path to hybrid simulation pipelines where diffusion models handle rare or high-dimensional shower components while preserving differentiability.
Load-bearing premise
Agreement on a small set of high-level observables such as total energy and shower dispersion is sufficient to keep gradients of the utility function accurate enough for reliable optimization.
What would settle it
A test case in which the surrogate gradient for a design parameter points in the opposite direction from the finite-difference gradient computed on the full GEANT4 simulator.
Figures
read the original abstract
In this work, we present a conditional denoising-diffusion surrogate for electromagnetic calorimeter showers that is trained to generate high-fidelity energy-deposition maps conditioned on key detector and beam parameters. The model employs efficient inference using Denoising Diffusion Implicit Model sampling and is pre-trained on GEANT4 simulations before being adapted to a new calorimeter geometry through Low-Rank Adaptation, requiring only a small post-training dataset. We evaluate physically meaningful observables, including total deposited energy, energy-weighted radius, and shower dispersion, obtaining relative root mean square error values below 2% for representative high-energy cases. This is in line with state-of-the-art calorimeter surrogates which report comparable fidelity on high-level observables. Furthermore, we compare gradients of a reconstruction-based utility function with respect to design parameters between the surrogate and finite-difference references. The diffusion surrogate reproduces the qualitative structure and directional trends of the true utility landscape, providing usable sensitivities for gradient-based optimization. These results show that diffusion-based surrogates can accelerate simulation-driven detector design while enabling differentiable, gradient-informed analysis.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a conditional denoising diffusion model as a surrogate for GEANT4 simulations of electromagnetic calorimeter showers. Conditioned on detector geometry and beam parameters, the model generates energy deposition maps via DDIM sampling, is pre-trained on GEANT4 data, and adapts to new geometries using LoRA with small additional datasets. It reports relative RMSE below 2% on high-level observables (total deposited energy, energy-weighted radius, shower dispersion) and claims that gradients of a reconstruction-based utility function w.r.t. design parameters qualitatively match finite-difference references from GEANT4, enabling usable sensitivities for gradient-based optimization.
Significance. If the gradient agreement proves quantitatively reliable, the work could meaningfully advance simulation-driven detector design by supplying fast, differentiable surrogates that support gradient-informed optimization without repeated full GEANT4 runs. The LoRA adaptation mechanism and emphasis on physically interpretable observables are positive elements that build on prior surrogate literature.
major comments (3)
- [Results (gradient comparison)] The central claim that the surrogate supplies 'usable sensitivities for gradient-based optimization' (abstract and conclusion) rests on qualitative reproduction of directional trends in the utility landscape. No quantitative metrics are reported for the gradient comparison, such as cosine similarity to finite-difference references, mean relative gradient error, or optimization convergence rates under the surrogate versus GEANT4. This is load-bearing because small biases in shower correlations from diffusion sampling could amplify under differentiation w.r.t. geometry parameters.
- [Evaluation / Experimental setup] The experimental section provides insufficient detail on train/validation/test splits, hyperparameter choices (diffusion noise schedule, number of steps, LoRA rank), and training procedure. Without these, the robustness of the reported <2% relative RMSE on held-out data cannot be assessed, weakening confidence in the fidelity claims.
- [Results / Related work] No baseline comparisons to other surrogate architectures (e.g., GANs or normalizing flows) are shown on the same observables or gradient task. This makes it difficult to isolate whether the diffusion approach offers advantages for preserving parameter sensitivities beyond the high-level observable matching.
minor comments (2)
- [Abstract] The abstract statement that results are 'in line with state-of-the-art calorimeter surrogates' should cite specific prior works and their reported error values for direct comparison.
- [Figures] The gradient comparison figure would be strengthened by adding quantitative annotations (e.g., similarity scores) or error quantification alongside the qualitative visual trends.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed review. The comments highlight important areas for strengthening the manuscript, particularly around quantitative validation of gradients, experimental reproducibility, and contextualization against alternative architectures. We address each major comment below and will incorporate revisions where they improve the clarity and rigor of the work without altering its core claims.
read point-by-point responses
-
Referee: The central claim that the surrogate supplies 'usable sensitivities for gradient-based optimization' rests on qualitative reproduction of directional trends. No quantitative metrics are reported for the gradient comparison, such as cosine similarity, mean relative gradient error, or optimization convergence rates.
Authors: We agree that quantitative metrics would strengthen the central claim and address potential concerns about error amplification under differentiation. In the revised manuscript we will add cosine similarity between surrogate and finite-difference gradients, mean relative gradient error across design parameters, and results from a small-scale optimization convergence comparison (surrogate vs. GEANT4) in the gradient analysis section. These additions will be supported by the existing data and require only post-processing. revision: yes
-
Referee: The experimental section provides insufficient detail on train/validation/test splits, hyperparameter choices (diffusion noise schedule, number of steps, LoRA rank), and training procedure.
Authors: We acknowledge this omission limits reproducibility. The revised experimental section will explicitly report the train/validation/test split (80/10/10), the linear noise schedule (beta from 1e-4 to 0.02 over 1000 steps), DDIM sampling steps (50), LoRA rank (8), learning rate (1e-4), batch size (32), and total training epochs for both pre-training and adaptation stages. revision: yes
-
Referee: No baseline comparisons to other surrogate architectures (e.g., GANs or normalizing flows) are shown on the same observables or gradient task.
Authors: We agree that direct head-to-head comparisons would help isolate the diffusion approach's advantages for preserving parameter sensitivities. However, training equivalent GAN and flow baselines on the identical dataset and gradient task would require substantial additional compute. In revision we will expand the related-work discussion to explain the rationale for focusing on diffusion (stable training, multimodal coverage) and cite prior calorimeter surrogate comparisons that include diffusion versus GAN results, while noting the absence of our own direct baselines as a limitation. revision: partial
Circularity Check
No circularity: external GEANT4 ground truth and held-out evaluation
full rationale
The paper trains its conditional diffusion surrogate directly on GEANT4 simulations as independent ground truth, pre-trains on one geometry and adapts via LoRA to another using a small external dataset. All reported observables (total energy, radius, dispersion) are evaluated with relative RMSE on held-out GEANT4 events, and gradient comparisons use finite-difference references computed from the true simulator. No equation or claim reduces the utility gradients or sensitivities to fitted parameters by construction, nor does any load-bearing step rely on self-citation chains or ansatz smuggling. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (2)
- diffusion noise schedule and number of steps
- LoRA rank and adaptation parameters
axioms (1)
- domain assumption GEANT4 Monte Carlo accurately represents real electromagnetic shower physics for the energies and materials considered
Forward citations
Cited by 2 Pith papers
-
Exploring the Boundaries of Differentiable Radiation Transport and Detector Simulation
Targeted halting of gradient flow at unstable material boundaries enables stable derivatives for optimizing detector designs in radiation transport simulations.
-
BRICKS: Compositional Neural Markov Kernels for Zero-Shot Radiation-Matter Simulation
BRICKS creates compositional neural Markov kernels via hybrid transformers and Riemannian Flow Matching on product manifolds to enable zero-shot simulation of radiation-matter interactions across arbitrary material di...
Reference graph
Works this paper leans on
-
[1]
S. Agostinelli, et al., Geant4 - a simulation toolkit, Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrom- eters, Detectors and Associated Equipment 506 (3) (2003) 250–303. doi:10.1016/S0168-9002(03)01368-8
-
[2]
A unified deep learning anomaly detection and classi fication approach for smart grid environments,
J. Allison, et al., Geant4 developments and applications, IEEE Trans- actions on Nuclear Science 53 (1) (2006) 270–278.doi:10.1109/TNS. 2006.869826
work page doi:10.1109/tns 2006
-
[3]
Recent developments in G EANT 4
J. Allison, et al., Recent developments in geant4, Nuclear Instruments and Methods in Physics Research Section A 835 (2016) 186–225.doi: 10.1016/j.nima.2016.06.125
-
[4]
T. Dorigo, et al., Toward the end-to-end optimization of particle physics instruments with differentiable programming, Reviews in Physics 10 (2023) 100085.doi:10.1016/j.revip.2023.100085
-
[5]
M. Aehle, et al., Progress in end-to-end optimization of fundamen- tal physics experimental apparata with differentiable programming, Reviews in Physics 12 (2025) 100120.doi:10.1016/j.revip.2025. 100120
-
[6]
A. De Vita, Abhishek, M. Aehle, M. Awais, A. Breccia, R. Carroccio, L. Chen, T. Dorigo, N. R. Gauger, R. Keidel, J. Kieseler, E. Lupi, F. Nardi, X. T. Nguyen, F. Sandin, K. Schmidt, P. Vischia, J. Will- more, Hadron identification prospects with granular calorimeters, Par- ticles 8 (2) (2025) 58, special Issue: Selected Papers from the 4th MODE Workshop o...
-
[7]
A. Adelmann, W. Hopkins, E. Kourlitis, M. Kagan, G. Kasieczka, C. Krause, D. Shih, V. Mikuni, B. Nachman, K. Pedro, D. Winklehner, New directions for surrogate models and differentiable programming for high energy physics detector simulation (2022).arXiv:2203.08806
-
[8]
A. I. J. Forrester, Black-box calibration for complex-system simula- tion, Philosophical Transactions of the Royal Society A: Mathemati- cal, Physical and Engineering Sciences 368 (1924) (2010) 3567–3579. doi:10.1098/rsta.2010.0051. 38
-
[9]
Deep Unsupervised Learning using Nonequilibrium Thermodynamics
J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, S. Ganguli, Deep unsupervised learning using nonequilibrium thermodynamics, in: Inter- national Conference on Machine Learning (ICML), 2015, pp. 2256–2265. doi:https://doi.org/10.48550/arXiv.1503.03585
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1503.03585 2015
-
[10]
J. Ho, A. Jain, P. Abbeel, Denoising diffusion probabilistic models, Ad- vances in neural information processing systems 33 (2020) 6840–6851. doi:https://doi.org/10.48550/arXiv.2006.11239
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2006.11239 2020
-
[11]
P. Dhariwal, A. Nichol, Diffusion models beat gans on image synthesis, in: Advances in Neural Information Processing Systems (NeurIPS 2021), 2021, pp. 8780–8794
work page 2021
-
[12]
V. Mikuni, B. Nachman, Score-based generative models for calorimeter shower simulation, Physical Review D 106 (9) (2022) 092009.doi: 10.1103/PhysRevD.106.092009
-
[13]
E. Buhmann, S. Diefenbacher, E. Eren, F. Gaede, G. Kasicezka, A. Ko- rol, W. Korcari, K. Krüger, P. McKeown, Caloclouds: Fast geometry- independent highly-granular calorimeter simulation, Journal of Instru- mentation 18 (2023) P11025.doi:10.1088/1748-0221/18/11/P11025
-
[14]
O. Amram, K. Pedro, Denoising diffusion models with geometry adapta- tion for high fidelity calorimeter simulation, Physical Review D 108 (7) (2023) 072014.doi:10.1103/PhysRevD.108.072014
-
[15]
L. Favaro, A. Ore, S. Palacios Schweitzer, T. Plehn, Calodream – detec- tor response emulation via attentive flow matching, SciPost Physics 18 (2025) 088.doi:10.21468/SciPostPhys.18.3.088
-
[16]
P. Raikwar, A. Zaborowska, P. McKeown, R. Cardoso, M. Piorczynski, K. Yeo, A generalisable generative model for multi-detector calorimeter simulation (2025).arXiv:2509.07700
- [17]
-
[18]
E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, W. Chen, Lora: Low-rank adaptation of large language models (2021). arXiv:2106.09685. 39
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[19]
X. T. Nguyen, et al., Differentiable modeling for calorimeter simulation using diffusion models, in: Fifth MODE Workshop on Differentiable Programming for Experimental Design, Crete, GR, 2025. URLhttps://indi.to/5SxXY
work page 2025
-
[20]
J. Song, C. Meng, S. Ermon, Denoising diffusion implicit models, in: International Conference on Learning Representations (ICLR), 2021. arXiv:2010.02502
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[21]
R. Girshick, J. Donahue, T. Darrell, J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation (2014).doi: 10.1109/CVPR.2014.81
-
[22]
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, Y. LeCun, Overfeat: Integrated recognition, localization and detection using con- volutional networks, arXiv preprint (2013).arXiv:1312.6229
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[23]
K. He, R. Girshick, P. Dollár, Rethinking imagenet pre-training, in: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 4918–4927.doi:10.1109/ICCV.2019.00502
-
[24]
BERT : Pre-training of Deep Bidirectional Transformers for Language Understanding
J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, Bert: Pre-training of deep bidirectional transformers for language understanding, in: Pro- ceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT), 2019, pp. 4171–4186.doi:10.18653/v1/N19-1423
-
[25]
N. Houlsby, A. Giurgiu, S. Jastrzebski, B. Morrone, Q. de Laroussilhe, A. Gesmundo, M. Attariyan, S. Gelly, Parameter-efficient transfer learn- ing for nlp, in: Proceedings of the 36th International Conference on Machine Learning (ICML), Vol. 97 of Proceedings of Machine Learning Research, PMLR, 2019, pp. 2790–2799
work page 2019
-
[26]
K. He, G. Gkioxari, P. Dollár, R. Girshick, Mask r-cnn, in: Proceed- ings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2961–2969.doi:10.1109/ICCV.2017.322
-
[27]
D. Mahajan, R. Girshick, V. Ramanathan, K. He, M. Paluri, Y. Li, A. Bharambe, L. van der Maaten, Exploring the limits of weakly supervised pretraining, in: Proceedings of the European Conference 40 on Computer Vision (ECCV), 2018, pp. 185–201.doi:10.1007/ 978-3-030-01216-8_12
work page 2018
-
[28]
C. Sun, A. Shrivastava, S. Singh, A. Gupta, Revisiting the unreasonable effectiveness of data in deep learning era, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 843– 852.doi:10.1109/ICCV.2017.97
-
[29]
Generative Adversarial Networks
I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio, Generative adversarial nets, in: Ad- vances in Neural Information Processing Systems, 2014, nIPS 2014. doi:10.48550/arXiv.1406.2661
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1406.2661 2014
-
[30]
M. Paganini, L. de Oliveira, B. Nachman, Calogan: Simulating 3d high energy particle showers in multi-layer electromagnetic calorimeters with generative adversarial networks, Phys. Rev. D 97 (1) (2018) 014021. doi:10.1103/PhysRevD.97.014021
-
[31]
C. Krause, D. Shih, Caloflow: Fast and accurate generation of calorime- ter showers with normalizing flows, Physical Review D 107 (2023) 113003.doi:10.1103/PhysRevD.107.113003
-
[32]
C. Krause, D. Shih, Caloflow ii: Even faster and still accurate generation of calorimeter showers with normalizing flows, arXiv preprint, v2 (2021). arXiv:2110.11377
-
[33]
K. Schmidt, K. N. Kota, J. Kieseler, A. De Vita, M. Klute, Abhishek, M. Aehle, M. Awais, A. Breccia, R. Carroccio, L. Chen, T. Dorigo, N. R. Gauger, E. Lupi, F. Nardi, X. T. Nguyen, F. Sandin, J. Willmore, P. Vischia, End-to-end detector optimization with diffusion models: A case study in sampling calorimeters, Particles 8 (2) (2025) 47.doi: 10.3390/parti...
-
[34]
M. Aehle, J. Blühdorn, M. Sagebaum, N. R. Gauger, Forward-mode automatic differentiation of compiled programs, ACM Transactions on Mathematical Software 51 (2) (2025) 1–25.doi:10.1145/3716309
- [35]
-
[36]
G. C. Strong, M. Lagrange, A. Orio, A. Bordignon, F. Bury, T. Dorigo, A. Giammanco, M. Heikal, J. Kieseler, M. Lamparth, P. Martínez Ruíz del Árbol, F. Nardi, P. Vischia, H. Zaraket, Tomopt: Differential optimisation for task and constraint aware design of particle detectors in the context of muon tomography, Machine Learning: Science and Technology (2024...
- [37]
-
[38]
M. Aehle, M. Novák, V. Vassilev, N. R. Gauger, L. Heinrich, M. Ka- gan, D. Lange, Optimization using pathwise algorithmic derivatives of electromagnetic shower simulations, Computer Physics Communications 297 (2024) 109491.doi:10.1016/j.cpc.2024.109491
-
[39]
O. Ronneberger, P. Fischer, T. Brox, U-net: Convolutional networks for biomedical image segmentation, in: International Conference on Medi- cal Image Computing and Computer-Assisted Intervention (MICCAI), Springer, 2015, pp. 234–241.doi:10.1007/978-3-319-24574-4_28
-
[40]
Bartosik, et al., Simulated detector performance at the muon collider (2022).arXiv:2203.07964
N. Bartosik, et al., Simulated detector performance at the muon collider (2022).arXiv:2203.07964
-
[41]
Accettura et al., Towards a muon collider , Eur
C. Accettura, et al., Towards a muon collider, European Physical Jour- nal C 83 (2023) 864.doi:10.1140/epjc/s10052-023-11889-x
-
[42]
N. Bartosik, A. Bertolin, L. Buonincontri, M. Casarsa, F. Collamati, A. Ferrari, A. Ferrari, A. Gianelle, D. Lucchesi, N. Mokhov, M. Palmer, N. Pastrone, P. Sala, L. Sestini, S. Striganov, Detector and physics per- formance at a muon collider, Journal of Instrumentation 15 (05) (2020) P05001.doi:10.1088/1748-0221/15/05/P05001
-
[43]
N. V. Mokhov, S. I. Striganov, Detector backgrounds at muon colliders, Physics Procedia 37 (2012) 2015–2022.doi:10.1016/j.phpro.2012. 03.761
-
[44]
F. Collamati, C. Curatolo, D. Lucchesi, A. Mereghetti, N. Mokhov, M. Palmer, P. Sala, Advanced assessment of beam induced background at a muon collider, Journal of Instrumentation 16 (11) (2021) P11009. arXiv:2105.09116,doi:10.1088/1748-0221/16/11/P11009. 42
-
[45]
S. Ceravolo, F. Colao, C. Curatolo, E. Di Meco, E. Diociaiuti, D. Luc- chesi, D. Paesani, N. Pastrone, G. Pezzullo, A. Saputi, I. Sarra, L. Sestini, D. Tagnani, Crilin: A semi-homogeneous calorimeter for a future muon collider, Instruments 6 (4) (2022) 62.doi:10.3390/ instruments6040062
work page 2022
-
[46]
A. Ferrari, P. R. Sala, A. Fassò, J. Ranft, Fluka: A multi-particle trans- port code (program version 2005), CERN Yellow Reports: Monographs CERN-2005-010, CERN (2005).doi:10.5170/CERN-2005-010
-
[47]
N. V. Mokhov, C. C. James, The mars code system user’s guide — version 15 (2016), Tech. Rep. FERMILAB-FN-1058-APC, Fermilab, mARS15 user guide (2017).doi:10.2172/1462233. 43
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.