
arxiv: 2509.12012 · v2 · submitted 2025-09-15 · ✦ hep-ex

DeepMET: Improving missing transverse momentum estimation with a deep neural network

Pith reviewed 2026-05-18 16:12 UTC · model grok-4.3

classification ✦ hep-ex
keywords: missing transverse momentum, deep neural network, CMS, LHC, pile-up, resolution improvement, particle physics analysis

The pith

DeepMET uses a neural network to weight particles and improve missing transverse momentum resolution by 10-30%

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

At the LHC, missing transverse momentum is essential for studying neutrinos and potential new particles, such as dark matter candidates, that escape detection. DeepMET applies a deep neural network to assign a weight to each reconstructed particle based on its properties. The missing transverse momentum is then estimated as the negative vector sum of the weighted transverse momenta. According to the paper, this improves resolution by 10-30% over the estimators currently used by CMS, holds across a wide range of final states, is easier to train, and is more resilient to pile-up (additional proton-proton interactions accompanying the collision of interest).

Core claim

The paper claims that a deep neural network can be trained to produce weights for each reconstructed particle, allowing the missing transverse momentum to be computed as the negative sum of the weighted particle momenta. This yields an estimator that improves resolution by 10-30% compared to other methods in use by CMS, with benefits across diverse final states and enhanced resilience to pile-up.
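The weighted negative sum described above can be illustrated with a minimal NumPy sketch. The function name `weighted_met` and the toy momenta and weights are hypothetical and chosen only for illustration; in DeepMET the weights would come from the trained network.

```python
import numpy as np

def weighted_met(px, py, weights):
    """Return (met_x, met_y, met_pt) as the negative weighted vector sum."""
    px = np.asarray(px, dtype=float)
    py = np.asarray(py, dtype=float)
    w = np.asarray(weights, dtype=float)
    met_x = -np.sum(w * px)
    met_y = -np.sum(w * py)
    return met_x, met_y, np.hypot(met_x, met_y)

# Toy event with three reconstructed particles (momenta in GeV, made up).
px = np.array([30.0, -10.0, -5.0])
py = np.array([0.0, 20.0, -5.0])

# Unit weights reproduce a plain negative vector sum over all particles.
met_unit = weighted_met(px, py, np.ones(3))

# Hypothetical learned weights: the last particle is down-weighted to zero,
# as a network might do for a particle it attributes to pile-up.
met_learned = weighted_met(px, py, np.array([1.0, 1.0, 0.0]))
```

With all weights equal to one the estimator reduces to an unweighted negative sum; the learned weights are what distinguish the DeepMET estimate from it.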

What carries the argument

DeepMET, a deep neural network that outputs weights for reconstructed particles to enable a weighted negative sum for pTmiss estimation
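As a rough illustration of how per-particle weights could be produced, the sketch below follows the Figure 1 caption: 11 input features per particle, of which three are categorical and each embedded into 8 dimensions. Everything else is an assumption made for the example, not the paper's configuration: the category counts, the single hidden layer and its width, the random (untrained) parameters, and the choice to apply the same small network to every particle independently.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CONT = 8          # 11 input features minus the 3 categorical ones
EMB_DIM = 8         # embedding size quoted in the Figure 1 caption
VOCAB = (4, 8, 3)   # hypothetical category counts for the categorical features
HIDDEN = 16         # hypothetical hidden-layer width

# Randomly initialised parameters stand in for trained ones.
embeddings = [rng.normal(scale=0.1, size=(v, EMB_DIM)) for v in VOCAB]
W1 = rng.normal(scale=0.1, size=(N_CONT + 3 * EMB_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, 1))
b2 = np.zeros(1)

def particle_weights(cont, cat):
    """cont: (N, 8) floats; cat: (N, 3) integer indices. Returns (N,) weights."""
    # Look up one 8-dimensional embedding vector per categorical feature.
    embedded = [table[cat[:, k]] for k, table in enumerate(embeddings)]
    x = np.concatenate([cont] + embedded, axis=1)   # shape (N, 8 + 24)
    h = np.maximum(x @ W1 + b1, 0.0)                # ReLU hidden layer
    return (h @ W2 + b2)[:, 0]                      # one weight per particle

n_particles = 5
cont = rng.normal(size=(n_particles, N_CONT))
cat = np.column_stack([rng.integers(0, v, size=n_particles) for v in VOCAB])
w = particle_weights(cont, cat)
```

Because the same parameters are applied to each particle in this sketch, reordering the particles simply reorders the weights, so the resulting weighted sum does not depend on particle order.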

Load-bearing premise

Monte Carlo simulations accurately reproduce the detector response, particle identification, and pile-up conditions in real LHC data.

What would settle it

Measuring the pTmiss resolution in real collision data and finding no significant improvement over traditional methods would falsify the performance claims.

Figures

Figures reproduced from arXiv: 2509.12012 by CMS Collaboration.

Figure 1. The DNN architecture. The inputs are the 11 features of each input PF candidate. After the input layer, each of the three categorical features goes through one embedding layer and becomes an 8-dimensional tensor, so that the feature space can be learned and the differences can be represented by distances in the embedded space.
Figure 2. Recoil responses of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators in data (markers) and MC simulations (dashed) after the Z → μμ selections. To properly account for the response effect on the resolutions, for each $\vec{p}_\mathrm{T}^\text{miss}$ estimator, the average response $-\langle u_\parallel \rangle / \langle q_\mathrm{T} \rangle$ is calculated using the events in the response plateau region (i.e., $q_\mathrm{T} > 150$ GeV), and the resolutions are scaled by the inverse of this average response, …
Figure 3. Response-corrected resolutions of $u_\parallel$ (left) and $u_\perp$ (right) vs. $q_\mathrm{T}$ of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators in data after the Z → μμ selections.
Figure 4. Response-corrected resolutions of $u_\parallel$ (left) and $u_\perp$ (right) vs. number of reconstructed PVs of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators in data (solid) and MC simulations (dashed) after the Z → μμ selections. The systematic uncertainties for PUPPI $\vec{p}_\mathrm{T}^\text{miss}$ due to the JES, the JER, and EU are added in quadrature and displayed with the gray band.
Figure 5. Response-corrected resolutions of $u_\parallel$ (left) and $u_\perp$ (right) vs. number of reconstructed PVs of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators in data after the Z → μμ selections in the region with $q_\mathrm{T}$ smaller than 50 GeV.
Figure 6. Response-corrected resolutions of $u_\parallel$ (left) and $u_\perp$ (right) vs. number of reconstructed PVs of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators in data after the Z → μμ selections in the region with $q_\mathrm{T}$ larger than 50 GeV.
Figure 7. Response (upper), response-corrected resolutions of …
Figure 8. Distributions of $p_\mathrm{T}^\text{miss}$ (left) and $m_\mathrm{T}$ (right) of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators in data after W → μν selections. … for low amounts of hard hadronic activity, indicating that DeepMET particularly helps with the softer contributions to $\vec{p}_\mathrm{T}^\text{miss}$. Section 8, "The DeepMET calibrations": The DeepMET calibrations can be derived from Z+jets events by matching the performance in MC simulations to data. These corrections can …
Figure 9. Comparison of $p_\mathrm{T}^\text{miss}$ resolution for various physics processes in simulated events. The considered processes are HH production via gluon fusion with HH → bbττ (upper left), H production via vector boson fusion with H → invisible (upper right), ttH production with either H → bb (middle left) or H → μμ (middle right), the SMS T2b-4bd process (lower left), and the SMS TChiZZ process (lower right).
Figure 10. The ϕ distribution of different $\vec{p}_\mathrm{T}^\text{miss}$ estimators before (left) and after (right) the xy corrections, in data (markers) and MC simulations (dashed) after the Z → μμ selections. In MC simulation, the $u_\parallel$ and $u_\perp$ CDF values in the $i$th $q_\mathrm{T}$ bin can be found from: $p_\parallel = \mathrm{CDF}_i^{u_\parallel,\mathrm{MC}}(u_\parallel^\mathrm{MC})$, $p_\perp = \mathrm{CDF}_i^{u_\perp,\mathrm{MC}}(u_\perp^\mathrm{MC})$. (4) In data, the corresponding $u_\parallel$ and $u_\perp$ CDF in the same $q_\mathrm{T}$ bin is given by: $u_\parallel^\mathrm{data}$ = (CDF …
Figure 11. Data-to-simulation comparisons of DeepMET $p_\mathrm{T}^\text{miss}$ (upper left), recoil $p_\mathrm{T}$ (upper right), $u_\parallel$ (lower left), and $u_\perp$ (lower right) after the quantile correction. The underflow (overflow) contents are included in the first (last) bin. The gray band represents the systematic uncertainties discussed in Section 8.4.
Figure 12. Response (upper) and response-corrected resolutions of …
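The text accompanying Figure 10 gives the simulation-side step of a quantile correction (evaluate the MC CDF at the simulated recoil value) and is cut off before the data-side step. The sketch below assumes the usual quantile-mapping form, in which the corrected value is the inverse data CDF evaluated at that probability. This completion is an assumption, not a reconstruction of the paper's equation. The function name `quantile_map` and the Gaussian toy samples are hypothetical, and the sketch treats a single $q_\mathrm{T}$ bin with empirical CDFs.

```python
import numpy as np

def quantile_map(u_mc, mc_ref, data_ref):
    """Map simulated values onto the data distribution via matching quantiles."""
    mc_sorted = np.sort(np.asarray(mc_ref, dtype=float))
    data_sorted = np.sort(np.asarray(data_ref, dtype=float))
    # Step 1: empirical MC CDF evaluated at each simulated value, p = CDF_MC(u).
    p = np.searchsorted(mc_sorted, u_mc, side="right") / mc_sorted.size
    p = np.clip(p, 0.0, 1.0)
    # Step 2 (assumed form): empirical inverse data CDF evaluated at p.
    return np.quantile(data_sorted, p)

# Toy reference samples for one q_T bin; the "data" are shifted and wider.
rng = np.random.default_rng(1)
mc_ref = rng.normal(0.0, 10.0, size=20_000)
data_ref = rng.normal(1.0, 12.0, size=20_000)

corrected = quantile_map(mc_ref, mc_ref, data_ref)
```

After the mapping, the corrected simulated values follow the data reference distribution in this toy bin, while the ordering of events is preserved because both steps are monotonic.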
Original abstract

At hadron colliders, the net transverse momentum of particles that do not interact with the detector (missing transverse momentum, $\vec{p}_\mathrm{T}^\text{miss}$) is a crucial observable in many analyses. In the standard model, $\vec{p}_\mathrm{T}^\text{miss}$ originates from neutrinos. Many beyond-the-standard-model particles, such as dark matter candidates, are also expected to leave the experimental apparatus undetected. This paper presents a novel $\vec{p}_\mathrm{T}^\text{miss}$ estimator, DeepMET, which is based on deep neural networks that were developed by the CMS Collaboration at the LHC. The DeepMET algorithm produces a weight for each reconstructed particle based on its properties. The estimator is based on the negative vector sum of the weighted transverse momenta of all reconstructed particles in an event. Compared with other estimators currently employed by CMS, DeepMET improves the $\vec{p}_\mathrm{T}^\text{miss}$ resolution by 10$-$30%, shows improvement for a wide range of final states, is easier to train, and is more resilient against the effects of additional proton-proton interactions accompanying the collision of interest.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents DeepMET, a deep neural network developed by CMS to estimate missing transverse momentum (p_T^miss) at the LHC. The algorithm assigns a learned weight to each reconstructed particle based on its properties and defines the estimator as the negative vector sum of the weighted transverse momenta. Using Monte Carlo simulations with generator-level invisible momentum as the training target, the paper reports that DeepMET improves p_T^miss resolution by 10-30% relative to existing CMS estimators, performs better across a range of final states, is easier to train, and shows greater resilience to pile-up.

Significance. If the reported resolution gains and pile-up resilience translate from simulation to collision data, DeepMET would represent a practical and deployable improvement for a core experimental observable used in many SM and BSM analyses. The per-particle weighting approach is a straightforward extension of existing techniques and could be adopted relatively quickly by the collaboration.

major comments (2)
  1. [Abstract] Abstract and results sections: The central claim of a 10-30% resolution improvement is presented without quantitative details on the exact resolution metric (e.g., RMS, 68% containment, or sigma of the response), the precise baseline estimators being compared, or the composition of the training/validation samples. This information is load-bearing for evaluating whether the numerical gains are robust or depend on particular event topologies.
  2. [Results] Results and validation sections: No data-MC closure test or performance comparison in data control regions with known true p_T^miss (such as Z→μμ or γ+jet events) is described. Because the network is trained exclusively on simulated events, the translation of the claimed gains to real collision data rests on the untested assumption that MC modeling of detector response, particle ID efficiencies, and pile-up is sufficiently accurate; this is a load-bearing point for the paper's applicability claim.
minor comments (2)
  1. [Abstract] The abstract states that DeepMET 'is easier to train' without specifying the training procedure, loss function, or hyperparameter choices relative to prior CMS MET estimators; a brief comparison table would improve clarity.
  2. [Methods] Notation for the weighted sum estimator should be defined explicitly with an equation number in the methods section to avoid ambiguity with standard p_T^miss definitions.
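On the question of the resolution metric, the Figure 2 caption describes an average response $-\langle u_\parallel \rangle / \langle q_\mathrm{T} \rangle$ and resolutions scaled by its inverse. A minimal sketch of that bookkeeping follows, under stated assumptions: the recoil is taken as $\vec{u} = -(\vec{q}_\mathrm{T} + \vec{p}_\mathrm{T}^\text{miss})$, the resolution is taken as the standard deviation of each recoil component, and the response is averaged over all toy events rather than over a plateau region. The function names and the toy numbers (boson $q_\mathrm{T}$ of 100 GeV, true response 0.9, 10 GeV smearing) are hypothetical.

```python
import numpy as np

def recoil_components(q_x, q_y, met_x, met_y):
    """Split the recoil into components parallel and perpendicular to q_T."""
    # Assumed convention: q + u + met = 0 in the transverse plane.
    u_x = -(q_x + met_x)
    u_y = -(q_y + met_y)
    q_t = np.hypot(q_x, q_y)
    u_par = (u_x * q_x + u_y * q_y) / q_t
    u_perp = (u_y * q_x - u_x * q_y) / q_t
    return u_par, u_perp, q_t

def response_corrected_resolution(u_par, u_perp, q_t):
    """Average response and resolutions scaled by the inverse response."""
    response = -np.mean(u_par) / np.mean(q_t)
    sigma_par = np.std(u_par) / response
    sigma_perp = np.std(u_perp) / response
    return response, sigma_par, sigma_perp

# Toy events: the recoil balances 90% of the boson momentum, plus smearing.
rng = np.random.default_rng(2)
n = 50_000
phi = rng.uniform(-np.pi, np.pi, size=n)
q_x, q_y = 100.0 * np.cos(phi), 100.0 * np.sin(phi)
u_x = -0.9 * q_x + rng.normal(0.0, 10.0, size=n)
u_y = -0.9 * q_y + rng.normal(0.0, 10.0, size=n)
met_x, met_y = -(q_x + u_x), -(q_y + u_y)

u_par, u_perp, q_t = recoil_components(q_x, q_y, met_x, met_y)
response, sigma_par, sigma_perp = response_corrected_resolution(u_par, u_perp, q_t)
```

Dividing by the response keeps an estimator from looking artificially precise merely because it underestimates the recoil scale, which is why the comparison between estimators is made on response-corrected resolutions.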

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript describing DeepMET. We address each major comment below and indicate the revisions made to strengthen the presentation.

Point-by-point responses
  1. Referee: [Abstract] Abstract and results sections: The central claim of a 10-30% resolution improvement is presented without quantitative details on the exact resolution metric (e.g., RMS, 68% containment, or sigma of the response), the precise baseline estimators being compared, or the composition of the training/validation samples. This information is load-bearing for evaluating whether the numerical gains are robust or depend on particular event topologies.

    Authors: We agree that explicit quantitative details strengthen the central claim. In the revised manuscript we have expanded both the abstract and the results section to specify that resolution is quantified via the RMS of the p_T^miss response distribution, that the baselines are the standard CMS particle-flow and PUPPI estimators, and that the training and validation samples comprise simulated ttbar, Z+jets, and W+jets events generated with a range of pile-up conditions. A new table now reports the per-final-state improvements.
    Revision: yes

  2. Referee: [Results] Results and validation sections: No data-MC closure test or performance comparison in data control regions with known true p_T^miss (such as Z→μμ or γ+jet events) is described. Because the network is trained exclusively on simulated events, the translation of the claimed gains to real collision data rests on the untested assumption that MC modeling of detector response, particle ID efficiencies, and pile-up is sufficiently accurate; this is a load-bearing point for the paper's applicability claim.

    Authors: We acknowledge that direct validation in data control regions would provide additional confidence for deployment. The present work focuses on Monte Carlo studies in which the training target is generator-level invisible momentum, allowing controlled evaluation of resolution and pile-up resilience. We have added an explicit discussion of the modeling assumptions and a forward-looking statement that data-MC comparisons in Z→μμ and γ+jet regions are planned for a follow-up analysis. The simulation results remain a necessary first step for algorithm development.
    Revision: partial

Circularity Check

0 steps flagged

No circularity: DeepMET resolution gains are measured against generator-level truth on held-out MC samples

Full rationale

The paper trains a DNN to assign per-particle weights whose negative vector sum approximates missing transverse momentum, using generator-level invisible momentum as the explicit training target on simulated events. Resolution improvements (10-30%) are then quantified by direct comparison of the resulting estimator to the same generator-level truth on independent simulated samples. This is a standard supervised learning workflow with external benchmark; the reported metrics do not reduce to a fitted parameter renamed as a prediction, nor to any self-referential definition or self-citation chain. The derivation chain is therefore self-contained against the MC truth benchmark and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The performance claims rest on the fidelity of Monte Carlo simulations and on the neural network's ability to generalize from training to real data; no new physical entities are postulated.

free parameters (1)
  • Neural network weights and biases
    The parameters of the deep neural network are determined by training on simulated events to optimize the missing-momentum resolution or a related loss function.
axioms (1)
  • domain assumption: Monte Carlo simulations accurately model real detector response, particle reconstruction, and pile-up conditions
    Standard assumption in LHC machine-learning applications; the network is trained and validated exclusively on these simulations.

pith-pipeline@v0.9.0 · 5722 in / 1330 out tokens · 48044 ms · 2026-05-18T16:12:45.640087+00:00 · methodology



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Uncovering Hidden Systematics in Neural Network Models for High Energy Physics

    cs.LG · 2026-05 · unverdicted · novelty 6.0

    Neural networks for HEP tasks can be fooled at significant rates by subtle perturbations inside uncertainty envelopes, revealing hidden systematics not captured by conventional methods.
