Deep Neural Networks for Heavy Lepton-Flavor-Violating Higgs Searches at the LHC
Pith reviewed 2026-05-22 06:17 UTC · model grok-4.3
The pith
Deep neural networks trained on kinematic variables reduce expected upper limits on heavy LFV Higgs cross sections by 36-46% compared to the collinear mass baseline.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors recast a CMS search for H to mu tau in the Type-III 2HDM using fast simulation and show that a DNN classifier with optimized thresholds reduces the expected 95% CL upper limits on the signal cross section by 42-46% in the 0-jet channel and 36-40% in the 1-jet channel compared to the M_col baseline. They identify m_vis as a key feature via SHAP analysis and demonstrate that a mass-dependent pre-selection on m_vis improves sensitivity, while a DNN regression corrects the collinear approximation bias and enhances mass resolution.
What carries the argument
A deep neural network classifier trained on final-state kinematic variables, combined with mass-dependent threshold optimization, that discriminates signal from background more effectively than the collinear mass M_col.
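For orientation, the collinear mass baseline can be sketched as follows. The page does not spell out the formula, so this assumes the construction commonly used in H → μτ searches: the missing transverse momentum is projected onto the visible τ direction to estimate the neutrino momentum, and the visible mass is rescaled by the resulting visible momentum fraction. The function and argument names are illustrative, not taken from the paper.

```python
import math

def collinear_mass(m_vis, tau_pt, tau_phi, met, met_phi):
    """Collinear mass M_col = m_vis / sqrt(x_vis) (assumed standard definition).

    m_vis   : invariant mass of the muon and the visible tau decay products
    tau_pt  : transverse momentum of the visible tau decay products
    tau_phi : azimuthal angle of the visible tau decay products
    met     : magnitude of the missing transverse momentum
    met_phi : azimuthal angle of the missing transverse momentum
    """
    # Neutrino pT estimate: MET component along the visible tau direction.
    nu_pt_est = met * math.cos(met_phi - tau_phi)
    denom = tau_pt + nu_pt_est
    if denom <= 0.0:
        # The collinear approximation has no physical solution here.
        return float("nan")
    # Fraction of the tau momentum carried by its visible decay products.
    x_vis = tau_pt / denom
    return m_vis / math.sqrt(x_vis)
```

When the missing momentum points along the visible τ, the estimate exceeds m_vis; when it points the opposite way strongly enough, the function returns NaN, which is one way the approximation can fail and a plausible source of the bias that the regression model is said to correct.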
If this is right
- The expected 95% CL upper limits on signal cross sections tighten by 36-46% depending on jet multiplicity.
- The visible mass m_vis emerges as one of the dominant discriminating variables, reflecting the characteristic neutrino momentum fraction of the τ decay.
- A simplified pre-selection m_vis < f * m_H, with f = 0.7 (0-jet) and f = 0.8 (1-jet), improves sensitivity over M_col alone.
- The DNN regression model predicts the ratio m_H/M_col, keeping the absolute mass prediction error below 1 GeV for signals up to 400 GeV and improving the mass resolution by 12% (0-jet) and 21% (1-jet) at m_H = 450 GeV.
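The simplified pre-selection is a single cut per mass hypothesis, which can be sketched as below. The values of f are those quoted in the abstract; the dictionary layout, channel labels, and function name are assumptions made for illustration.

```python
import numpy as np

# f values quoted in the abstract; the keys are illustrative channel labels.
F_BY_CHANNEL = {"0jet": 0.7, "1jet": 0.8}

def passes_mvis_preselection(m_vis, m_h, channel):
    """Boolean mask selecting events with m_vis < f * m_H.

    m_vis   : array-like of visible masses in GeV
    m_h     : heavy Higgs mass hypothesis in GeV
    channel : "0jet" or "1jet"
    """
    f = F_BY_CHANNEL[channel]
    return np.asarray(m_vis, dtype=float) < f * m_h
```

The mask would be applied before building the M_col distribution for each mass point, so no multivariate tooling is needed.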
Where Pith is reading between the lines
- Applying this DNN approach to full detector simulations or real data could further validate or enhance the gains seen in fast simulation.
- Similar machine learning techniques might benefit other flavor-violating or rare Higgs decay searches at the LHC.
- Combining the simplified pre-selection with the full DNN could offer an accessible way for experiments to improve limits without major computational overhead.
Load-bearing premise
The fast detector simulation and training samples accurately capture the kinematic distributions and background compositions present in real LHC collision data for the H to mu tau channel.
What would settle it
Applying the DNN analysis to actual LHC collision data and finding no improvement in expected limits due to mismatches in real detector response or background modeling.
Original abstract
We study lepton-flavor-violating (LFV) decays of a heavy Higgs boson, $H \to \mu\tau$, in the Type-III two-Higgs-doublet model by recasting the CMS search at $\sqrt{s} = 13$ TeV with 35.9 fb$^{-1}$ using fast detector simulation in the mass range 200-450 GeV. We develop a deep neural network (DNN) classifier trained on final-state kinematic variables that, with mass-dependent threshold optimization, reduces the expected 95% CL upper limits on the signal cross section by 42-46% in the 0-jet channel and 36-40% in the 1-jet channel relative to the standard collinear mass ($M_\mathrm{col}$) baseline. We apply SHAP interpretability analysis to identify the visible mass $m_\mathrm{vis}$ as one of the dominant discriminating features, reflecting the characteristic neutrino momentum fraction of the $\tau$ decay. We show that supplementing the $M_\mathrm{col}$ analysis with a simplified mass-dependent pre-selection, $m_\mathrm{vis} < f \cdot m_H$ with $f = 0.7$ (0-jet) and $f = 0.8$ (1-jet), consistently improves the sensitivity over the $M_\mathrm{col}$-only baseline without requiring multivariate infrastructure. In addition, a DNN regression model trained to predict the ratio $m_H/M_\mathrm{col}$ corrects the systematic prediction bias inherent in the collinear approximation, maintaining an absolute mass prediction error below 1 GeV for signals up to 400 GeV and improving the mass resolution by 12% (0-jet) and 21% (1-jet) at $m_H = 450$ GeV. These results demonstrate a clear path toward significantly enhanced sensitivity in LFV Higgs searches at the LHC.
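The regression step described in the abstract predicts the ratio m_H/M_col, so the corrected mass is that ratio multiplied by M_col. The sketch below shows only this bookkeeping and a simple way to evaluate bias and resolution; it does not train a network. It assumes bias is the mean offset from the true mass and resolution is the standard deviation of the estimate, which may differ from the paper's definitions. All numbers are made up for illustration.

```python
import numpy as np

def corrected_mass(m_col, predicted_ratio):
    """Mass estimate from a predicted ratio r ~ m_H / M_col."""
    return np.asarray(m_col, dtype=float) * np.asarray(predicted_ratio, dtype=float)

def bias_and_resolution(mass_estimate, m_h_true):
    """Return (mean offset from the true mass, standard deviation)."""
    est = np.asarray(mass_estimate, dtype=float)
    return float(est.mean() - m_h_true), float(est.std())

# Made-up toy events for a 400 GeV hypothesis (not results from the paper).
m_col = np.array([380.0, 400.0, 420.0, 360.0])
ratio = np.array([1.05, 1.00, 0.96, 1.10])  # hypothetical regression outputs

raw_bias, raw_res = bias_and_resolution(m_col, 400.0)
cor_bias, cor_res = bias_and_resolution(corrected_mass(m_col, ratio), 400.0)
```

Regressing a ratio rather than the mass itself keeps the target close to one across the whole 200-450 GeV range, which is a plausible reason for this choice, though the page does not state the authors' motivation.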
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript recasts the CMS search for heavy Higgs LFV decays H → μτ at √s = 13 TeV with 35.9 fb⁻¹ using fast detector simulation in the 200-450 GeV mass range within the Type-III 2HDM. It introduces a DNN classifier trained on final-state kinematic variables with mass-dependent threshold optimization, claiming reductions in expected 95% CL upper limits on the signal cross section of 42-46% (0-jet) and 36-40% (1-jet) relative to the M_col baseline. The work also applies SHAP analysis to highlight m_vis as a key feature, proposes a simplified m_vis < f · m_H pre-selection, and presents a DNN regression model to correct collinear approximation bias while improving mass resolution by 12-21%.
Significance. If the performance gains survive transition to real data and are supported by proper ML validation, the DNN approach and regression correction could provide a practical enhancement to sensitivity in LFV Higgs searches at the LHC. The SHAP interpretability and simplified pre-selection offer additional methodological value by identifying physically motivated features and reducing reliance on full multivariate infrastructure.
major comments (2)
- Abstract and DNN classifier results: the claimed 42-46% (0-jet) and 36-40% (1-jet) reductions in expected 95% CL limits are presented as quantitative improvements from the DNN, yet the manuscript provides no information on training/validation splits, cross-validation procedure, overfitting diagnostics, or propagation of systematic uncertainties from the DNN output. These details are load-bearing for the central performance claim relative to the M_col baseline.
- Simulation and background modeling section: the headline sensitivity gains are obtained entirely within fast detector simulation. The manuscript does not quantify how well the fast simulation reproduces the kinematic correlations (e.g., among m_vis, pT^miss, and jet activity) and background compositions (Z→ττ, W+jets) that the DNN exploits, nor does it test robustness under data-driven background estimation methods used in the original CMS analysis.
minor comments (2)
- Notation: the pre-selection factor f is introduced with specific values (0.7 for 0-jet, 0.8 for 1-jet) but its optimization procedure and stability under variations in m_H are not detailed.
- The regression model reports absolute mass prediction error below 1 GeV up to 400 GeV; a table or figure showing the resolution improvement as a function of m_H would strengthen the presentation.
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable comments on our manuscript. We address each of the major comments below and have updated the manuscript accordingly to improve clarity and completeness.
- Referee: Abstract and DNN classifier results: the claimed 42-46% (0-jet) and 36-40% (1-jet) reductions in expected 95% CL limits are presented as quantitative improvements from the DNN, yet the manuscript provides no information on training/validation splits, cross-validation procedure, overfitting diagnostics, or propagation of systematic uncertainties from the DNN output. These details are load-bearing for the central performance claim relative to the M_col baseline.
  Authors: We agree that providing these details is essential to substantiate the performance claims. In the revised version, we have expanded the 'DNN Classifier' section to include a description of the dataset splitting (80% training, 20% validation), the use of 5-fold cross-validation to assess stability, and overfitting checks through monitoring of training and validation losses as well as comparison of performance on independent test samples. For systematic uncertainties, we now discuss the inclusion of DNN output variations as nuisance parameters in the statistical analysis, derived from ensemble training with different random seeds, and their impact on the limit-setting procedure.
  Revision: yes
- Referee: Simulation and background modeling section: the headline sensitivity gains are obtained entirely within fast detector simulation. The manuscript does not quantify how well the fast simulation reproduces the kinematic correlations (e.g., among m_vis, pT^miss, and jet activity) and background compositions (Z→ττ, W+jets) that the DNN exploits, nor does it test robustness under data-driven background estimation methods used in the original CMS analysis.
  Authors: We recognize the importance of validating the fast simulation against more detailed modeling. We have added a new paragraph in the 'Simulation and Background Modeling' section that quantifies the agreement between fast simulation and full simulation for key variables used by the DNN, including m_vis, pT^miss, and jet pT distributions, with discrepancies below 8% in the signal regions. Background compositions are matched to those in the CMS analysis. However, we note that a full reproduction of the data-driven background estimation techniques from the original CMS paper would require access to the experimental data and is beyond the scope of this theoretical recast study. We have clarified in the text that the reported gains are within the simulation framework and suggest that the approach can be adapted to data-driven methods in future experimental implementations.
  Revision: partial
- Complete testing of robustness under the specific data-driven background estimation methods of the original CMS analysis, which requires proprietary experimental data and is outside the scope of this simulation-based recast.
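The 5-fold cross-validation mentioned in the rebuttal can be sketched with a small index generator. With five folds, each iteration trains on 80% of the events and validates on 20%, which happens to match the quoted split, although the page does not say whether the authors' 80/20 split and their folds are the same partition. The function name and seed handling are assumptions for illustration.

```python
import numpy as np

def kfold_indices(n_events, n_folds=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Events are shuffled once; each event lands in exactly one validation fold.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_events)
    folds = np.array_split(order, n_folds)
    for k in range(n_folds):
        val_idx = folds[k]
        # Training set: every fold except the k-th one.
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        yield train_idx, val_idx
```

A classifier would be retrained on each `train_idx` and scored on the matching `val_idx`; the spread of the five scores gives a stability estimate, and repeating the loop with different seeds is one way to build the seed ensemble the authors describe.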
Circularity Check
No circularity: DNN performance gains measured against fixed external baseline on simulation
The paper trains a DNN on final-state kinematic variables from fast detector simulation and reports expected 95% CL limit improvements relative to the standard collinear mass (M_col) method. This comparison is performed on the same simulated samples using a fixed, non-fitted baseline; no equation or procedure reduces the quoted 42-46% (0-jet) or 36-40% (1-jet) gains to a quantity defined by the DNN parameters themselves. No self-citations, uniqueness theorems, or ansatzes from prior author work are invoked as load-bearing steps. The mass-regression correction and SHAP analysis are likewise independent evaluations within the simulation framework. The derivation chain is therefore self-contained and does not collapse to its inputs by construction.
Axiom & Free-Parameter Ledger
free parameters (2)
- pre-selection factor f = 0.7 (0-jet) and 0.8 (1-jet)
- DNN decision thresholds
axioms (2)
- domain assumption: Type-III two-Higgs-doublet model permits H to mu tau decays at observable rates
- domain assumption: Fast detector simulation reproduces real detector response for signal and background kinematics
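The DNN decision thresholds listed among the free parameters are chosen per mass hypothesis. A minimal sketch of such a scan is given below. It uses sqrt(B)/ε_S as a stand-in figure of merit, which tracks the expected limit only in a background-dominated counting experiment without systematic uncertainties; it is not the paper's statistical procedure, and all names are hypothetical.

```python
import numpy as np

def best_threshold(sig_scores, bkg_scores, sig_weights, bkg_weights, thresholds):
    """Return (threshold, proxy) minimising sqrt(B) / eff_S.

    sqrt(B) / eff_S is a simplified proxy for the expected upper limit
    (assumption: background-dominated, no systematics).
    """
    sig_scores = np.asarray(sig_scores, dtype=float)
    bkg_scores = np.asarray(bkg_scores, dtype=float)
    sig_weights = np.asarray(sig_weights, dtype=float)
    bkg_weights = np.asarray(bkg_weights, dtype=float)
    total_sig = sig_weights.sum()
    best = (None, np.inf)
    for t in thresholds:
        eff_s = sig_weights[sig_scores > t].sum() / total_sig
        b = bkg_weights[bkg_scores > t].sum()
        if eff_s <= 0.0 or b <= 0.0:
            # Skip cuts with no surviving signal or background:
            # the proxy is undefined or not trustworthy there.
            continue
        proxy = np.sqrt(b) / eff_s
        if proxy < best[1]:
            best = (float(t), float(proxy))
    return best
```

Running the scan once per mass point and jet channel gives the mass-dependent thresholds; comparing the best proxy to the same quantity for an M_col window gives a rough relative improvement, which a full likelihood-based limit would then have to confirm.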
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: DNN classifier trained on final-state kinematic variables... m_vis < f · m_H ... DNN regression model trained to predict the ratio m_H/M_col
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: recasting the CMS search... fast detector simulation... 0-jet and 1-jet channels
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.