Optimizing The Cut And Count Method In Phenomenological Studies
Pith reviewed 2026-05-24 08:26 UTC · model grok-4.3
The pith
An automated ranking scheme for observables optimizes cuts in the cut-and-count method, leading to enhanced discovery potential for new physics signals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Automating the cut-and-count process, by ranking observables according to their importance and choosing cuts systematically, yields a higher discovery potential than the traditional way of imposing cuts by hand, as shown for a singly charged Higgs search in the Two Higgs Doublet Model.
What carries the argument
The ranking scheme that quantitatively assesses the relative importance of various observables involved in a new physics process.
If this is right
- The optimized cuts provide better separation of signal from background in BSM searches.
- This approach can be applied to any phenomenological study using cut-and-count methods.
- It works iteratively with MadAnalysis5 to refine the analysis.
- Enhanced discovery potential means higher chances of detecting new particles like the charged Higgs.
Where Pith is reading between the lines
- This could minimize subjective choices in cut selection across different analyses.
- Testing the method on processes with known signals would verify if the improvements are robust.
- The technique might complement machine learning approaches in future collider studies.
Load-bearing premise
The ranking scheme that quantitatively assesses the relative importance of observables produces cuts that genuinely improve signal significance rather than merely fitting statistical fluctuations in the simulated samples.
What would settle it
A direct comparison of the signal significance achieved with the optimized cuts versus traditional cuts on the same Monte Carlo samples for the 2HDM charged Higgs search, or validation on a well-understood Standard Model process.
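The comparison described above can be sketched in a few lines. This is a minimal illustration, not the paper's analysis: the toy events, the per-event weights `w_sig` and `w_bkg`, the observable names `pt_l1` and `ht`, and the thresholds in the "manual" and "automated" cut sets are all hypothetical placeholders. It evaluates S/√B for several cut sets on the same events.

```python
import math
import random

def significance(n_sig, n_bkg):
    """Simple S/sqrt(B) estimate; returns 0.0 when no background survives."""
    return n_sig / math.sqrt(n_bkg) if n_bkg > 0 else 0.0

def passes(event, cuts):
    """`cuts` maps an observable name to an inclusive (low, high) window."""
    return all(lo <= event[name] <= hi for name, (lo, hi) in cuts.items())

def expected_yield(events, cuts, weight):
    """Weighted number of events that survive all cuts."""
    return weight * sum(1 for ev in events if passes(ev, cuts))

def compare(signal, background, w_sig, w_bkg, cut_sets):
    """Significance of every labelled cut set, evaluated on the same events."""
    return {
        label: significance(
            expected_yield(signal, cuts, w_sig),
            expected_yield(background, cuts, w_bkg),
        )
        for label, cuts in cut_sets.items()
    }

# Toy samples (assumption): peaked signal, steeply falling background.
rng = random.Random(1)
signal = [
    {"pt_l1": rng.gauss(200.0, 50.0), "ht": rng.gauss(900.0, 150.0)}
    for _ in range(2000)
]
background = [
    {"pt_l1": rng.expovariate(1.0 / 80.0), "ht": rng.expovariate(1.0 / 400.0)}
    for _ in range(2000)
]

# Placeholder cut sets; the thresholds are illustrative, not the paper's.
cut_sets = {
    "no_cuts": {},
    "manual": {"pt_l1": (100.0, math.inf)},
    "automated": {"pt_l1": (120.0, math.inf), "ht": (600.0, math.inf)},
}

results = compare(signal, background, w_sig=0.05, w_bkg=5.0, cut_sets=cut_sets)
for label, z in results.items():
    print(f"{label:>10s}: S/sqrt(B) = {z:.2f}")
```

Because both cut sets are scored on identical events, any difference reflects the cuts alone; it does not by itself show that the better set would hold up on an independent sample.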
Original abstract
We introduce an optimization technique to discriminate signal and background in any phenomenological study based on the cut and count-based method. The core ideas behind this technique are the introduction of a ranking scheme that can quantitatively assess the relative importance of various observables involved in a new physics process, and a more methodical way of choosing what cuts to impose. The technique is an iterative process that works with the help of the MadAnalysis5 interface. Working in the context of a BSM (Beyond Standard Model) scenario where we carry out a signal search of singly charged Higgs in the context of the Two Higgs Doublet Model (2HDM), we demonstrate how automating the cut and count process in this specific way results in an enhanced discovery potential compared with the more traditional way of imposing cuts.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces an iterative optimization technique for cut-and-count analyses in BSM phenomenology. It defines a ranking scheme to quantify the relative importance of observables and uses MadAnalysis5 to select cuts in a methodical, automated manner. The approach is demonstrated in a search for singly charged Higgs bosons within the Two Higgs Doublet Model, where the authors claim it yields an enhanced discovery potential relative to traditional manual cut imposition.
Significance. If the reported improvements prove robust against statistical fluctuations in the Monte Carlo samples, the technique could provide a useful systematization of cut selection for phenomenological studies, reducing reliance on ad-hoc choices while interfacing with existing public tools. The integration with MadAnalysis5 supports reproducibility, though the absence of explicit validation metrics limits assessment of its broader impact.
major comments (2)
- [Abstract] Abstract and method description: the central claim of 'enhanced discovery potential' is not supported by any quantitative details on the ranking metric, number of iterations, or the magnitude of improvement in S/√B. Without these, the result cannot be evaluated for load-bearing significance.
- [Method] Method and results sections: the iterative ranking and cut selection, as well as the final significance evaluation, are performed on the identical finite Monte Carlo samples with no mention of hold-out sets, k-fold cross-validation, or independent test samples. This directly undermines the claim that the procedure produces genuinely superior cuts rather than fits to sample-specific noise, as the stopping criterion itself is significance-based.
minor comments (1)
- [Abstract] The abstract would benefit from a brief statement of the specific 2HDM parameter point or benchmark used for the demonstration.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for the constructive comments. We address each major point below and indicate the changes planned for the revised version.
-
Referee: [Abstract] Abstract and method description: the central claim of 'enhanced discovery potential' is not supported by any quantitative details on the ranking metric, number of iterations, or the magnitude of improvement in S/√B. Without these, the result cannot be evaluated for load-bearing significance.
Authors: We agree that the abstract would benefit from explicit quantitative support for the claimed improvement. In the revised manuscript we will insert the specific values of the ranking metric, the number of iterations required for convergence, and the achieved gain in S/√B relative to the manual-cut baseline. These numbers are already present in the results section and can be moved to the abstract without altering the underlying analysis. revision: yes
-
Referee: [Method] Method and results sections: the iterative ranking and cut selection, as well as the final significance evaluation, are performed on the identical finite Monte Carlo samples with no mention of hold-out sets, k-fold cross-validation, or independent test samples. This directly undermines the claim that the procedure produces genuinely superior cuts rather than fits to sample-specific noise, as the stopping criterion itself is significance-based.
Authors: The referee correctly notes that the optimization and final significance evaluation use the same Monte Carlo samples and that no cross-validation or hold-out procedure is described. This is a genuine methodological limitation that could allow the procedure to fit statistical fluctuations. In the revised manuscript we will add an explicit discussion of this issue, report results obtained on an independent test sample generated with different random seeds, and include a brief stability check by repeating the ranking on subsamples. These additions will be presented as a new subsection in the results. revision: yes
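The hold-out check discussed in this exchange could look roughly like the sketch below. It is an assumption-laden toy, not the authors' planned procedure: the single observable, the sample shapes, the half-sample weights `W_SIG` and `W_BKG`, and the threshold grid are all hypothetical. The cut is chosen on a training half and its significance is then re-evaluated on the untouched test half.

```python
import math
import random

def significance(n_sig, n_bkg):
    """S/sqrt(B); returns 0.0 when no background survives the cut."""
    return n_sig / math.sqrt(n_bkg) if n_bkg > 0 else 0.0

def split(values, rng, train_fraction=0.5):
    """Shuffle a copy of `values` and split it into (train, test) lists."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    k = int(len(shuffled) * train_fraction)
    return shuffled[:k], shuffled[k:]

def z_above(sig, bkg, threshold, w_sig, w_bkg):
    """Significance when keeping events with observable >= threshold."""
    n_s = w_sig * sum(1 for x in sig if x >= threshold)
    n_b = w_bkg * sum(1 for x in bkg if x >= threshold)
    return significance(n_s, n_b)

def best_threshold(sig, bkg, candidates, w_sig, w_bkg):
    """Candidate threshold with the highest in-sample significance."""
    return max(candidates, key=lambda t: z_above(sig, bkg, t, w_sig, w_bkg))

# Toy observable (assumption): peaked signal, falling background.
rng = random.Random(7)
signal = [rng.gauss(300.0, 60.0) for _ in range(4000)]
background = [rng.expovariate(1.0 / 120.0) for _ in range(4000)]

sig_train, sig_test = split(signal, rng)
bkg_train, bkg_test = split(background, rng)

# Hypothetical per-event weights, already scaled so that each half-sample
# represents the full expected yield.
W_SIG, W_BKG = 0.1, 10.0
candidates = [float(t) for t in range(0, 601, 20)]

# Optimise on the training half only, then score both halves with that cut.
t_best = best_threshold(sig_train, bkg_train, candidates, W_SIG, W_BKG)
z_train = z_above(sig_train, bkg_train, t_best, W_SIG, W_BKG)
z_test = z_above(sig_test, bkg_test, t_best, W_SIG, W_BKG)
print(f"cut: x >= {t_best:.0f}  train Z = {z_train:.2f}  test Z = {z_test:.2f}")
```

A test-sample significance well below the training value would suggest that the selected cut is partly fitted to fluctuations; comparable values would support the claimed improvement.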
Circularity Check
No significant circularity; method is self-contained procedure using external tool
The paper introduces a ranking scheme and iterative cut-selection procedure that interfaces with the public MadAnalysis5 tool. No equations, parameters, or results are defined in terms of each other or reduced by construction to fitted inputs from the same samples. The demonstration in the 2HDM example compares the automated cuts against traditional ones on the same Monte Carlo samples, but this is an empirical comparison rather than a definitional loop. No self-citation load-bearing steps, uniqueness theorems, or ansatzes smuggled via prior work appear in the provided text. The derivation chain is therefore independent of its own outputs.
Reference graph
Works this paper leans on
-
[1]
1. PT(ℓ1), 2. PT(ℓ2), 3. PT(b1), 4. PT(b2), 5. PT(b3), 6. PT(b4), 7. η(ℓ1), 8. η(ℓ2), 9. η(b1), 10. η(b2), 11. η(b3), 12. η(b4), 13. ∆R(ℓ1, ℓ2), 14. ∆R(b1, b2), 15. ∆R(b1, b3), 16. ∆R(b1, b4), 17. ∆R(b2, b3), 18. ∆R(b2, b4), 19. ∆R(b3, b4), 20. HT, 21. ET, 22. M(ℓ1, ℓ2), 23. M(b1, b2), 24. M(b1, b3), 25. M(b1, b4), 26. M(b2, b3), 27. M(b2, b4), 28. M(b3, b4), 29. M(ℓ1, ℓ2, b1, b2, b3, b4).
The initial distributions acquired following the application of preselection cuts are depicted in Figure 1. By examining these distributions, one can straightforwardly determine the selection cuts that maximize the signal-to-background ratio. We have established a set of selection criteria from intuition gained from the initia...
-
[5]
How does one choose the kinematic variable that will maximally aid S vs B?
-
[6]
Having identified the variable, how does one choose the exact cut that will maximally isolate the signal?
-
[7]
How does one continue the process while ensuring that the significance increases with every step? We will begin by introducing relevant quantities of interest that will simultaneously answer these questions.
[Figure: signal and background distributions of pT [l1] (GeV/c), pT [l2] (GeV/c), ...]
-
[8]
The generated signal and background events (after the imposition of the preselection cuts) are fed into MadAnalysis 5, and the distributions of the various kinematic observables are obtained.
-
[9]
The Area Parameter is calculated for all observables and then used to rank them. The highest-ranked observable is passed on to the vertical line test stage, which yields the optimal cuts.
-
[10]
Observable hold: This representation showcases where we hold the remaining observable distributions for future iterations
-
[11]
If, after the imposition of the cut, enough signal events remain (Ns > 10) and an improvement in significance is obtained, the cut is accepted and the observable is dropped from further consideration. In addition, if after this cut we satisfy the LL condition (i.e., the significance becomes 5σ or higher), the process terminates.
-
[12]
If, after the imposition of the cut, the LL condition is not satisfied, we pass on to the Collector Connector (CC): it takes the observable distributions from the hold and pushes them to the next step. Once a proper instruction (indicated by an arrow) hits the CC, it will collect all the observable distribution sets from the hold connected to it and then rec...
-
[13]
Pulse Switch (PS): This is an instantaneous switch that triggers the execution of an instruction when a specific condition is met (typically an ‘if’ condition in the program to check the minimum significance criterion). Otherwise, it remains inactive. Specifically, if it turns out that σ(k) < σ(k − 1) + 0.10, then that particular observable will be sent ...
-
[14]
The same steps continue until we either run out of observables to rank or the LL condition is satisfied, i.e., the significance reaches 5σ or higher. The cuts suggested by the algorithm at each iteration are shown in Figure 7 and the final cut flow chart is shown in Table IV. At this point, it would be reasonable to ask if an iterative...
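The iterative procedure outlined in the excerpts above can be sketched as a loop. Several pieces are assumptions, because the excerpts do not define them: `separation` (one minus the histogram overlap) stands in for the Area Parameter, `scan_cut` (a one-sided threshold scan) stands in for the vertical line test, observables are re-ranked on the surviving events after each accepted cut, and an observable that fails the 0.10 gain criterion is simply discarded. The Ns > 10 requirement, the 0.10 minimum gain, and the 5σ stopping condition are taken from the text; the toy events and weights are hypothetical.

```python
import math
import random

def significance(n_sig, n_bkg):
    """S/sqrt(B); returns 0.0 when no background survives."""
    return n_sig / math.sqrt(n_bkg) if n_bkg > 0 else 0.0

def separation(sig_vals, bkg_vals, bins=20):
    """Stand-in ranking score: 1 - overlap of unit-normalised histograms."""
    if not sig_vals or not bkg_vals:
        return 0.0
    lo = min(min(sig_vals), min(bkg_vals))
    hi = max(max(sig_vals), max(bkg_vals))
    if hi == lo:
        return 0.0

    def hist(vals):
        counts = [0] * bins
        for v in vals:
            counts[min(int((v - lo) / (hi - lo) * bins), bins - 1)] += 1
        return [c / len(vals) for c in counts]

    return 1.0 - sum(min(s, b) for s, b in zip(hist(sig_vals), hist(bkg_vals)))

def keeps(event, name, threshold, keep_above):
    """True if the event survives a one-sided cut on observable `name`."""
    return event[name] >= threshold if keep_above else event[name] <= threshold

def scan_cut(sig, bkg, name, w_sig, w_bkg, steps=40):
    """Stand-in for the vertical line test: best one-sided cut with Ns > 10."""
    vals = [e[name] for e in sig + bkg]
    lo, hi = min(vals), max(vals)
    best_z, best_cut = 0.0, None
    for i in range(steps + 1):
        t = lo + (hi - lo) * i / steps
        for keep_above in (True, False):
            n_s = w_sig * sum(1 for e in sig if keeps(e, name, t, keep_above))
            n_b = w_bkg * sum(1 for e in bkg if keeps(e, name, t, keep_above))
            z = significance(n_s, n_b)
            if n_s > 10 and z > best_z:
                best_z, best_cut = z, (t, keep_above)
    return best_z, best_cut

def optimise(sig, bkg, names, w_sig, w_bkg, target=5.0, min_gain=0.10):
    """Rank, cut, accept or reject; stop at `target` or when names run out."""
    z_prev = significance(w_sig * len(sig), w_bkg * len(bkg))
    remaining, cutflow = list(names), []
    while remaining and z_prev < target:
        name = max(
            remaining,
            key=lambda n: separation([e[n] for e in sig], [e[n] for e in bkg]),
        )
        remaining.remove(name)
        z, cut = scan_cut(sig, bkg, name, w_sig, w_bkg)
        if cut is None or z < z_prev + min_gain:
            continue  # gain too small: discard this observable (assumption)
        t, keep_above = cut
        sig = [e for e in sig if keeps(e, name, t, keep_above)]
        bkg = [e for e in bkg if keeps(e, name, t, keep_above)]
        cutflow.append((name, ">=" if keep_above else "<=", t, z))
        z_prev = z
    return cutflow, z_prev

# Toy events (assumption); observable names are illustrative only.
rng = random.Random(3)
names = ["pt_l1", "ht", "eta_l1"]
signal = [{"pt_l1": rng.gauss(250.0, 60.0), "ht": rng.gauss(1000.0, 200.0),
           "eta_l1": rng.gauss(0.0, 1.2)} for _ in range(3000)]
background = [{"pt_l1": rng.expovariate(1.0 / 90.0),
               "ht": rng.expovariate(1.0 / 450.0),
               "eta_l1": rng.gauss(0.0, 1.2)} for _ in range(3000)]

cutflow, z_final = optimise(signal, background, names, w_sig=0.02, w_bkg=3.0)
for name, op, t, z in cutflow:
    print(f"{name} {op} {t:.1f}  ->  Z = {z:.2f}")
```

Note that every significance in this loop is computed on the same events used to pick the cuts, so the resulting cut flow is subject to the optimization bias raised in the referee report.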