Defining a Minimum Resolution for Unbinned Analyses
Pith reviewed 2026-06-30 00:43 UTC · model grok-4.3
The pith
The Minimum Resolution Likelihood method defines a Fiducial Signal Region that converts systematic effects from machine learning background estimation into statistical uncertainties.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present the Minimum Resolution Likelihood (MRL) method, which defines a Fiducial Signal Region that effectively turns the systematic effects into statistical uncertainties. We show with examples that the resulting signal strength estimation is either unbiased or consistent with zero. We consider both toy examples and a realistic application based on the HI-SIGMA technique applied to di-Higgs searches.
What carries the argument
The Minimum Resolution Likelihood (MRL) method, which defines a Fiducial Signal Region to convert ML-induced systematic effects from background estimation into statistical uncertainties.
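For orientation, the standard extended unbinned likelihood shows where an ML background estimate enters a signal-strength fit. The notation is an assumption made for illustration and is not taken from the paper: $S$ and $B$ are the expected signal and background yields, $p_s$ is the signal density, $\hat{p}_b$ is the ML-estimated background density, and $N$ is the number of observed events $x_i$.

```latex
\mathcal{L}(\mu) \;=\; \frac{e^{-(\mu S + B)}}{N!}
\prod_{i=1}^{N} \left[ \mu S\, p_s(x_i) + B\, \hat{p}_b(x_i) \right]
```

Any mismatch between $\hat{p}_b$ and the true background density enters every factor of the product, which is how an ML modelling error can shift the fitted $\mu$.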
If this is right
- Signal strength estimation remains unbiased when the MRL-defined region is applied.
- When no signal is present, the estimated strength is consistent with zero.
- Systematic effects from machine learning background models become statistical uncertainties.
- The method works for both simplified toy cases and full di-Higgs searches using HI-SIGMA.
Where Pith is reading between the lines
- The approach could extend to other machine-learning-driven searches in high-energy physics beyond the di-Higgs channel.
- It might reduce reliance on separate systematic uncertainty modeling in future unbinned analyses.
- Further validation in additional final states would test whether the conversion from systematics to statistics holds more broadly.
Load-bearing premise
Defining a Fiducial Signal Region via the Minimum Resolution Likelihood method can systematically convert all relevant systematic effects arising from ML background estimation into statistical uncertainties without introducing new unaccounted biases or selection effects.
What would settle it
A simulation or data analysis after MRL application in which the extracted signal strength exhibits a bias exceeding the size of the converted statistical uncertainties.
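One way such a test could be scored is sketched below. It is a minimal sketch, not the paper's procedure: the function name `bias_check`, the 3-standard-error threshold for calling a bias resolved, and the use of the mean per-fit uncertainty as the "size of the statistical uncertainty" are all illustrative assumptions.

```python
import math
import statistics


def bias_check(estimates, sigmas, mu_true):
    """Compare the mean bias of fitted signal strengths with the per-fit uncertainty.

    estimates: fitted signal strengths, one per pseudo-experiment
    sigmas:    reported statistical uncertainty of each fit
    mu_true:   injected signal strength
    """
    n = len(estimates)
    bias = statistics.fmean(estimates) - mu_true
    typical_sigma = statistics.fmean(sigmas)
    # Standard error on the mean: tells us whether the bias itself is resolved
    # with the available number of pseudo-experiments.
    sem = statistics.stdev(estimates) / math.sqrt(n)
    return {
        "bias": bias,
        "typical_sigma": typical_sigma,
        "bias_resolved": abs(bias) > 3.0 * sem,       # assumed threshold
        "exceeds_stat": abs(bias) > typical_sigma,    # the settling condition
    }


if __name__ == "__main__":
    # Hypothetical fit results, for illustration only.
    fitted = [1.9, 2.1, 2.0, 2.2, 1.8]
    errors = [0.5] * len(fitted)
    print(bias_check(fitted, errors, mu_true=1.0))
```

A result with both `bias_resolved` and `exceeds_stat` true would be the kind of outcome described above; a bias that is resolved but smaller than the per-fit uncertainty would not.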
Original abstract
Collider analyses combine rigorous statistical techniques with state-of-the-art Machine Learning models. However, when the latter are used directly to estimate the likelihood function of the background, hard-to-quantify systematic effects may bias the estimation of the relevant signal parameters. To address this problem, we present the Minimum Resolution Likelihood (MRL) method, which defines a Fiducial Signal Region that effectively turns the systematic effects into statistical uncertainties. We show with examples that the resulting signal strength estimation is either unbiased or consistent with zero. We consider both toy examples and a realistic application based on the HI-SIGMA technique applied to di-Higgs searches.
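The problem the abstract describes can be reproduced in a few lines. The sketch below is not the MRL method, which this page does not spell out; it only illustrates how a mismodelled background density biases an unbinned fit. Everything in it is an assumption chosen for illustration: the Gaussian shapes and widths, the use of a signal fraction in place of a signal strength, and the helper names `generate_events` and `fit_signal_fraction`.

```python
import math
import random


def gauss_pdf(x, mean, sigma):
    """Normalised Gaussian density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def generate_events(n, f_sig, rng):
    """Toy truth: narrow signal N(0, 0.5) on a broad background N(0, 2.0)."""
    return [
        rng.gauss(0.0, 0.5) if rng.random() < f_sig else rng.gauss(0.0, 2.0)
        for _ in range(n)
    ]


def fit_signal_fraction(events, bkg_sigma_model):
    """Unbinned maximum-likelihood fit of the signal fraction f.

    bkg_sigma_model is the width of the *assumed* background density;
    2.0 matches the toy truth, any other value mimics a modelling error.
    """
    s = [gauss_pdf(x, 0.0, 0.5) for x in events]
    b = [gauss_pdf(x, 0.0, bkg_sigma_model) for x in events]

    def nll(f):
        return -sum(math.log(f * si + (1.0 - f) * bi) for si, bi in zip(s, b))

    # The NLL is convex in f, so a golden-section search finds its minimum.
    lo, hi = 0.0, 0.999
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(40):
        c = hi - inv_phi * (hi - lo)
        d = lo + inv_phi * (hi - lo)
        if nll(c) < nll(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(12345)
    events = generate_events(10000, 0.10, rng)
    print("correct background model:   ", fit_signal_fraction(events, 2.0))
    print("mismodelled background (2.5):", fit_signal_fraction(events, 2.5))
```

In this toy the too-wide background model underpredicts the density under the signal peak, so the fit compensates by absorbing background events into the signal component and the fitted fraction comes out above the injected 0.10.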
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the Minimum Resolution Likelihood (MRL) method, which defines a Fiducial Signal Region to convert systematic effects arising from machine-learning estimates of background likelihoods into statistical uncertainties in unbinned collider analyses. The central claim is that the resulting signal-strength estimation is either unbiased or consistent with zero, supported by toy-model examples and one realistic application to di-Higgs searches via the HI-SIGMA technique.
Significance. If the MRL construction can be shown to avoid new selection biases in general, the approach would offer a practical way to mitigate hard-to-quantify ML systematics in unbinned fits, strengthening the reliability of signal-parameter extraction. The provided examples supply initial evidence, but the absence of a general derivation or exhaustive validation against correlated error modes reduces the assessed impact.
major comments (2)
- [Abstract] The assertion that signal-strength estimation is unbiased or consistent with zero is presented without derivation, validation details, or quantitative checks, leaving open whether the toy models and single application actually support the claim against ML systematics.
- [Toy examples and HI-SIGMA application] The demonstrations are confined to specific cases; no exhaustive scan over ML error modes (e.g., shape distortions correlated with signal-like kinematics) is reported, so the assumption that the fiducial-region definition introduces no new selection biases remains untested and load-bearing for the unbiased claim.
minor comments (1)
- The term 'Fiducial Signal Region' is introduced as a new construct; a brief comparison to existing fiducial definitions in the literature would improve clarity.
Simulated Author's Rebuttal
We thank the referee for their constructive review and for identifying areas where the presentation of our results can be strengthened. We respond to each major comment below and indicate the revisions we will make.
- Referee: [Abstract] The assertion that signal-strength estimation is unbiased or consistent with zero is presented without derivation, validation details, or quantitative checks, leaving open whether the toy models and single application actually support the claim against ML systematics.
  Authors: The abstract summarizes the empirical findings reported in Sections 3 and 4. While no general analytic derivation is provided, the toy-model studies and HI-SIGMA application contain explicit quantitative checks (bias measurements, pull distributions, and coverage tests) demonstrating that the extracted signal strength is unbiased or consistent with zero under the tested ML-induced systematics. We will revise the abstract to state explicitly that the property is shown empirically in the presented examples and to reference the relevant validation sections.
  Revision: yes
- Referee: [Toy examples and HI-SIGMA application] The demonstrations are confined to specific cases; no exhaustive scan over ML error modes (e.g., shape distortions correlated with signal-like kinematics) is reported, so the assumption that the fiducial-region definition introduces no new selection biases remains untested and load-bearing for the unbiased claim.
  Authors: We agree that the validation is limited to the specific toy configurations and the single realistic di-Higgs application described. An exhaustive scan over all conceivable ML error modes, including arbitrary correlated shape distortions, lies outside the scope of this work. The MRL construction is intended to convert ML systematics into statistical uncertainties by construction of the fiducial region; the examples illustrate this mechanism but do not constitute a proof against every possible bias. We will add a dedicated limitations paragraph discussing the representativeness of the tested cases and the possibility of residual selection effects, while retaining the claim that no new biases were observed in the reported studies.
  Revision: partial
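The pull distributions and coverage tests named in the first response are standard pseudo-experiment diagnostics. The sketch below shows how they are typically computed. It assumes symmetric Gaussian uncertainties, and the function name `pull_summary` is hypothetical; it does not reproduce the paper's own validation code.

```python
import random
import statistics


def pull_summary(estimates, sigmas, mu_true):
    """Summarise pulls (mu_hat - mu_true) / sigma over pseudo-experiments.

    For an unbiased estimator with correctly sized uncertainties the pull
    mean is near 0, the pull width near 1, and the fraction of fits with
    |pull| <= 1 near 68%.
    """
    pulls = [(m - mu_true) / s for m, s in zip(estimates, sigmas)]
    covered = sum(1 for p in pulls if abs(p) <= 1.0)
    return {
        "pull_mean": statistics.fmean(pulls),
        "pull_width": statistics.stdev(pulls),
        "coverage_1sigma": covered / len(pulls),
    }


if __name__ == "__main__":
    # Idealised pseudo-experiments: Gaussian estimator with known resolution.
    rng = random.Random(7)
    mu_true, sigma = 1.0, 0.2
    estimates = [rng.gauss(mu_true, sigma) for _ in range(2000)]
    print(pull_summary(estimates, [sigma] * len(estimates), mu_true))
```

A pull mean away from zero points to a bias, a pull width away from one points to misestimated uncertainties, and either shows up as a one-sigma coverage that departs from the nominal value.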
Circularity Check
No circularity: MRL method and unbiasedness shown via examples, not by construction
The paper introduces the Minimum Resolution Likelihood (MRL) method as a definition for a fiducial signal region that converts ML background systematics into statistical uncertainties. It then validates the resulting signal strength estimation (unbiased or consistent with zero) through explicit toy examples and one HI-SIGMA di-Higgs application. No load-bearing step reduces by the paper's own equations to a fitted input renamed as prediction, a self-referential definition, or a self-citation chain; the central claim rests on empirical demonstration rather than algebraic identity or imported uniqueness theorems. This is the most common honest outcome for a methods paper whose result is externally falsifiable via the provided examples.
Axiom & Free-Parameter Ledger
invented entities (1)
- Fiducial Signal Region: no independent evidence