Recognition: 2 theorem links
· Lean TheoremShapes are not enough: CONSERVAttack and its use for finding vulnerabilities and uncertainties in machine learning applications
Pith reviewed 2026-05-15 11:17 UTC · model grok-4.3
The pith
A new adversarial attack called CONSERVAttack can exploit undetected deviations between simulated and real data in high energy physics machine learning applications while remaining within uncertainty bounds.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The CONSERVAttack is designed to exploit the remaining space of hypothetical deviations between simulation and data after physically motivated systematic uncertainty tests, marginal distribution comparisons, and linear feature correlation checks. The resulting adversarial perturbations are consistent within the uncertainty bounds, evading standard validation checks, while successfully fooling the underlying model. The authors argue that robustness to adversarial effects must be considered when interpreting results from deep learning in particle physics and propose strategies to mitigate such vulnerabilities.
What carries the argument
The CONSERVAttack, an adversarial method that generates perturbations consistent within uncertainty bounds to evade standard validation while fooling the model.
If this is right
- Robustness to adversarial effects must be considered when interpreting results from deep learning in particle physics.
- Standard validation checks based on physical motivation and linear correlations are insufficient to guarantee no exploitable deviations.
- Machine learning models in physics may produce unreliable outputs if such undetected deviations exist.
- Mitigation strategies can be developed to reduce vulnerabilities to attacks that respect uncertainty bounds.
Where Pith is reading between the lines
- The same vulnerability could appear in other domains that use machine learning with simulated data, such as climate science or astrophysics.
- Tests for higher-order statistical correlations beyond linear ones might close some of the remaining deviation space.
- Adversarial training during model development could serve as a practical defense in physics analysis pipelines.
Load-bearing premise
Hypothetical deviations between simulation and data exist within uncertainty bounds that are not captured by physically motivated tests, marginal distributions, or linear feature correlations.
What would settle it
Apply CONSERVAttack to a real high energy physics machine learning model, generate perturbations, and test whether they fool the model while passing all standard validation checks and remaining within uncertainty bounds; inability to produce such fooling perturbations would falsify the claim.
read the original abstract
In High Energy Physics, as in many other fields of science, the application of machine learning techniques has been crucial in advancing our understanding of fundamental phenomena. Increasingly, deep learning models are applied to analyze both simulated and experimental data. In most experiments, a rigorous regime of testing for physically motivated systematic uncertainties is in place. The numerical evaluation of these tests for differences between the data on the one side and simulations on the other side quantifies the effect of potential sources of mismodelling on the machine learning output. In addition, thorough comparisons of marginal distributions and (linear) feature correlations between data and simulation in "control regions" are applied. However, the guidance by physical motivation, and the need to constrain comparisons to specific regions, does not guarantee that all possible sources of deviations have been accounted for. We therefore propose a new adversarial attack - the CONSERVAttack - designed to exploit the remaining space of hypothetical deviations between simulation and data after the above mentioned tests. The resulting adversarial perturbations are consistent within the uncertainty bounds - evading standard validation checks - while successfully fooling the underlying model. We further propose strategies to mitigate such vulnerabilities and argue that robustness to adversarial effects must be considered when interpreting results from deep learning in particle physics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CONSERVAttack, a new adversarial attack for deep learning models in High Energy Physics. It argues that standard validation procedures—physically motivated systematic uncertainty tests, marginal distribution comparisons, and linear feature correlations in control regions—do not exhaust all possible deviations between simulation and data. The attack is claimed to generate perturbations that remain inside stated uncertainty bounds (thereby evading the listed checks) while altering the model's output. Mitigation strategies are outlined and the authors recommend that adversarial robustness be considered when interpreting ML results in particle physics.
Significance. If a concrete, reproducible construction of CONSERVAttack were supplied together with quantitative results on a representative HEP task (e.g., jet tagging or event classification) showing that the perturbations pass the cited validation tests yet change the model prediction, the work would usefully highlight gaps in current shape-based validation practices and motivate more comprehensive robustness checks. The absence of any algorithm, pseudocode, or empirical demonstration in the manuscript, however, leaves the central claim unsubstantiated at present.
major comments (2)
- [Abstract] Abstract: the assertion that CONSERVAttack produces perturbations 'consistent within the uncertainty bounds—evading standard validation checks—while successfully fooling the underlying model' is presented without any explicit generation procedure, pseudocode, or quantitative evidence on a concrete task. This directly undermines the central claim that such perturbations exist and can be systematically found.
- No section or equation supplies the construction that would satisfy the weakest assumption: perturbations that pass physically motivated tests, marginal distributions, and linear correlations in control regions yet still change the ML output. Without this, the attack remains an invented entity rather than a demonstrated method.
minor comments (1)
- The title 'Shapes are not enough' is evocative but the manuscript would benefit from an explicit definition of 'shapes' in the opening paragraphs to clarify its relation to the marginal and correlation tests discussed.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. The points raised correctly identify that the current version presents CONSERVAttack primarily as a conceptual proposal without supplying the explicit construction, pseudocode, or quantitative demonstration needed to substantiate the central claims. We will revise the manuscript to include these elements.
read point-by-point responses
-
Referee: [Abstract] Abstract: the assertion that CONSERVAttack produces perturbations 'consistent within the uncertainty bounds—evading standard validation checks—while successfully fooling the underlying model' is presented without any explicit generation procedure, pseudocode, or quantitative evidence on a concrete task. This directly undermines the central claim that such perturbations exist and can be systematically found.
Authors: We agree that the abstract states the existence of such perturbations without providing the supporting procedure or evidence. The manuscript was written as a conceptual contribution to highlight potential gaps in existing validation practices. In the revision we will rewrite the abstract to describe CONSERVAttack as a proposed method, add a dedicated section containing the explicit optimization procedure (including how perturbations are constrained to lie inside the stated uncertainty bounds), pseudocode, and quantitative results on a representative HEP task such as jet tagging or event classification. revision: yes
-
Referee: [—] No section or equation supplies the construction that would satisfy the weakest assumption: perturbations that pass physically motivated tests, marginal distributions, and linear correlations in control regions yet still change the ML output. Without this, the attack remains an invented entity rather than a demonstrated method.
Authors: This observation is accurate. The present manuscript does not contain the required construction or verification. We will add the mathematical formulation of the attack, the algorithm that generates perturbations respecting the uncertainty bounds while altering the model output, and explicit checks confirming that the resulting perturbations pass the physically motivated systematic tests, marginal distribution comparisons, and linear feature correlations in control regions. These additions will be accompanied by numerical experiments on a concrete HEP classification task. revision: yes
Circularity Check
No circularity: CONSERVAttack is a methodological proposal without self-referential reduction
full rationale
The paper introduces CONSERVAttack as a new adversarial method to probe hypothetical deviations between simulation and data that remain after standard physically motivated tests, marginal distributions, and linear correlations. No equations, fitted parameters, or self-citations appear in the provided text that would make the attack's construction equivalent to its inputs by definition. The central claim rests on the existence of such perturbations and the attack's ability to generate them while evading checks, presented as an independent proposal rather than a renaming or redefinition of existing quantities. This is a standard non-circular methodological contribution.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Current physically motivated tests, marginal distributions, and linear correlations do not exhaust all possible deviations between simulation and data that lie inside uncertainty bounds.
invented entities (1)
-
CONSERVAttack
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
The attack proceeds via an iterative, gradient-based, brute-force–style search over candidate perturbations... L:=αD_JS +β∆_F
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanabsolute_floor_iff_bare_distinguishability unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
preserving marginal feature distributions and preserving the inter-feature correlations
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Uncovering Hidden Systematics in Neural Network Models for High Energy Physics
Neural networks for HEP tasks can be fooled at significant rates by subtle perturbations inside uncertainty envelopes, revealing hidden systematics not captured by conventional methods.
Reference graph
Works this paper leans on
-
[1]
Miotto, R., Wang, F., Wang, S., Jiang, X., Dudley, J.T.: Deep learning for healthcare: review, opportunities and challenges. Brief- ings in Bioinformatics19(6), 1236–1246 (2017) https://doi.org/10.1093/bib/bbx044 https://academic.oup.com/bib/article- pdf/19/6/1236/27119191/bbx044.pdf
-
[2]
Chib, P.S., Singh, P.: Recent advancements in end-to-end autonomous driving using deep learning: A survey. IEEE Transactions on Intelligent Vehicles9(1), 103–118 (2024) https://doi.org/10.1109/TIV.2023.3318070
-
[3]
A Survey of Large Language Models
Zhao, W.X., Zhou, K., Li, J., Tang, T., Wang, X., Hou, Y., Min, Y., Zhang, B., Zhang, J., Dong, Z., Du, Y., Yang, C., Chen, Y., Chen, Z., Jiang, J., Ren, R., Li, Y., Tang, X., Liu, Z., Liu, P., Nie, J.-Y., Wen, J.-R.: A Survey of Large Language Models (2025). https:// arxiv.org/abs/2303.18223
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[4]
Cowan, G., Cranmer, K., Gross, E., Vitells, O.: Asymptotic formulae for likelihood-based tests of new physics. Eur. Phys. J. C71, 1554 (2011) https: //doi.org/10.1140/epjc/s10052-011-1554-0 arXiv:1007.1727 [physics.data-an]. [Erratum: Eur.Phys.J.C 73, 2501 (2013)]
work page internal anchor Pith review Pith/arXiv arXiv doi:10.1140/epjc/s10052-011-1554-0 2011
-
[5]
Aad, G.,et al.: Observation of a new particle in the search for the Standard Model Higgs boson with the ATLAS detector at the LHC. Phys. Lett. B 716, 1–29 (2012) https://doi.org/10.1016/ j.physletb.2012.08.020 arXiv:1207.7214 [hep- ex]
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[6]
Chatrchyan, S.,et al.: Observation of a New Boson at a Mass of 125 GeV with the CMS Experiment at the LHC. Phys. Lett. B 716, 30–61 (2012) https://doi.org/10.1016/ 21 j.physletb.2012.08.021 arXiv:1207.7235 [hep- ex]
work page internal anchor Pith review Pith/arXiv arXiv 2012
-
[7]
IEEE Access12, 61113–61136 (2024) https: //doi.org/10.1109/ACCESS.2024.3395118
Costa, J.C., Roxo, T., Proen¸ ca, H., In´ acio, P.R.M.: How deep learning sees the world: A survey on adversarial attacks & defenses. IEEE Access12, 61113–61136 (2024) https: //doi.org/10.1109/ACCESS.2024.3395118
-
[8]
Explaining and Harnessing Adversarial Examples
Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and Harnessing Adversarial Examples (2015). https://arxiv.org/abs/ 1412.6572
work page internal anchor Pith review Pith/arXiv arXiv 2015
-
[9]
Towards Deep Learning Models Resistant to Adversarial Attacks
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., Vladu, A.: Towards Deep Learning Mod- els Resistant to Adversarial Attacks (2019). https://arxiv.org/abs/1706.06083
work page internal anchor Pith review Pith/arXiv arXiv 2019
-
[10]
https://arxiv.org/abs/ 2101.08030
Cartella, F., Anunciacao, O., Funabiki, Y., Yamaguchi, D., Akishita, T., Elshocht, O.: Adversarial Attacks for Tabular Data: Application to Fraud Detection and Imbal- anced Data (2021). https://arxiv.org/abs/ 2101.08030
-
[11]
Zhang, W.E., Sheng, Q.Z., Alhazmi, A., Li, C.: Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey (2019). https://arxiv.org/abs/1901. 06796
work page 2019
-
[12]
Computing and Software for Big Science9(1), 19 (2025) https://doi.org/ 10.1007/s41781-025-00148-1
Flek, L., Jung, P.A., Karimi, A., Saala, T., Schmidt, A., Schott, M., Soldin, P., Wiebusch, C.H.: Enforcing fundamental rela- tions via adversarial attacks on input parame- ter correlations. Computing and Software for Big Science9(1), 19 (2025) https://doi.org/ 10.1007/s41781-025-00148-1
-
[13]
https:// arxiv.org/abs/2511.01352
Flek, L., Janik, O., Jung, P.A., Karimi, A., Saala, T., Schmidt, A., Schott, M., Soldin, P., Thiesmeyer, M., Wiebusch, C., Willem- sen, U.: MiniFool – Physics-Constraint- Aware Minimizer-Based Adversarial Attacks in Deep Neural Networks (2025). https:// arxiv.org/abs/2511.01352
-
[14]
Open Journal of Mathematical Opti- mization3, 1–85 (2022) https://doi.org/10
Rahimian, H., Mehrotra, S.: Frameworks and results in distributionally robust optimiza- tion. Open Journal of Mathematical Opti- mization3, 1–85 (2022) https://doi.org/10. 5802/ojmo.15
work page 2022
-
[15]
https:// kaggle.com/competitions/higgs-boson
Kegl, B., CecileGermain, ChallengeAdmin, ClaireAdam, Rousseau, D., Djabbz, fra- dav, Cowan, G., Isabelle, joycenv: Higgs Boson Machine Learning Challenge. https:// kaggle.com/competitions/higgs-boson. Kag- gle (2014)
work page 2014
-
[16]
https://opendata.cern.ch/ Accessed 2026-02- 20
CERN: CERN Open Data Portal (2026). https://opendata.cern.ch/ Accessed 2026-02- 20
work page 2026
-
[17]
CMS collaboration: Simulated dataset TTJets FullLeptMGDecays TuneP11TeV 8TeV- madgraph-tauola in AODSIM format for 2012 collision data. CERN Open Data Portal (2017). https://doi.org/10.7483/ OPENDATA.CMS.7RZ3.0BXP
work page 2012
-
[18]
CMS collaboration: Simulated dataset WWJetsTo2L2Nu TuneZ2star 8TeV- madgraph-tauola in AODSIM format for 2012 collision data. CERN Open Data Portal (2017). https://doi.org/10.7483/ OPENDATA.CMS.V2C6.O1P4
work page 2012
-
[19]
Kasieczka, G., Plehn, T., Butter, A., Cran- mer, K., Debnath, D., Dillon, B.M., Fair- bairn, M., Faroughy, D.A., Fedorko, W., Gay, C., Gouskos, L., Kamenik, J.F., Komiske, P.T., Leiss, S., Lister, A., Macaluso, S., Metodiev, E.M., Moore, L., Nachman, B., Nordstr¨ om, K., Pearkes, J., Qu, H., Rath, Y., Rieger, M., Shih, D., Thomp- son, J.M., Varma, S.: The...
work page 2019
-
[20]
In: International Con- ference on Learning Representations (2019)
Tsipras, D., Santurkar, S., Engstrom, L., Turner, A., Madry, A.: Robustness may be at odds with accuracy. In: International Con- ference on Learning Representations (2019). https://openreview.net/forum?id=SyxAb30cY7
work page 2019
-
[21]
On Detecting Adversarial Perturbations
Metzen, J.H., Genewein, T., Fischer, V., Bischoff, B.: On Detecting Adversarial Perturbations (2017). https://arxiv.org/abs/ 1702.04267 22
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[22]
https: //arxiv.org/abs/2108.00401
Akhtar, N., Mian, A., Kardan, N., Shah, M.: Advances in adversarial attacks and defenses in computer vision: A survey (2021). https: //arxiv.org/abs/2108.00401
-
[23]
Practical Black-Box Attacks against Machine Learning
Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z.B., Swami, A.: Practical Black-Box Attacks against Machine Learning (2017). https://arxiv.org/abs/1602.02697
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[24]
CMS collaboration: SingleMu primary dataset in AOD format from Run of 2012. CERN Open Data Portal (2017). https://doi. org/10.7483/OPENDATA.CMS.9A4E.7SIR
-
[25]
Sz´ ekely, G.J., Rizzo, M.L., Bakirov, N.K.: Measuring and testing dependence by cor- relation of distances. The Annals of Statis- tics35(6) (2007) https://doi.org/10.1214/ 009053607000000505 Appendix A Attack Optimization A.1 Jensen Shannon Distance Calculation When evaluating the effect of small perturbations on feature histograms (as required for Jense...
work page 2007
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.