Learning Minimal-Deviation Corrections for Multi-Dimensional Mismodelling in HEP Simulations

Lucie Flek; Matthias Schott

arxiv: 2605.07460 · v1 · submitted 2026-05-08 · 💻 cs.LG · hep-ex

Pith reviewed 2026-05-11 01:55 UTC · model grok-4.3

keywords: Monte Carlo simulation · neural network correction · high-energy physics · distribution mismodelling · minimal deviation · one-dimensional constraints · correlation preservation · event transformation

The pith

A neural network learns minimal transformations to simulated events so they match one-dimensional target distributions while preserving multi-dimensional correlations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the challenge of correcting Monte Carlo simulations in high-energy physics when experimental information is available only as one-dimensional distributions but the mismodelling spans many correlated features. One-dimensional reweighting ignores correlations between features, and fully multi-dimensional corrections demand large target samples that are often unavailable. The proposed method trains a neural network to apply a transformation to each simulated event, chosen so that the transformed events reproduce the observed one-dimensional distributions. By enforcing closeness to the original simulation, the approach aims to keep the global correlation structure intact. Controlled tests with simulated pseudo-data show improved agreement with the target distributions and a consistent multi-dimensional structure.

Core claim

The central claim is that a neural network can learn a transformation of simulated events that reproduces the available 1D target distributions while remaining close to the original simulation. This minimal-deviation principle preserves the global correlation structure of the baseline model while enabling targeted corrections of mismodelled features. Using controlled studies with simulated pseudo-data, the method improves agreement with target distributions and maintains a consistent multidimensional structure.

What carries the argument

The minimal-deviation transformation, a neural-network-learned mapping applied to simulated events that matches 1D marginals while minimizing deviation from the baseline simulation to retain original correlations.
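
As a rough illustration of what such a mapping could look like, the sketch below applies a residual update x' = x + g(x) to each event. Only the residual form is suggested by the architecture names in Figure 1; the network size, the tanh activation, the zero-initialised output layer and the names `init_residual_mlp` and `transform` are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def init_residual_mlp(n_features, n_hidden, rng):
    """Hypothetical one-hidden-layer network g; its output layer starts at zero."""
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_features, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": np.zeros((n_hidden, n_features)),  # zero init, so g(x) = 0 at the start
        "b2": np.zeros(n_features),
    }

def transform(params, x):
    """Residual update x' = x + g(x), applied independently to each event (row)."""
    hidden = np.tanh(x @ params["W1"] + params["b1"])
    return x + hidden @ params["W2"] + params["b2"]

rng = np.random.default_rng(0)
events = rng.normal(size=(1000, 4))   # toy stand-in for simulated events
params = init_residual_mlp(4, 16, rng)
transformed = transform(params, events)

# Mean squared deviation from the baseline simulation (zero before any training).
mean_shift = np.mean((transformed - events) ** 2)
```

With the output layer at zero the map starts as the identity, so training would begin from the uncorrected simulation and any learned shift is a measurable departure from it.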

Load-bearing premise

That enforcing minimal deviation from the original simulation while matching one-dimensional distributions will preserve the true multi-dimensional correlation structure of the baseline model.

What would settle it

Applying the learned transformation to a test set with known multi-dimensional correlations and finding that the 1D matches hold but the correlations deviate substantially from the original would falsify the preservation claim.
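
A check of this kind could be sketched as follows, assuming Pearson correlation matrices are the measure of structure (the paper's Figures 4, 7 and 10 compare such matrices). The toy data and the helper name `correlation_shift` are hypothetical.

```python
import numpy as np

def correlation_shift(original, transformed):
    """Largest absolute change in any Pearson correlation coefficient."""
    corr_orig = np.corrcoef(original, rowvar=False)
    corr_new = np.corrcoef(transformed, rowvar=False)
    return np.max(np.abs(corr_new - corr_orig))

rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
original = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

# Per-feature shift and rescale: changes both 1D marginals, keeps correlations.
benign = original * np.array([1.5, 0.7]) + np.array([2.0, -1.0])

# Shuffling one column: keeps both 1D marginals, destroys the correlation.
harmful = original.copy()
harmful[:, 1] = rng.permutation(harmful[:, 1])

shift_benign = correlation_shift(original, benign)
shift_harmful = correlation_shift(original, harmful)
```

The second case is the failure mode the test is meant to expose: every 1D distribution is unchanged, yet the joint structure is lost, so 1D agreement alone cannot certify a transformation.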

Figures

Figures reproduced from arXiv: 2605.07460 by Lucie Flek, Matthias Schott.

**Figure 1.** Schematic illustration of the network architectures: the Global Residual Transformation (left), which learns a single unified mapping, and the Two-Step Residual Transformation (right), where the transformation is factorized into two sequential residual updates.

**Figure 2.** Comparison of selected input features for the underlying event models of the original data-set (blue), the target data-set (yellow) and the transformed results predicted by the neural network (violet). Extracted axis labels: m [GeV] and mj1, j2 [GeV], each against normalised entries; legend: Original, Target, Transformed.

**Figure 3.** Comparison of derived observables based on the input features for the underlying event models of the original data-set (blue), the target data-set (yellow) and the transformed results predicted by the neural network (violet). Extracted axis labels (partly garbled): E Miss T; Muon-1 pT; Muon-2 pT; Jet-1 pT; Jet-2 pT; Jet-3 pT; further Muon and Jet labels whose symbols were lost in extraction.

**Figure 4.** Comparison of Pearson correlation coefficients of the target data-set (left), the transformed results by the neural network (middle) and their difference (right) for the Global TTBar event models.

**Figure 5.** Comparison of selected input features for the underlying event models of the original data-set (blue), the target data-set (yellow) and the transformed results predicted by the neural network (violet). Adjacent paper text captured with the caption (truncated in source): "In the second stage, a global residual network is applied to the output of the first step. This refinement network operates on the full feature vector and is trained to improve agreement in derived observabl…"

**Figure 6.** Comparison of derived observables based on the input features for the underlying event models of the original data-set (blue), the target data-set (yellow) and the transformed results predicted by the neural network (violet).

**Figure 7.** Comparison of Pearson correlation coefficients of the target data-set (left), the transformed results by the neural network (middle) and their difference (right) for the underlying event models. Adjacent paper text captured with the caption (truncated in source): "7.3 Transfer between different Underlying Event Models via a Two-Step Residual Transformation. We now consider a more complex transfer between different underlying event (UE) models, focusing on the transformation …"

**Figure 8.** Comparison of selected input features for the underlying event models of the original data-set (blue), the target data-set (yellow) and the transformed results predicted by the neural network (violet). Extracted axis labels: EEC Moment and Momentum Balance, each against density; legend: Original, Target, Transformed.

**Figure 9.** Comparison of derived observables based on the input features for the underlying event models of the original data-set (blue), the target data-set (yellow) and the transformed results predicted by the neural network (violet). Adjacent paper text captured with the caption (truncated in source): "… smaller than typical systematic uncertainties in such analyses, suggesting that the proposed approach is suitable for use in realistic multivariate workflows."

**Figure 10.** Comparison of Pearson correlation coefficients of the target data-set (left), the transformed results by the neural network (middle) and their difference (right) for the underlying event models. Extracted plot text (ROC panels): axes False Positive Rate and True Positive Rate; legend: Original ROC (AUC = 0.8363), Transformed ROC (AUC = 0.8153).

**Figure 11.** ROC curve for a classifier trained to distinguish events from the target dataset and the transformed dataset for the underlying event models (left). In addition, a comparison of two ROC curves is shown (right): one from a classifier trained to separate the original and target datasets, and the corresponding performance when evaluated on the transformed dataset.
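
The classifier test described for Figure 11 can be approximated with a very small example. The paper's classifier is not specified on this page, so the sketch below swaps in a Fisher linear discriminant and computes the AUC by direct pairwise comparison of scores. The function names and the toy Gaussian samples are assumptions for illustration only.

```python
import numpy as np

def fisher_direction(sample_a, sample_b):
    """Linear discriminant direction separating sample_b from sample_a."""
    pooled = np.cov(sample_a, rowvar=False) + np.cov(sample_b, rowvar=False)
    return np.linalg.solve(pooled, sample_b.mean(axis=0) - sample_a.mean(axis=0))

def auc(scores_a, scores_b):
    """Probability that a random b score exceeds a random a score (ties count half)."""
    greater = (scores_b[:, None] > scores_a[None, :]).mean()
    ties = (scores_b[:, None] == scores_a[None, :]).mean()
    return greater + 0.5 * ties

def two_sample_auc(sample_a, sample_b):
    """Fit on the first half of each sample, evaluate the AUC on the second half."""
    half_a, half_b = len(sample_a) // 2, len(sample_b) // 2
    w = fisher_direction(sample_a[:half_a], sample_b[:half_b])
    return auc(sample_a[half_a:] @ w, sample_b[half_b:] @ w)

rng = np.random.default_rng(2)
target = rng.normal(size=(2000, 3))
well_transformed = rng.normal(size=(2000, 3))              # same distribution as target
poorly_transformed = rng.normal(loc=0.5, size=(2000, 3))   # a residual shift remains

auc_good = two_sample_auc(target, well_transformed)
auc_poor = two_sample_auc(target, poorly_transformed)
```

An AUC near 0.5 means the classifier cannot tell the transformed sample from the target; values well above 0.5 point to remaining differences. A linear discriminant only sees differences in means along one direction, so a more expressive classifier would be needed to probe higher-order structure.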

read the original abstract

Accurate Monte Carlo (MC) modelling in high-energy physics is challenging, particularly in complex scenarios where simulations fail to reproduce observed data. In practice, experimental information is often limited to one-dimensional (1D) distributions, while mismodelling arises in a multidimensional feature space. This restricts traditional correction methods, as one-dimensional reweighting ignores correlations and fully multidimensional approaches require large target datasets. We propose a neural network-based method that operates under these constraints by learning a transformation of simulated events that reproduces the available 1D target distributions while remaining close to the original simulation. This minimal-deviation principle preserves the global correlation structure of the baseline model while enabling targeted corrections of mismodelled features. Using controlled studies with simulated pseudo-data, we show that the method improves agreement with target distributions and maintains a consistent multidimensional structure. The approach is designed for complex, high-dimensional analyses where traditional techniques are insufficient, providing a scalable way to enhance MC modelling under limited information.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper offers a neural network method to apply minimal-deviation corrections to multi-dimensional HEP simulation mismodelling when only 1D data is available, with the main support coming from pseudo-data tests.

read the letter

This work describes a neural network that learns to transform simulated events so that they match given 1D target distributions while keeping the overall changes small. The goal is to fix mismodelling across many dimensions without needing full multi-dimensional data from experiment; the minimal-deviation choice is meant to leave the baseline model's correlation structure mostly untouched.

Referee Report

3 major / 2 minor

Summary. The paper proposes a neural network-based method to learn a transformation of simulated events in high-energy physics Monte Carlo simulations. The transformation is designed to reproduce available one-dimensional target distributions while minimizing deviation from the original simulation, thereby preserving the baseline model's multi-dimensional correlation structure. The approach is validated through controlled studies on simulated pseudo-data demonstrating improved agreement with targets and maintained multi-dimensional consistency.

Significance. If the empirical results hold under the stated constraints, the method addresses a practical limitation in HEP where only 1D data is typically available for corrections in high-dimensional spaces. It offers a scalable alternative to traditional reweighting (which ignores correlations) or full multi-D methods (which require large target samples), potentially improving the fidelity of MC modeling for complex analyses without introducing large structural changes.

major comments (3)

[controlled pseudo-data studies] The validation in the controlled pseudo-data studies (described in the abstract and method validation section) is load-bearing for the central claim of improved 1D agreement and preserved multi-D structure, yet lacks explicit details on the quantitative metrics employed, the choice of baseline methods for comparison, error propagation, or generalization tests beyond the specific pseudo-data setup. This omission limits independent assessment of whether the minimal-deviation principle reliably achieves the reported outcomes.
[method description] The minimal-deviation principle is introduced as an explicit design choice (abstract and method description) rather than derived from data or self-consistent equations. The paper should specify the exact form of the loss function or regularization term used to enforce closeness to the original simulation and demonstrate that this does not inadvertently alter correlations in ways not captured by the 1D matching.
[assumptions and validation] The approach rests on the axiom that minimal changes preserve the original multi-dimensional correlation structure. While the pseudo-data studies test this under controlled conditions, the manuscript would benefit from a sensitivity analysis showing robustness when this assumption is mildly violated, as this is central to claiming superiority over methods that explicitly model correlations.
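
As an example of the kind of quantitative 1D metric the first major comment asks for, the sketch below computes a two-sample Kolmogorov-Smirnov statistic before and after a correction. This is an illustrative choice, not a metric confirmed to be used in the paper; the toy samples and the constant-shift "correction" are assumptions.

```python
import numpy as np

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a = np.sort(sample_a)
    b = np.sort(sample_b)
    grid = np.concatenate([a, b])  # the gap can only change at observed values
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

rng = np.random.default_rng(3)
target = rng.normal(loc=0.3, size=5000)   # toy pseudo-data for one feature
original = rng.normal(size=5000)          # toy simulation, mismodelled by a shift
corrected = original + 0.3                # stand-in for a learned transformation

ks_before = ks_statistic(original, target)
ks_after = ks_statistic(corrected, target)
```

Reporting such a statistic per feature, for the uncorrected simulation and for each correction method, would give the baseline comparison the comment asks for.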

minor comments (2)

[abstract] The abstract could more precisely indicate the dimensionality of the feature space and the number of 1D target distributions used in the studies to provide immediate context for the method's applicability.
[notation] Notation for the transformation and deviation measure should be defined consistently in the main text to avoid ambiguity when discussing the neural network architecture.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript arXiv:2605.07460. We address each of the major comments below and outline the revisions we will make to improve the paper.

read point-by-point responses

Referee: The validation in the controlled pseudo-data studies lacks explicit details on the quantitative metrics employed, the choice of baseline methods for comparison, error propagation, or generalization tests beyond the specific pseudo-data setup.

Authors: We will revise the manuscript to include detailed descriptions of the quantitative metrics used for evaluating 1D agreement (such as chi-squared tests and Kolmogorov-Smirnov statistics) and multi-dimensional consistency (e.g., correlation matrices and mutual information measures). We will also specify the baseline methods, including no-correction and standard reweighting approaches, describe error propagation using Monte Carlo bootstrapping, and add generalization tests on additional pseudo-data scenarios with varying dimensions and mismodelling levels. This will enhance the transparency and allow better assessment of the results. revision: yes
Referee: The minimal-deviation principle is introduced as an explicit design choice rather than derived from data or self-consistent equations. The paper should specify the exact form of the loss function or regularization term used to enforce closeness to the original simulation and demonstrate that this does not inadvertently alter correlations in ways not captured by the 1D matching.

Authors: In the revised manuscript, we will explicitly define the loss function in the methods section. It consists of a primary term that minimizes the discrepancy between the transformed simulation and the 1D target distributions, combined with a regularization term that enforces minimal deviation, formulated as the mean squared difference between the transformed and original event features. We will demonstrate through both theoretical argument and empirical results that this approach preserves correlations because the transformation is a smooth, per-event mapping without introducing cross-event dependencies, and the 1D matching is achieved without forcing changes to higher-order structures. revision: yes
Referee: The approach rests on the axiom that minimal changes preserve the original multi-dimensional correlation structure. While the pseudo-data studies test this under controlled conditions, the manuscript would benefit from a sensitivity analysis showing robustness when this assumption is mildly violated.

Authors: We concur that a sensitivity analysis would strengthen the claims. We will incorporate such an analysis by generating pseudo-data with mild violations of the correlation preservation assumption (e.g., by perturbing the underlying joint distributions slightly) and showing that the method still achieves good 1D agreement with only minor impacts on the multi-dimensional structure, outperforming alternatives. This will be added to the validation section. revision: yes
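
The two-term objective described in the second response (a 1D matching term plus a mean-squared movement penalty) can be sketched as below. The exact loss is not given on this page, so the histogram distance, the binning, the weights `lam_hist` and `lam_move`, and the function names are all assumptions. Hard histograms are not differentiable, so a trainable version would need a smooth surrogate; this version only evaluates the objective.

```python
import numpy as np

def hist_loss(transformed, target_hists, edges):
    """Sum over features and bins of squared gaps between normalised 1D histograms."""
    total = 0.0
    for j, target_hist in enumerate(target_hists):
        counts, _ = np.histogram(transformed[:, j], bins=edges)
        total += np.sum((counts / counts.sum() - target_hist) ** 2)
    return total

def move_loss(transformed, original):
    """Mean squared difference between transformed and original event features."""
    return np.mean((transformed - original) ** 2)

def total_loss(transformed, original, target_hists, edges,
               lam_hist=1.0, lam_move=0.01):  # placeholder weights
    return (lam_hist * hist_loss(transformed, target_hists, edges)
            + lam_move * move_loss(transformed, original))

rng = np.random.default_rng(4)
edges = np.linspace(-6.0, 6.0, 41)
original = rng.normal(size=(5000, 2))
target = rng.normal(size=(5000, 2)) + np.array([1.0, 0.0])  # feature 0 mismodelled

# Only 1D information about the target enters the loss.
target_hists = []
for j in range(target.shape[1]):
    counts, _ = np.histogram(target[:, j], bins=edges)
    target_hists.append(counts / counts.sum())

identity = original                          # no correction
shifted = original + np.array([1.0, 0.0])    # candidate correction

loss_identity = total_loss(identity, original, target_hists, edges)
loss_shifted = total_loss(shifted, original, target_hists, edges)
```

The relative weights decide the trade-off: a large movement weight would favour leaving the simulation untouched, while a small one allows larger shifts to reach the 1D targets.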

Circularity Check

0 steps flagged

No significant circularity; minimal-deviation principle is an explicit design choice with independent empirical validation

full rationale

The paper's core proposal is a neural network that learns a transformation matching given 1D target distributions while staying close to the original simulation via an explicitly stated minimal-deviation principle. This principle is introduced as a modeling choice, not derived from or reduced to fitted parameters, self-referential equations, or prior self-citations. Controlled studies on pseudo-data provide direct empirical tests of both 1D matching and preservation of multi-dimensional correlations, making the validation independent of the method's internal construction. No steps in the provided abstract or described chain exhibit self-definition, fitted-input-as-prediction, or ansatz smuggling; the argument remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The method rests on the ad-hoc minimal-deviation principle as a training constraint and standard assumptions about neural network expressivity; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (2)

domain assumption: Neural networks can approximate transformations that match 1D distributions while minimizing deviation from input simulations. Implicit in proposing the NN as the core mechanism for learning the correction.
ad hoc to paper: Minimal changes to simulated events preserve the original multi-dimensional correlation structure. This is the load-bearing design principle stated in the abstract for maintaining consistency beyond 1D matches.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We propose a neural network-based method that operates under these constraints by learning a transformation of simulated events that reproduces the available 1D target distributions while remaining close to the original simulation. This minimal-deviation principle preserves the global correlation structure"

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: L = λ_hist L_hist + λ_der L_der + λ_move L_move + λ_corr L_corr

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
