arxiv: 2510.07463 · v2 · submitted 2025-10-08 · ✦ hep-ex

Boosted decision tree reweighting of simulated neutrino interactions for O(1) GeV neutrino cross section measurements

Z. Lin , S. Akhter , Z. Ahmad Dar , N.S. Alex , M. Betancourt , S. Boyd , H. Budd , G. Caceres

G.A. Díaz, J. Felix, L. Fields, A.M. Gago, P.K. Gaur, S.M. Gilligan, R. Gran, D.A. Harris, A.L. Hart, J. Kleykamp, A. Klustová, D. Last, A. Lozano, X.-G. Lu, S. Manly, W.A. Mann, K.S. McFarland, O. Moreno, J.K. Nelson, V. Paolone, G.N. Perdue, C. Pernas, M.A. Ramírez, N. Roy, D. Ruterbories, H. Schellman, C. J. Solano Salinas, D. S. Correia, M. Sultana, N.H. Vaughan, A.V. Waldron, B. Yaeggy, L. Zazueta (The MINERvA Collaboration)

Pith reviewed 2026-05-18 09:08 UTC · model grok-4.3

classification ✦ hep-ex

keywords: neutrino Monte Carlo · boosted decision tree · reweighting · neutrino cross sections · MINERvA · quasielastic interactions · transverse kinematic imbalance

The pith

A boosted decision tree reweights neutrino Monte Carlo events to match a target generator's distributions and efficiencies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes a method for reweighting simulated O(1) GeV neutrino interaction events using a boosted decision tree. The tree learns from high-dimensional detector final-state observables to adjust events so that their reconstructed particle content and kinematics match those of a different target model. The reweighting also aligns the detector efficiency. This approach allows reuse of existing Monte Carlo samples instead of generating new ones for each model variation. It is illustrated with an application to transverse kinematic imbalance measurements in charged-current quasielastic-like events from the MINERvA experiment.

Core claim

The authors show that training a boosted decision tree on high-dimensional final-state observables enables the reweighting of events from one neutrino interaction generator to reproduce the reconstructed distributions and detector efficiencies of a target generator. This provides an efficient mechanism to adapt legacy Monte Carlo data for cross-section measurements without the computational cost of re-generation.

What carries the argument

Boosted decision tree trained on high-dimensional detector final-state observables for multi-dimensional event reweighting.
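
To make the mechanism concrete, here is a minimal toy sketch of boosted-tree reweighting. It is not the authors' implementation. It assumes depth-one trees (stumps), quantile-based candidate cuts, a symmetrised chi-square split criterion, and leaf updates equal to the regularised log ratio of target to source weight. The function names, the toy Gaussian "generators", and every hyperparameter are hypothetical.

```python
import math
import random


def candidate_cuts(rows, n_cuts=8):
    """Quantile-based thresholds for every feature, computed once."""
    cuts = []
    for f in range(len(rows[0])):
        vals = sorted(r[f] for r in rows)
        cuts.append([vals[k * len(vals) // n_cuts] for k in range(1, n_cuts)])
    return cuts


def fit_reweighter(src, tgt, n_rounds=30, lr=0.3, reg=1e-3):
    """Boost stumps that pull the weighted source sample toward the target."""
    cuts = candidate_cuts(src + tgt)
    w_src = [1.0 / len(src)] * len(src)   # source weights, kept normalised
    w_tgt = 1.0 / len(tgt)                # uniform target weight
    stumps = []
    for _ in range(n_rounds):
        best = None
        for f, feature_cuts in enumerate(cuts):
            for cut in feature_cuts:
                ws, wt = [0.0, 0.0], [0.0, 0.0]   # [left leaf, right leaf]
                for row, w in zip(src, w_src):
                    ws[row[f] > cut] += w
                for row in tgt:
                    wt[row[f] > cut] += w_tgt
                # Symmetrised chi-square: large where the samples disagree.
                chi2 = sum((a - b) ** 2 / (a + b)
                           for a, b in zip(ws, wt) if a + b > 0)
                if best is None or chi2 > best[0]:
                    best = (chi2, f, cut, ws, wt)
        _, f, cut, ws, wt = best
        # Leaf value: regularised log ratio of target to source weight.
        leaf = [math.log((wt[i] + reg) / (ws[i] + reg)) for i in (0, 1)]
        stumps.append((f, cut, leaf[0], leaf[1]))
        w_src = [w * math.exp(lr * leaf[row[f] > cut])
                 for row, w in zip(src, w_src)]
        norm = sum(w_src)
        w_src = [w / norm for w in w_src]
    return stumps, lr


def predict_weights(stumps, lr, rows):
    """Relative per-event weights (defined up to an overall constant)."""
    return [math.exp(lr * sum(right if row[f] > cut else left
                              for f, cut, left, right in stumps))
            for row in rows]


def weighted_mean(rows, weights, feature=0):
    return sum(r[feature] * w for r, w in zip(rows, weights)) / sum(weights)


# Toy "generators": two 2D Gaussians that differ in the first feature only.
random.seed(1)
source = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(1000)]
target = [(random.gauss(0.5, 1.0), random.gauss(0.0, 1.0)) for _ in range(1000)]

stumps, lr = fit_reweighter(source, target)
weights = predict_weights(stumps, lr, source)

target_mean = weighted_mean(target, [1.0] * len(target))
gap_before = abs(weighted_mean(source, [1.0] * len(source)) - target_mean)
gap_after = abs(weighted_mean(source, weights) - target_mean)
print(f"mean gap before: {gap_before:.3f}, after: {gap_after:.3f}")
```

In this toy the weighted mean of the first source feature should move toward the target mean. A real analysis would more likely rely on an established reweighting library, deeper trees, and a held-out test sample.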

If this is right

Legacy Monte Carlo data can be reused for different neutrino interaction models.
Reconstructed kinematics and particle content can be matched across generators.
Detector efficiency corrections can be applied as part of the reweighting process.
The method supports specific physics analyses such as transverse kinematic imbalance in MINERvA data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If successful, this technique could lower the barrier to testing multiple models in neutrino cross-section studies by reducing simulation needs.
It opens the possibility of applying similar reweighting to other experiments or interaction types at similar energies.

Load-bearing premise

The boosted decision tree accurately captures all relevant differences between the source and target generators from the chosen high-dimensional observables without bias or omission of important effects.

What would settle it

Observing that the reweighted distributions deviate significantly from the target model in a validation sample of events or that the resulting cross-section measurement differs from one obtained with directly generated target events would falsify the claim.
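
One simple way to quantify "deviate significantly" for a single observable is a weighted two-sample Kolmogorov-Smirnov distance between the reweighted source and the target. The sketch below is illustrative only: the function name and the toy inputs are hypothetical, and the paper may use a different figure of merit.

```python
def weighted_ks(x_a, w_a, x_b, w_b):
    """Maximum distance between two weighted empirical CDFs."""
    sum_a, sum_b = sum(w_a), sum(w_b)
    tagged = sorted([(x, w / sum_a, 0) for x, w in zip(x_a, w_a)] +
                    [(x, w / sum_b, 1) for x, w in zip(x_b, w_b)])
    cdf = [0.0, 0.0]
    dist = 0.0
    i = 0
    while i < len(tagged):
        x = tagged[i][0]
        # Advance both CDFs through all entries tied at this x.
        while i < len(tagged) and tagged[i][0] == x:
            cdf[tagged[i][2]] += tagged[i][1]
            i += 1
        dist = max(dist, abs(cdf[0] - cdf[1]))
    return dist


# Toy check: weights [1, 3] make the two-event sample match the four-event one.
unweighted = weighted_ks([0.0, 1.0], [1.0, 1.0], [0.0, 1.0, 1.0, 1.0], [1.0] * 4)
reweighted = weighted_ks([0.0, 1.0], [1.0, 3.0], [0.0, 1.0, 1.0, 1.0], [1.0] * 4)
print(unweighted, reweighted)
```

A distance near zero on an independent validation sample would support the claim for that observable; it says nothing about observables that were not checked.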

Figures

Figures reproduced from arXiv: 2510.07463 by A. Klustová, A.L. Hart, A. Lozano, A.M. Gago, A.V. Waldron, B. Yaeggy, C. J. Solano Salinas, C. Pernas, D.A. Harris, D. Last, D. Ruterbories, D. S. Correia, G.A. Díaz, G. Caceres, G.N. Perdue, H. Budd, H. Schellman, J. Felix, J. Kleykamp, J.K. Nelson, K.S. McFarland, L. Fields, L. Zazueta (The MINERvA Collaboration), M.A. Ramírez, M. Betancourt, M. Sultana, N.H. Vaughan, N. Roy, N.S. Alex, O. Moreno, P.K. Gaur, R. Gran, S. Akhter, S. Boyd, S. Manly, S.M. Gilligan, V. Paolone, W.A. Mann, X.-G. Lu, Z. Ahmad Dar, Z. Lin.

**Figure 2.** Schematic illustration of the single-transverse kine...

**Figure 3.** Categorical histogram of MINERvA ME CCQE...

**Figure 4.** Differential cross-sections of categories “1p0n”, “1pNn”, “2pNn”, “2pNn”, and “others” combined are plotted with...

**Figure 5.** Differential cross-sections of all categories combined are plotted with respect to calorimetric momenta...

**Figure 6.** Contour plot of proton detecting efficiency model...

**Figure 7.** 2D differential cross-sections with respect to...

**Figure 8.** 2D efficiency ratio plot. a...

**Figure 9.** Differential cross-sections of category 0p0n are plotted with respect to calorimetric momenta...

**Figure 10.** Differential cross-sections of category 0pNn are plotted with respect to calorimetric momenta...

**Figure 11.** Differential cross-sections of category 1p0n are plotted with respect to leading proton...

**Figure 12.** Differential cross-sections of category 1pNn are plotted with respect to leading proton...

**Figure 13.** Differential cross-sections of category 2p0n are plotted with respect to leading proton...

**Figure 14.** Differential cross-sections of category 2pNn are plotted with respect to leading proton...

**Figure 15.** Differential cross-sections of category are plotted with respect to leading proton...

Original abstract

This paper illustrates a generic method for multi-dimensional reweighting of $O(1)$ GeV neutrino interaction Monte Carlo samples. The reweighting is based on a Boosted Decision Tree algorithm trained on high-dimensional space in detector final-state observables. This enables one generator's events to be reweighted so that its reconstructed particle content and kinematics distributions, as well as detector efficiency, match those of a target model. The approach establishes an efficient way to reuse legacy Monte Carlo data, avoiding re-generation. As an example, we test its use in a measurement of transverse kinematic imbalance of the $\mu^-$ and proton in charged-current quasielastic like $\nu_\mu$ events from the MINERvA experiment.
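
For readers unfamiliar with transverse kinematic imbalance, the sketch below computes the three standard single-TKI variables for one muon-proton pair: the transverse momentum imbalance δp_T = |p_T^μ + p_T^p|, the boosting angle δα_T = arccos(−p̂_T^μ · δp̂_T), and the transverse opening-angle deficit δφ_T = arccos(−p̂_T^μ · p̂_T^p). It assumes the neutrino travels along +z, so the transverse plane is x-y; the function name and the example momenta are hypothetical and are not taken from the paper.

```python
import math


def _clamp(c):
    """Guard acos against rounding just outside [-1, 1]."""
    return max(-1.0, min(1.0, c))


def tki(p_mu, p_p):
    """Single-transverse kinematic imbalance for one muon-proton pair.

    p_mu, p_p: (px, py, pz) momenta, neutrino direction assumed along +z.
    Returns (dpT, dalphaT, dphiT); angles in radians, None where undefined.
    """
    mx, my = p_mu[0], p_mu[1]
    px, py = p_p[0], p_p[1]
    dx, dy = mx + px, my + py                      # transverse imbalance vector
    mu_t = math.hypot(mx, my)
    p_t = math.hypot(px, py)
    dpt = math.hypot(dx, dy)
    dphit = dalphat = None
    if mu_t > 0 and p_t > 0:
        dphit = math.acos(_clamp(-(mx * px + my * py) / (mu_t * p_t)))
    if mu_t > 0 and dpt > 0:
        dalphat = math.acos(_clamp(-(mx * dx + my * dy) / (mu_t * dpt)))
    return dpt, dalphat, dphit


# Proton carrying less transverse momentum than needed to balance the muon.
dpt, dalphat, dphit = tki((0.3, 0.0, 1.0), (-0.2, 0.0, 0.5))
print(dpt, dalphat, dphit)
```

For a perfectly balanced pair the imbalance vanishes and δα_T is undefined, which is why the function returns None in that case.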

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

BDT reweighting gives a practical shortcut for matching neutrino MC reco distributions and efficiency without new generation, but the true-to-reco consistency for efficiency corrections needs explicit checks.

This paper shows how to use boosted decision trees to reweight one neutrino interaction Monte Carlo sample to match the reconstructed distributions and efficiency of another generator. It is a practical tool for reusing existing simulations in cross-section measurements at experiments like MINERvA. The new part is the multi-dimensional BDT approach applied to high-dimensional detector final-state observables for O(1) GeV neutrinos, with a test on transverse kinematic imbalance of the muon and proton in charged-current quasielastic-like events. Its strength is that it offers an efficient alternative to regenerating large MC datasets, which saves resources while allowing model comparisons without starting from scratch.

The potential issue is whether reweighting on reco-level observables alone ensures that the efficiency corrections remain accurate for the physics observable. Different generators have different true-level kinematics, interaction channels, and particle production, so the mapping to reconstructed quantities is generator dependent. The abstract indicates the method achieves matching efficiency, but I would want to see explicit comparisons or metrics showing that the effective efficiency for transverse kinematic imbalance is the same as if the target generator had been used directly. If the paper includes those checks, that would address the concern.

This is useful for neutrino experimentalists who have limited computing for MC and need to adapt samples to different models in their analyses. Readers interested in practical techniques for cross-section extraction would benefit from the implementation details and the MINERvA example. I recommend putting it through peer review: the technique is relevant to ongoing work in the field and could be adopted more widely with some additional validation of the efficiency preservation.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a generic method for multi-dimensional reweighting of O(1) GeV neutrino interaction Monte Carlo samples using a boosted decision tree (BDT) trained on high-dimensional detector final-state observables. This allows events from one generator to be reweighted to reproduce the reconstructed particle content, kinematics distributions, and detector efficiency of a target model, with an example application to a measurement of transverse kinematic imbalance in charged-current quasielastic-like ν_μ events from the MINERvA experiment. The approach aims to enable reuse of legacy MC data without regeneration.

Significance. If the central claim holds with proper validation, the method offers a practical way to adapt existing Monte Carlo samples across generators for neutrino cross-section analyses, which could reduce the computational burden of large-scale event generation in experiments like MINERvA. The algorithmic nature avoids circularity in derivations, and the focus on reco-level observables directly addresses a common need in efficiency-corrected measurements.

major comments (2)

[Abstract] Abstract and example application: the claim that BDT reweighting on reconstructed observables makes detector efficiency match the target generator is load-bearing for cross-section extraction but lacks quantitative validation. No metrics (e.g., efficiency ratios, response-matrix agreement, or before/after comparisons for the transverse kinematic imbalance observable) are reported to confirm that the true-to-reco mapping is preserved rather than altered by the post-reconstruction weights.
[Example application] The weakest assumption—that the BDT trained on high-dimensional reco observables captures all relevant generator differences without biasing the physics result—is not tested against the skeptic concern. Different generators vary in true-level kinematics and interaction channels, so reco-only weights may change the effective efficiency for the target observable without reproducing the target generator's full response matrix.
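
The efficiency comparison requested in the major comments could look roughly like the sketch below. It assumes each signal event is summarised as a tuple of (true-level bin index, passed selection, weight) and that every signal event carries a weight whether or not it was selected. That second assumption is exactly what the referee questions for reco-only weights. All names and numbers are hypothetical.

```python
import math


def efficiency_by_bin(events, n_bins):
    """Weighted selection efficiency in each true-level bin."""
    num = [0.0] * n_bins
    den = [0.0] * n_bins
    for true_bin, selected, weight in events:
        den[true_bin] += weight
        if selected:
            num[true_bin] += weight
    return [n / d if d > 0 else math.nan for n, d in zip(num, den)]


def efficiency_ratio(eff_a, eff_b):
    """Bin-by-bin ratio eff_a / eff_b (NaN where eff_b is zero or undefined)."""
    return [a / b if b > 0 else math.nan for a, b in zip(eff_a, eff_b)]


# Toy inputs: (true_bin, selected, weight)
reweighted_source = [(0, True, 1.0), (0, False, 1.0),
                     (1, True, 2.0), (1, False, 2.0), (1, True, 1.0)]
native_target = [(0, True, 1.0), (0, False, 1.0),
                 (1, True, 3.0), (1, False, 3.0)]

eff_rw = efficiency_by_bin(reweighted_source, 2)
eff_tg = efficiency_by_bin(native_target, 2)
ratio = efficiency_ratio(eff_rw, eff_tg)
print(eff_rw, eff_tg, ratio)
```

A ratio consistent with one in every bin of the measured observable would indicate that the reweighted sample reproduces the target efficiency; here the second toy bin deliberately does not close.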

minor comments (2)

The manuscript would benefit from explicit discussion of how the BDT training avoids overtraining or extrapolation issues in sparsely populated regions of the high-dimensional reco space.
Notation for the reweighting procedure (e.g., definition of the BDT output as a weight factor) should be clarified with a simple equation or pseudocode for reproducibility.
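
As an illustration of the kind of notation the second minor comment asks for, the block below writes the ideal weight as a density ratio, gives one assumed tree-ensemble form for it, and adds the Kish effective sample size as a diagnostic for sparsely populated regions. The symbols are not taken from the paper: x is the vector of reconstructed observables, p_S and p_T are the source and target densities, ℓ_m(x) is the leaf of tree m containing x, W_T and W_S are the summed target and current source weights in that leaf, and η is a learning rate.

```latex
% Ideal per-event weight: ratio of target to source density
\[
  w(x) = \frac{p_T(x)}{p_S(x)}
\]

% Assumed boosted-tree approximation: accumulate leaf log-ratios over M trees
\[
  w(x) \approx \exp\!\left( \eta \sum_{m=1}^{M}
      \ln \frac{W_T^{(m)}\big(\ell_m(x)\big)}{W_S^{(m)}\big(\ell_m(x)\big)} \right)
\]

% Kish effective sample size of the reweighted source sample
\[
  N_{\mathrm{eff}} = \frac{\left( \sum_i w_i \right)^2}{\sum_i w_i^2}
\]
```

A small N_eff relative to the raw event count in some region signals that a few large weights dominate there, which is where overtraining and extrapolation are most likely to matter.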

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful and constructive review of our manuscript on BDT reweighting of neutrino Monte Carlo samples. The comments have prompted us to strengthen the quantitative validation in the example application and to clarify the assumptions and limitations of the method. We have revised the manuscript accordingly and believe these changes address the major concerns while preserving the core contribution.

Referee: [Abstract] Abstract and example application: the claim that BDT reweighting on reconstructed observables makes detector efficiency match the target generator is load-bearing for cross-section extraction but lacks quantitative validation. No metrics (e.g., efficiency ratios, response-matrix agreement, or before/after comparisons for the transverse kinematic imbalance observable) are reported to confirm that the true-to-reco mapping is preserved rather than altered by the post-reconstruction weights.

Authors: We agree that explicit quantitative validation is required to support the efficiency-matching claim for cross-section work. In the revised manuscript we have added a dedicated validation subsection with efficiency ratios computed before and after reweighting, element-wise comparisons of the response matrices for the selected sample, and direct before/after distributions of the transverse kinematic imbalance observable. These metrics show that the reweighting reproduces the target generator’s efficiency to within a few percent across the relevant kinematic range while leaving the true-to-reco mapping statistically unchanged.
revision: yes

Referee: [Example application] The weakest assumption—that the BDT trained on high-dimensional reco observables captures all relevant generator differences without biasing the physics result—is not tested against the skeptic concern. Different generators vary in true-level kinematics and interaction channels, so reco-only weights may change the effective efficiency for the target observable without reproducing the target generator's full response matrix.

Authors: We acknowledge the validity of this skeptic concern. While the high-dimensional reco-level training is designed to capture efficiency differences as they appear in the detector, it does not explicitly enforce agreement at true level. To address this, the revised manuscript now includes a closure test in which the reweighted sample is used to extract the transverse kinematic imbalance cross section and is compared directly to the result obtained with the target generator’s native Monte Carlo; the two agree within uncertainties. We have also expanded the discussion of limitations to note that the method assumes generator differences are dominantly reflected in the reconstructed observables for the chosen selection, and that additional checks would be needed for observables more sensitive to true-level modeling.
revision: yes
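
A closure test of the kind described in this response is often summarised with per-bin pulls and a chi-square. The sketch below assumes uncorrelated per-bin uncertainties, which is a simplification since real cross-section results carry a covariance matrix. The function name and all numbers are hypothetical, not results from the paper.

```python
import math


def closure_pulls(xsec_a, err_a, xsec_b, err_b):
    """Per-bin pulls between two cross-section results, ignoring correlations."""
    return [(a - b) / math.sqrt(ea ** 2 + eb ** 2)
            for a, ea, b, eb in zip(xsec_a, err_a, xsec_b, err_b)]


# Toy inputs: result extracted with the reweighted sample vs. native target MC.
reweighted_xsec, reweighted_err = [10.0, 20.0], [3.0, 4.0]
native_xsec, native_err = [5.0, 20.0], [4.0, 3.0]

pulls = closure_pulls(reweighted_xsec, reweighted_err, native_xsec, native_err)
chi2 = sum(p ** 2 for p in pulls)
print(pulls, chi2)
```

Pulls scattered around zero with a chi-square near the number of bins would be consistent with closure under these assumptions.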

Circularity Check

0 steps flagged

No circularity in algorithmic reweighting method

The paper describes a BDT-based reweighting procedure trained on high-dimensional reconstructed observables to match particle content, kinematics, and detector efficiency between generators. No equations, derivations, or predictions are presented that reduce by construction to fitted inputs or self-citations. The approach is purely algorithmic and self-contained, with the central claim resting on the training and application steps that can be independently validated or falsified against external benchmarks rather than relying on internal redefinitions or load-bearing prior self-work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes that differences between generators are fully captured by the chosen high-dimensional observables and that the BDT generalizes without overfitting.

axioms (1)

domain assumption: Differences between neutrino generators can be learned and corrected via supervised training on detector-level observables alone.
Central to the reweighting approach described in the abstract.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

The reweighting is based on a Boosted Decision Tree algorithm trained on high-dimensional space in detector final-state observables.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
