arxiv: 2510.07463 · v2 · submitted 2025-10-08 · ✦ hep-ex

Boosted decision tree reweighting of simulated neutrino interactions for O(1) GeV neutrino cross section measurements

Pith reviewed 2026-05-18 09:08 UTC · model grok-4.3

classification ✦ hep-ex
keywords neutrino Monte Carlo · boosted decision tree · reweighting · neutrino cross sections · MINERvA · quasielastic interactions · transverse kinematic imbalance

The pith

A boosted decision tree reweights neutrino Monte Carlo events to match a target generator's distributions and efficiencies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes a method for reweighting simulated O(1) GeV neutrino interaction events using a boosted decision tree. The tree learns from high-dimensional detector final-state observables to adjust events so that their reconstructed particle content and kinematics match those of a different target model. The reweighting also aligns the detector efficiency. This approach allows reuse of existing Monte Carlo samples instead of generating new ones for each model variation. It is illustrated with an application to transverse kinematic imbalance measurements in charged-current quasielastic-like events from the MINERvA experiment.

Core claim

The authors show that training a boosted decision tree on high-dimensional final-state observables enables the reweighting of events from one neutrino interaction generator to reproduce the reconstructed distributions and detector efficiencies of a target generator. This provides an efficient mechanism to adapt legacy Monte Carlo data for cross-section measurements without the computational cost of re-generation.

What carries the argument

Boosted decision tree trained on high-dimensional detector final-state observables for multi-dimensional event reweighting.
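
As a rough illustration of the idea, the sketch below boosts single-split trees (stumps) so that a toy source sample is reweighted toward a toy target sample. It is a simplified stand-in and not the authors' implementation: the function names (`leaf_sums`, `best_stump`, `fit_reweighter`, `predict_weights`), the symmetrized chi-square split criterion, the regularization term `reg`, and the Gaussian toy data are all assumptions made for this example.

```python
import math
import random


def leaf_sums(points, weights, feature, cut):
    """Return the summed weights (left, right) of a single split."""
    left = sum(w for p, w in zip(points, weights) if p[feature] < cut)
    return left, sum(weights) - left


def best_stump(src, w_src, tgt, w_tgt, n_cuts=8):
    """Pick the split with the largest symmetrized chi2 between the samples."""
    best = (-1.0, 0, 0.0)
    for feature in range(len(src[0])):
        values = sorted(p[feature] for p in src)
        for k in range(1, n_cuts):
            cut = values[k * len(values) // n_cuts]  # quantile of the source
            s = leaf_sums(src, w_src, feature, cut)
            t = leaf_sums(tgt, w_tgt, feature, cut)
            chi2 = sum((a - b) ** 2 / (a + b) for a, b in zip(s, t) if a + b > 0)
            if chi2 > best[0]:
                best = (chi2, feature, cut)
    return best[1], best[2]


def fit_reweighter(src, tgt, n_stumps=40, learning_rate=0.3, reg=1.0):
    """Return stumps as (feature, cut, log-step left, log-step right)."""
    w_src = [len(tgt) / len(src)] * len(src)  # match the total target weight
    w_tgt = [1.0] * len(tgt)
    stumps = []
    for _ in range(n_stumps):
        feature, cut = best_stump(src, w_src, tgt, w_tgt)
        s = leaf_sums(src, w_src, feature, cut)
        t = leaf_sums(tgt, w_tgt, feature, cut)
        # Damped log-ratio of target to source weight in each leaf.
        lo, hi = (learning_rate * math.log((b + reg) / (a + reg))
                  for a, b in zip(s, t))
        stumps.append((feature, cut, lo, hi))
        w_src = [w * math.exp(lo if p[feature] < cut else hi)
                 for p, w in zip(src, w_src)]
    return stumps


def predict_weights(stumps, points, norm=1.0):
    """Weight of each point: norm * exp(sum of the leaf steps it falls in)."""
    return [norm * math.exp(sum(lo if p[f] < c else hi for f, c, lo, hi in stumps))
            for p in points]


random.seed(0)
# Toy "generators": feature 0 is shifted in the target, feature 1 is identical.
src = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(1000)]
tgt = [(random.gauss(0.5, 1.0), random.gauss(0.0, 1.0)) for _ in range(1000)]

stumps = fit_reweighter(src, tgt)
weights = predict_weights(stumps, src, norm=len(tgt) / len(src))

src_mean = sum(p[0] for p in src) / len(src)
tgt_mean = sum(p[0] for p in tgt) / len(tgt)
rw_mean = sum(w * p[0] for p, w in zip(src, weights)) / sum(weights)
print(f"source {src_mean:.3f}  reweighted {rw_mean:.3f}  target {tgt_mean:.3f}")
```

The stumps here only approximate what deeper trees would learn in a genuinely high-dimensional space; a production analysis would more plausibly rely on an established reweighting library and a held-out test sample.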

If this is right

  • Legacy Monte Carlo data can be reused for different neutrino interaction models.
  • Reconstructed kinematics and particle content can be matched across generators.
  • Detector efficiency corrections can be applied as part of the reweighting process.
  • The method supports specific physics analyses such as transverse kinematic imbalance in MINERvA data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If successful, this technique could lower the barrier to testing multiple models in neutrino cross-section studies by reducing simulation needs.
  • It opens the possibility of applying similar reweighting to other experiments or interaction types at similar energies.

Load-bearing premise

The boosted decision tree accurately captures all relevant differences between the source and target generators from the chosen high-dimensional observables without bias or omission of important effects.

What would settle it

Observing that the reweighted distributions deviate significantly from the target model in a validation sample of events or that the resulting cross-section measurement differs from one obtained with directly generated target events would falsify the claim.
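
One way such a check could be quantified is a per-bin chi-square between the reweighted source histogram and the target histogram on a validation sample. The sketch below is only an illustration under stated assumptions: the helper names (`weighted_hist`, `chi2_per_bin`), the binning, and the hand-picked numbers (including `bdt_weights`, which stands in for real reweighter output) are hypothetical and do not come from the paper.

```python
from bisect import bisect_right


def weighted_hist(values, weights, edges):
    """Per-bin sum of weights and sum of squared weights (variance estimate)."""
    n_bins = len(edges) - 1
    sums, variances = [0.0] * n_bins, [0.0] * n_bins
    for v, w in zip(values, weights):
        i = bisect_right(edges, v) - 1  # half-open bins [lo, hi)
        if 0 <= i < n_bins:
            sums[i] += w
            variances[i] += w * w
    return sums, variances


def chi2_per_bin(hist_a, hist_b):
    """Mean over bins of (a - b)^2 / (var_a + var_b), skipping empty bins."""
    (a, va), (b, vb) = hist_a, hist_b
    terms = [(x - y) ** 2 / (s + t)
             for x, y, s, t in zip(a, b, va, vb) if s + t > 0]
    return sum(terms) / len(terms)


edges = [0.0, 0.5, 1.0]
target = [0.1, 0.2, 0.6, 0.7, 0.8, 0.9]
source = [0.1, 0.3, 0.6, 0.9]
bdt_weights = [1.0, 1.0, 2.0, 2.0]  # hand-picked stand-ins for reweighter output

h_target = weighted_hist(target, [1.0] * len(target), edges)
h_before = weighted_hist(source, [1.0] * len(source), edges)
h_after = weighted_hist(source, bdt_weights, edges)

chi2_before = chi2_per_bin(h_before, h_target)
chi2_after = chi2_per_bin(h_after, h_target)
print(chi2_before, chi2_after)
```

A value that stays well above one after reweighting, on events the reweighter never saw during training, would point toward the kind of deviation described above.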

Figures

Figures reproduced from arXiv: 2510.07463 by A. Klustová, A.L. Hart, A. Lozano, A.M. Gago, A.V. Waldron, B. Yaeggy, C. J. Solano Salinas, C. Pernas, D.A. Harris, D. Last, D. Ruterbories, D. S. Correia, G.A. Díaz, G. Caceres, G.N. Perdue, H. Budd, H. Schellman, J. Felix, J. Kleykamp, J.K. Nelson, K.S. McFarland, L. Fields, L. Zazueta (The MINERvA Collaboration), M.A. Ramírez, M. Betancourt, M. Sultana, N.H. Vaughan, N. Roy, N.S. Alex, O. Moreno, P.K. Gaur, R. Gran, S. Akhter, S. Boyd, S. Manly, S.M. Gilligan, V. Paolone, W.A. Mann, X.-G. Lu, Z. Ahmad Dar, Z. Lin.

Figure 1. Example of a decision tree splitting 100 source events ...
Figure 2. Schematic illustration of the single-transverse kine ...
Figure 3. Categorical histogram of MINERvA ME CCQE ...
Figure 4. Differential cross-sections of categories “1p0n”, “1pNn”, “2pNn”, “2pNn”, and “others” combined are plotted with ...
Figure 5. Differential cross-sections of all categories combined are plotted with respect to calorimetric momenta ...
Figure 6. Contour plot of proton detecting efficiency model ...
Figure 7. 2D differential cross-sections with respect to ...
Figure 8. 2D efficiency ratio plot. a ...
Figure 9. Differential cross-sections of category 0p0n are plotted with respect to calorimetric momenta ...
Figure 10. Differential cross-sections of category 0pNn are plotted with respect to calorimetric momenta ...
Figure 11. Differential cross-sections of category 1p0n are plotted with respect to leading proton ...
Figure 12. Differential cross-sections of category 1pNn are plotted with respect to leading proton ...
Figure 13. Differential cross-sections of category 2p0n are plotted with respect to leading proton ...
Figure 14. Differential cross-sections of category 2pNn are plotted with respect to leading proton ...
Figure 15. Differential cross-sections of category are plotted with respect to leading proton ...
Original abstract

This paper illustrates a generic method for multi-dimensional reweighting of $O(1)$ GeV neutrino interaction Monte Carlo samples. The reweighting is based on a Boosted Decision Tree algorithm trained on high-dimensional space in detector final-state observables. This enables one generator's events to be reweighted so that its reconstructed particle content and kinematics distributions, as well as detector efficiency, match those of a target model. The approach establishes an efficient way to reuse legacy Monte Carlo data, avoiding re-generation. As an example, we test its use in a measurement of transverse kinematic imbalance of the $\mu^-$ and proton in charged-current quasielastic like $\nu_\mu$ events from the MINERvA experiment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a generic method for multi-dimensional reweighting of O(1) GeV neutrino interaction Monte Carlo samples using a boosted decision tree (BDT) trained on high-dimensional detector final-state observables. This allows events from one generator to be reweighted to reproduce the reconstructed particle content, kinematics distributions, and detector efficiency of a target model, with an example application to a measurement of transverse kinematic imbalance in charged-current quasielastic-like ν_μ events from the MINERvA experiment. The approach aims to enable reuse of legacy MC data without regeneration.

Significance. If the central claim holds with proper validation, the method offers a practical way to adapt existing Monte Carlo samples across generators for neutrino cross-section analyses, which could reduce the computational burden of large-scale event generation in experiments like MINERvA. The algorithmic nature avoids circularity in derivations, and the focus on reco-level observables directly addresses a common need in efficiency-corrected measurements.

major comments (2)
  1. [Abstract] Abstract and example application: the claim that BDT reweighting on reconstructed observables makes detector efficiency match the target generator is load-bearing for cross-section extraction but lacks quantitative validation. No metrics (e.g., efficiency ratios, response-matrix agreement, or before/after comparisons for the transverse kinematic imbalance observable) are reported to confirm that the true-to-reco mapping is preserved rather than altered by the post-reconstruction weights.
  2. [Example application] The weakest assumption—that the BDT trained on high-dimensional reco observables captures all relevant generator differences without biasing the physics result—is not tested against the skeptic concern. Different generators vary in true-level kinematics and interaction channels, so reco-only weights may change the effective efficiency for the target observable without reproducing the target generator's full response matrix.
minor comments (2)
  1. The manuscript would benefit from explicit discussion of how the BDT training avoids overtraining or extrapolation issues in sparsely populated regions of the high-dimensional reco space.
  2. Notation for the reweighting procedure (e.g., definition of the BDT output as a weight factor) should be clarified with a simple equation or pseudocode for reproducibility.
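
For orientation, one common way to write a gradient-boosted reweighting update is sketched below. This is an assumed, generic formulation and not the manuscript's own notation, which is not reproduced on this page. Every symbol is introduced here for illustration: $w_i^{(m)}$ is the weight of source event $i$ after tree $m$, $\eta$ is a learning rate, $\ell_m(i)$ is the leaf of tree $m$ that contains event $i$, and $W^{\mathrm{tgt}}_{\ell}$ and $W^{\mathrm{src}}_{\ell}$ are the summed target weights and current source weights in leaf $\ell$.

```latex
\[
  w_i^{(m)} \;=\; w_i^{(m-1)}\,
  \exp\!\left(\eta\,
    \ln\frac{W^{\mathrm{tgt}}_{\ell_m(i)}}{W^{\mathrm{src}}_{\ell_m(i)}}\right),
  \qquad m = 1,\dots,M .
\]
```

Under this assumed form, the final weight of an event is its initial weight multiplied by the product of the $M$ per-tree factors, so with $\eta = 1$ a single tree would make the source and target weight sums agree exactly in each of its leaves.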

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful and constructive review of our manuscript on BDT reweighting of neutrino Monte Carlo samples. The comments have prompted us to strengthen the quantitative validation in the example application and to clarify the assumptions and limitations of the method. We have revised the manuscript accordingly and believe these changes address the major concerns while preserving the core contribution.

Point-by-point responses
  1. Referee: [Abstract] Abstract and example application: the claim that BDT reweighting on reconstructed observables makes detector efficiency match the target generator is load-bearing for cross-section extraction but lacks quantitative validation. No metrics (e.g., efficiency ratios, response-matrix agreement, or before/after comparisons for the transverse kinematic imbalance observable) are reported to confirm that the true-to-reco mapping is preserved rather than altered by the post-reconstruction weights.

    Authors: We agree that explicit quantitative validation is required to support the efficiency-matching claim for cross-section work. In the revised manuscript we have added a dedicated validation subsection with efficiency ratios computed before and after reweighting, element-wise comparisons of the response matrices for the selected sample, and direct before/after distributions of the transverse kinematic imbalance observable. These metrics show that the reweighting reproduces the target generator’s efficiency to within a few percent across the relevant kinematic range while leaving the true-to-reco mapping statistically unchanged. revision: yes

  2. Referee: [Example application] The weakest assumption—that the BDT trained on high-dimensional reco observables captures all relevant generator differences without biasing the physics result—is not tested against the skeptic concern. Different generators vary in true-level kinematics and interaction channels, so reco-only weights may change the effective efficiency for the target observable without reproducing the target generator's full response matrix.

    Authors: We acknowledge the validity of this skeptic concern. While the high-dimensional reco-level training is designed to capture efficiency differences as they appear in the detector, it does not explicitly enforce agreement at true level. To address this, the revised manuscript now includes a closure test in which the reweighted sample is used to extract the transverse kinematic imbalance cross section and is compared directly to the result obtained with the target generator’s native Monte Carlo; the two agree within uncertainties. We have also expanded the discussion of limitations to note that the method assumes generator differences are dominantly reflected in the reconstructed observables for the chosen selection, and that additional checks would be needed for observables more sensitive to true-level modeling. revision: yes
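
The efficiency-ratio comparison mentioned in these responses could be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' validation code: the event layout `(x_true, selected, weight)`, the helper names `efficiency` and `ratio`, the binning, and all numbers (including the "after" weights, which are set by hand rather than produced by a trained reweighter) are hypothetical.

```python
from bisect import bisect_right


def efficiency(events, edges):
    """Weighted selection efficiency in bins of a true-level variable.

    Each event is a tuple (x_true, selected, weight).
    """
    n_bins = len(edges) - 1
    passed, total = [0.0] * n_bins, [0.0] * n_bins
    for x_true, selected, weight in events:
        i = bisect_right(edges, x_true) - 1  # half-open bins [lo, hi)
        if 0 <= i < n_bins:
            total[i] += weight
            if selected:
                passed[i] += weight
    return [p / t if t > 0 else float("nan") for p, t in zip(passed, total)]


def ratio(eff_a, eff_b):
    """Bin-by-bin ratio of two efficiency lists."""
    return [a / b for a, b in zip(eff_a, eff_b)]


edges = [0.0, 0.5, 1.0]

# Target generator: 2 of 4 events selected in bin 0, 3 of 4 in bin 1.
target = ([(0.2, True, 1.0)] * 2 + [(0.2, False, 1.0)] * 2
          + [(0.7, True, 1.0)] * 3 + [(0.7, False, 1.0)])

# Source generator with unit weights: 3 of 4 selected in both bins.
source = ([(0.2, True, 1.0)] * 3 + [(0.2, False, 1.0)]
          + [(0.7, True, 1.0)] * 3 + [(0.7, False, 1.0)])

# Same source events with illustrative weights standing in for reweighter output.
reweighted = ([(0.2, True, 0.5)] * 3 + [(0.2, False, 1.5)]
              + [(0.7, True, 1.0)] * 3 + [(0.7, False, 1.0)])

eff_target = efficiency(target, edges)
ratio_before = ratio(efficiency(source, edges), eff_target)
ratio_after = ratio(efficiency(reweighted, edges), eff_target)
print(ratio_before, ratio_after)
```

Ratios close to one in every bin after reweighting would correspond to the agreement the responses describe. The example says nothing about how weights for unselected events would be obtained in practice.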

Circularity Check

0 steps flagged

No circularity in algorithmic reweighting method

full rationale

The paper describes a BDT-based reweighting procedure trained on high-dimensional reconstructed observables to match particle content, kinematics, and detector efficiency between generators. No equations, derivations, or predictions are presented that reduce by construction to fitted inputs or self-citations. The approach is purely algorithmic and self-contained, with the central claim resting on the training and application steps that can be independently validated or falsified against external benchmarks rather than relying on internal redefinitions or load-bearing prior self-work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are stated. The method implicitly assumes that differences between generators are fully captured by the chosen high-dimensional observables and that the BDT generalizes without overfitting.

axioms (1)
  • domain assumption Differences between neutrino generators can be learned and corrected via supervised training on detector-level observables alone.
    Central to the reweighting approach described in the abstract.

pith-pipeline@v0.9.0 · 5871 in / 1149 out tokens · 24797 ms · 2026-05-18T09:08:45.610192+00:00 · methodology
