Cross-Domain Transfer with Particle Physics Foundation Models: From Jets to Neutrino Interactions

Benjamin Nachman; Callum Wilkinson; Gregor Krzmanc; Vinicius Mikuni

arXiv: 2604.12364 · v1 · submitted 2026-04-14 · ✦ hep-ex · cs.LG · hep-ph · physics.data-an

Pith reviewed 2026-05-10 14:42 UTC · model grok-4.3

classification ✦ hep-ex · cs.LG · hep-ph · physics.data-an

keywords foundation · models · physics · particle · model · pre-trained · achieving

The pith

Pre-trained OmniLearned foundation model transfers from high-Q² collisions to MINERvA neutrino data, outperforming scratch-trained models on regression and classification at fixed compute or steps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Researchers took a large AI model already trained on many kinds of high-energy particle collisions and tested whether it could be reused on data from a neutrino experiment that uses a different detector and much lower energies. They processed real and simulated neutrino events and asked the model to predict the available energy in each event or to classify whether the event contained a charged pion. The pre-trained model did better than a similar-sized model that started with random weights and was trained only on the neutrino data. The improvement held both when the total computing time was fixed and when the number of training steps was fixed. This indicates that the patterns the model learned from the first set of collisions are useful for understanding neutrino interactions even though the physics and hardware are quite different.

Core claim

Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps.
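
The two comparison axes in this claim (matched compute and matched training steps) can be made concrete with a minimal sketch. Everything here is an assumption for illustration: the `Checkpoint` record, the `best_within` and `gain` helpers, and the choice of validation loss as the metric are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple, Optional, Sequence


class Checkpoint(NamedTuple):
    """One logged evaluation point of a training run (hypothetical record)."""
    step: int        # optimizer steps taken so far
    compute: float   # cumulative fine-tuning compute, e.g. GPU-hours (unit assumed)
    val_loss: float  # validation loss, lower is better


def best_within(run: Sequence[Checkpoint], axis: str, budget: float) -> Optional[float]:
    """Lowest validation loss among checkpoints whose cost on `axis` is <= budget."""
    eligible = [c.val_loss for c in run if getattr(c, axis) <= budget]
    return min(eligible) if eligible else None


def gain(pretrained: Sequence[Checkpoint],
         scratch: Sequence[Checkpoint],
         axis: str,
         budget: float) -> Optional[float]:
    """Scratch loss minus fine-tuned loss at a matched budget.

    A positive value favours the pre-trained model. `axis` is "step" for the
    fixed-steps comparison and "compute" for the fixed-compute comparison.
    """
    a = best_within(pretrained, axis, budget)
    b = best_within(scratch, axis, budget)
    if a is None or b is None:
        return None  # at least one run has no checkpoint inside the budget
    return b - a
```

Calling `gain(pretrained_run, scratch_run, "step", budget)` and `gain(pretrained_run, scratch_run, "compute", budget)` on the same pair of runs gives the two comparisons separately. Whether the pre-training cost itself is counted in `compute` is a design choice this sketch leaves open; the abstract does not say how the authors handle it.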

Load-bearing premise

The inductive biases acquired during pre-training on diverse high-Q² simulated and real pp and ep collisions generalize to few-GeV fixed-target neutrino-nucleus scattering events processed from MINERvA data.

Original abstract

Future AI-based studies in particle physics will likely start from a foundation model to accelerate training and enhance sensitivity. As a step towards a general-purpose foundation model for particle physics, we investigate whether the OmniLearned foundation model pre-trained on diverse high-$Q^2$ simulated and real $pp$ and $ep$ collisions can be effectively transferred to a few-GeV fixed-target neutrino experiment. We process MINERvA neutrino--nucleus scattering events and evaluate pre-trained models on two types of tasks: regression of available energy and binary classification of charged-current pion final states ($\mathrm{CC1\pi^{\pm}}$, $\mathrm{CCN\pi^{\pm}}$, and $\mathrm{CC1\pi^{0}}$). Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps. These results suggest that particle-level foundation models acquire inductive biases that generalize across large differences in energy scale, detector technology, and underlying physics processes, pointing toward a paradigm of detector-agnostic inference in particle physics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

A pre-trained collider model beats scratch training on MINERvA tasks, but the abstract gives no numbers or controls to judge how real the transfer is.

The main point is that the authors take the OmniLearned model pre-trained on high-Q² pp and ep collisions and fine-tune it on MINERvA neutrino-nucleus events, where it outperforms similarly sized models trained from scratch on energy regression and charged-current pion classification. The gains appear both at fixed compute budget and at fixed training steps. This is the first reported case of that specific cross-domain transfer in the cited literature.

The work does well by running a concrete test of whether collider-derived inductive biases survive the jump to few-GeV fixed-target neutrino scattering and different detector technology. It supplies a practical example that foundation-model pre-training can accelerate downstream tasks in particle physics.

The soft spots are straightforward. The abstract states consistent outperformance but reports no numerical values, error bars, statistical tests, baseline architectures, data splits, or input preprocessing details. Without those it is impossible to tell whether the improvement is large enough to matter or whether it partly reflects differences in how the MINERvA events were encoded. The roughly 1000× energy gap and the shift from QCD to weak nuclear processes make that check important; an ablation on normalization or an embedding comparison would have strengthened the claim.

The paper is aimed at experimentalists and ML developers working on foundation models or neutrino analyses. A reader already thinking about transfer learning will find the direction useful even if the evidence remains preliminary. The thinking is coherent and the empirical direction is worth checking. I would send it to peer review so the full methods, numbers, and any additional controls can be examined.

Referee Report

2 major / 1 minor

Summary. The manuscript investigates cross-domain transfer of the OmniLearned foundation model, pre-trained on diverse high-Q² simulated and real pp and ep collisions, to few-GeV fixed-target neutrino-nucleus scattering events from the MINERvA experiment. Events are processed for two downstream tasks: regression of available energy and binary classification of charged-current pion final states (CC1π±, CCNπ±, and CC1π⁰). The central empirical claim is that pre-trained models consistently outperform similarly sized models trained from scratch, both at equivalent compute budgets and at the same number of training steps, suggesting that inductive biases acquired during pre-training generalize across large differences in energy scale, detector technology, and underlying physics processes.

Significance. If the transfer results are shown to be robust, this would constitute concrete evidence that particle-physics foundation models can acquire reusable representations that bridge QCD-dominated high-energy collisions and weak-interaction neutrino-nucleus scattering. Such a result would support the broader program of detector-agnostic foundation models, potentially reducing training costs and improving sensitivity for neutrino experiments and other low-energy facilities. The work is timely given growing interest in foundation models within the field.

major comments (2)

[Abstract / Results] Abstract and Results section: The headline claim of consistent outperformance is stated without any quantitative metrics, error bars, statistical significance tests, data-split details, or baseline specifications. This absence leaves the magnitude and reliability of the reported gains impossible to assess from the provided information.
[§3] §3 (Data Processing and Input Representation): The manuscript does not describe how MINERvA events are normalized, scaled, or encoded into the OmniLearned input format, nor does it provide ablations (e.g., energy rescaling or comparison of learned embeddings) that would rule out confounds arising from the ~1000× energy-scale mismatch and differing detector responses. Without such checks, the observed transfer gains could be artifacts of preprocessing rather than transferable inductive bias.
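
One way the error bars requested in the first major comment could be produced is a paired bootstrap over test events. The sketch below is only an illustration under stated assumptions: `paired_bootstrap_ci` and `accuracy` are hypothetical helper names, the percentile interval is one of several possible constructions, and nothing here reflects the authors' actual evaluation code.

```python
import random
from typing import Callable, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of events where the predicted label equals the true label."""
    return sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)


def paired_bootstrap_ci(
    y_true: Sequence[int],
    pred_a: Sequence[int],
    pred_b: Sequence[int],
    metric: Callable[[Sequence[int], Sequence[int]], float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile interval for metric(model A) - metric(model B).

    The same resampled event indices are used for both models in every
    replicate, so event-to-event difficulty cancels in the difference.
    """
    rng = random.Random(seed)
    n = len(y_true)
    diffs: List[float] = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        t = [y_true[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(metric(t, a) - metric(t, b))
    diffs.sort()
    lo = diffs[int((alpha / 2) * n_boot)]
    hi = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi
```

With model A as the fine-tuned network and model B as the scratch-trained baseline, an interval that excludes zero would indicate that the gain is not explained by test-set sampling noise alone. It would not address the preprocessing confound raised in the second comment, which needs a separate ablation.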

minor comments (1)

[Introduction] Clarify in the introduction or methods whether the pre-training corpus includes jet-specific observables or is limited to general collision kinematics, as the title references 'Jets' but the abstract emphasizes high-Q² pp/ep collisions more broadly.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that pre-training induces transferable inductive biases.

pith-pipeline@v0.9.0 · 5509 in / 1117 out tokens · 39309 ms · 2026-05-10T14:42:11.920363+00:00 · methodology
