Cross-Domain Transfer with Particle Physics Foundation Models: From Jets to Neutrino Interactions
Pith reviewed 2026-05-10 14:42 UTC · model grok-4.3
The pith
The pre-trained OmniLearned foundation model transfers from high-Q² collisions to MINERvA neutrino data, outperforming models trained from scratch on regression and classification at a fixed compute budget or a fixed number of training steps.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps.
Load-bearing premise
The inductive biases acquired during pre-training on diverse high-Q² simulated and real pp and ep collisions generalize to few-GeV fixed-target neutrino-nucleus scattering events processed from MINERvA data.
Original abstract
Future AI-based studies in particle physics will likely start from a foundation model to accelerate training and enhance sensitivity. As a step towards a general-purpose foundation model for particle physics, we investigate whether the OmniLearned foundation model pre-trained on diverse high-$Q^2$ simulated and real $pp$ and $ep$ collisions can be effectively transferred to a few-GeV fixed-target neutrino experiment. We process MINERvA neutrino--nucleus scattering events and evaluate pre-trained models on two types of tasks: regression of available energy and binary classification of charged-current pion final states ($\mathrm{CC1\pi^{\pm}}$, $\mathrm{CCN\pi^{\pm}}$, and $\mathrm{CC1\pi^{0}}$). Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps. These results suggest that particle-level foundation models acquire inductive biases that generalize across large differences in energy scale, detector technology, and underlying physics processes, pointing toward a paradigm of detector-agnostic inference in particle physics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript investigates cross-domain transfer of the OmniLearned foundation model, pre-trained on diverse high-Q² simulated and real pp and ep collisions, to few-GeV fixed-target neutrino-nucleus scattering events from the MINERvA experiment. Events are processed for two downstream tasks: regression of available energy and binary classification of charged-current pion final states (CC1π±, CCNπ±, and CC1π⁰). The central empirical claim is that pre-trained models consistently outperform similarly sized models trained from scratch, both at equivalent compute budgets and at the same number of training steps, suggesting that inductive biases acquired during pre-training generalize across large differences in energy scale, detector technology, and underlying physics processes.
Significance. If the transfer results are shown to be robust, this would constitute concrete evidence that particle-physics foundation models can acquire reusable representations that bridge QCD-dominated high-energy collisions and weak-interaction neutrino-nucleus scattering. Such a result would support the broader program of detector-agnostic foundation models, potentially reducing training costs and improving sensitivity for neutrino experiments and other low-energy facilities. The work is timely given growing interest in foundation models within the field.
major comments (2)
- [Abstract / Results] The headline claim of consistent outperformance is stated without quantitative metrics, error bars, statistical significance tests, data-split details, or baseline specifications. Without them, the magnitude and reliability of the reported gains cannot be assessed.
- [§3, Data Processing and Input Representation] The manuscript does not describe how MINERvA events are normalized, scaled, or encoded into the OmniLearned input format. It also provides no ablations (e.g., energy rescaling or a comparison of learned embeddings) that would rule out confounds arising from the ~1000× energy-scale mismatch and the differing detector responses. Without such checks, the observed transfer gains could be artifacts of preprocessing rather than evidence of transferable inductive bias.
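Two of the checks requested in the major comments can be sketched in a few lines of Python. This is an illustrative sketch only, not the manuscript's method and not OmniLearned's actual preprocessing. The function names `log_standardize` and `paired_bootstrap_ci` are hypothetical, and the numbers in the demo are synthetic stand-ins. The first function shows one possible scale-removing normalization that could serve as an arm of an energy-rescaling ablation. The second attaches a percentile-bootstrap confidence interval to the per-event error difference between a scratch-trained and a pre-trained model evaluated on the same test events.

```python
import math
import random
import statistics


def log_standardize(energies):
    """Map positive energies to zero-mean, unit-variance log-energies.

    Hypothetical normalization (an assumption, not the paper's choice):
    multiplying every energy by a constant leaves the output unchanged,
    so the overall energy scale is removed from the inputs.
    """
    log_e = [math.log(e) for e in energies]
    mean = statistics.fmean(log_e)
    std = statistics.pstdev(log_e)
    if std == 0.0:
        raise ValueError("need at least two distinct energies")
    return [(x - mean) / std for x in log_e]


def paired_bootstrap_ci(err_pretrained, err_scratch, n_boot=2000,
                        level=0.95, seed=0):
    """Percentile-bootstrap interval for mean(err_scratch - err_pretrained).

    Both lists hold per-event errors on the SAME test events, so the
    per-event differences are resampled (a paired bootstrap). An interval
    lying entirely above zero suggests a lower mean error for the
    pre-trained model on this test set.
    """
    if len(err_pretrained) != len(err_scratch):
        raise ValueError("errors must be evaluated on the same events")
    diff = [s - p for p, s in zip(err_pretrained, err_scratch)]
    if not diff:
        raise ValueError("need at least one event")
    rng = random.Random(seed)
    n = len(diff)
    # Resample events with replacement and record the mean difference.
    boot_means = sorted(
        statistics.fmean(rng.choices(diff, k=n)) for _ in range(n_boot)
    )
    alpha = (1.0 - level) / 2.0
    lo = boot_means[int(alpha * (n_boot - 1))]
    hi = boot_means[int(round((1.0 - alpha) * (n_boot - 1)))]
    return statistics.fmean(diff), lo, hi


if __name__ == "__main__":
    # Synthetic stand-in numbers only; these are NOT MINERvA results.
    rng = random.Random(1)
    scratch = [abs(rng.gauss(0.0, 1.0)) for _ in range(500)]
    pretrained = [0.9 * e for e in scratch]
    print(paired_bootstrap_ci(pretrained, scratch))
    print(log_standardize([0.5, 2.0, 8.0]))
```

Pairing matters here because both models see identical test events; resampling the differences rather than each error list independently keeps the event-to-event correlation and usually gives a tighter interval. Repeating the comparison with and without a normalization such as `log_standardize` would be one way to probe whether the reported gains depend on the preprocessing.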
minor comments (1)
- [Introduction] Clarify in the introduction or methods whether the pre-training corpus includes jet-specific observables or is limited to general collision kinematics, as the title references 'Jets' but the abstract emphasizes high-Q² pp/ep collisions more broadly.