
arxiv: 2604.12364 · v1 · submitted 2026-04-14 · ✦ hep-ex · cs.LG · hep-ph · physics.data-an

Cross-Domain Transfer with Particle Physics Foundation Models: From Jets to Neutrino Interactions

Pith reviewed 2026-05-10 14:42 UTC · model grok-4.3

classification ✦ hep-ex · cs.LG · hep-ph · physics.data-an
keywords foundation · models · physics · particle · model · pre-trained · achieving

The pith

The pre-trained OmniLearned foundation model transfers from high-Q² collisions to MINERvA neutrino data, outperforming scratch-trained models on regression and classification at both fixed compute and fixed training steps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Researchers took a large AI model already trained on many kinds of high-energy particle collisions and tested whether it could be reused on data from a neutrino experiment that uses a different detector and much lower energies. They processed real and simulated neutrino events and asked the model to predict the available energy in each event or to classify whether the event contained a charged pion. The pre-trained model did better than a similar-sized model that started with random weights and was trained only on the neutrino data. The improvement held both when the total computing time was fixed and when the number of training steps was fixed. This indicates that the patterns the model learned from the first set of collisions are useful for understanding neutrino interactions even though the physics and hardware are quite different.

Core claim

Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps.
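
The two comparison modes in this claim (equal training steps and equal compute) can be made concrete with a minimal sketch. Everything here is an assumption for illustration: the helper names `best_loss_within` and `to_compute`, the checkpoint curves, and the per-step cost are hypothetical, and the numbers are invented rather than taken from the paper. The sketch counts only downstream training compute; whether the paper also charges pre-training cost to the budget is not stated in the abstract.

```python
from bisect import bisect_right


def best_loss_within(curve, limit):
    """Lowest validation loss among checkpoints with x <= limit.

    curve: list of (x, loss) pairs sorted by x, where x is either the
    number of training steps or the cumulative compute.
    """
    xs = [x for x, _ in curve]
    k = bisect_right(xs, limit)  # number of checkpoints within the limit
    if k == 0:
        raise ValueError("no checkpoint within the given limit")
    return min(loss for _, loss in curve[:k])


def to_compute(curve, cost_per_step):
    """Convert a (steps, loss) curve into a (compute, loss) curve."""
    return [(step * cost_per_step, loss) for step, loss in curve]


# Invented learning curves: (training step, validation loss).
pretrained = [(100, 0.50), (200, 0.40), (400, 0.35)]
scratch = [(100, 0.80), (200, 0.60), (400, 0.45)]

# Equal-steps comparison: both models evaluated at 200 steps.
same_steps = (best_loss_within(pretrained, 200), best_loss_within(scratch, 200))

# Equal-compute comparison: hypothetical cost of 1.0 units per step for both.
budget = 400.0
same_compute = (
    best_loss_within(to_compute(pretrained, 1.0), budget),
    best_loss_within(to_compute(scratch, 1.0), budget),
)
```

If the two models have different per-step costs, the equal-steps and equal-compute comparisons can disagree, which is why reporting both is informative.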

Load-bearing premise

The inductive biases acquired during pre-training on diverse high-Q² simulated and real pp and ep collisions generalize to few-GeV fixed-target neutrino-nucleus scattering events processed from MINERvA data.

Original abstract

Future AI-based studies in particle physics will likely start from a foundation model to accelerate training and enhance sensitivity. As a step towards a general-purpose foundation model for particle physics, we investigate whether the OmniLearned foundation model pre-trained on diverse high-$Q^2$ simulated and real $pp$ and $ep$ collisions can be effectively transferred to a few-GeV fixed-target neutrino experiment. We process MINERvA neutrino--nucleus scattering events and evaluate pre-trained models on two types of tasks: regression of available energy and binary classification of charged-current pion final states ($\mathrm{CC1\pi^{\pm}}$, $\mathrm{CCN\pi^{\pm}}$, and $\mathrm{CC1\pi^{0}}$). Pre-trained OmniLearned models consistently outperform similarly sized models trained from scratch, achieving better overall performance at the same compute budget, as well as achieving better performance at the same number of training steps. These results suggest that particle-level foundation models acquire inductive biases that generalize across large differences in energy scale, detector technology, and underlying physics processes, pointing toward a paradigm of detector-agnostic inference in particle physics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript investigates cross-domain transfer of the OmniLearned foundation model, pre-trained on diverse high-Q² simulated and real pp and ep collisions, to few-GeV fixed-target neutrino-nucleus scattering events from the MINERvA experiment. Events are processed for two downstream tasks: regression of available energy and binary classification of charged-current pion final states (CC1π±, CCNπ±, and CC1π⁰). The central empirical claim is that pre-trained models consistently outperform similarly sized models trained from scratch, both at equivalent compute budgets and at the same number of training steps, suggesting that inductive biases acquired during pre-training generalize across large differences in energy scale, detector technology, and underlying physics processes.

Significance. If the transfer results are shown to be robust, this would constitute concrete evidence that particle-physics foundation models can acquire reusable representations that bridge QCD-dominated high-energy collisions and weak-interaction neutrino-nucleus scattering. Such a result would support the broader program of detector-agnostic foundation models, potentially reducing training costs and improving sensitivity for neutrino experiments and other low-energy facilities. The work is timely given growing interest in foundation models within the field.

major comments (2)
  1. [Abstract / Results] The headline claim of consistent outperformance is stated without quantitative metrics, error bars, statistical significance tests, data-split details, or baseline specifications. Without them, the magnitude and reliability of the reported gains cannot be assessed from the provided information.
  2. [§3] Data Processing and Input Representation: The manuscript does not describe how MINERvA events are normalized, scaled, or encoded into the OmniLearned input format, nor does it provide ablations (e.g., energy rescaling or comparison of learned embeddings) that would rule out confounds arising from the ~1000× energy-scale mismatch and differing detector responses. Without such checks, the observed transfer gains could be artifacts of preprocessing rather than evidence of transferable inductive bias.
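
One way to supply the error bars requested in major comment 1 is a paired bootstrap over test events. The following is a minimal sketch under stated assumptions, not the authors' procedure: the function name `paired_bootstrap_ci` is hypothetical, and it assumes both models were evaluated on the same test events so that per-event errors can be paired.

```python
import random


def paired_bootstrap_ci(err_a, err_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(err_b - err_a).

    err_a, err_b: per-event errors (e.g. absolute energy residuals) of two
    models on the SAME test events, in the same order (assumed pairing).
    An interval entirely above zero suggests model A has the lower error.
    """
    if len(err_a) != len(err_b) or not err_a:
        raise ValueError("need two non-empty error lists of equal length")
    rng = random.Random(seed)  # fixed seed for reproducibility
    n = len(err_a)
    diffs = [b - a for a, b in zip(err_a, err_b)]
    means = []
    for _ in range(n_resamples):
        # Resample events with replacement, keeping each pair together.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi
```

Resampling pairs rather than each model's errors independently removes event-to-event difficulty from the comparison, which typically tightens the interval. It does not capture variance from random initialization, which would require repeated training runs.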
minor comments (1)
  1. [Introduction] Clarify in the introduction or methods whether the pre-training corpus includes jet-specific observables or is limited to general collision kinematics, as the title references 'Jets' but the abstract emphasizes high-Q² pp/ep collisions more broadly.
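
The preprocessing confound raised in major comment 2 can be illustrated with a small sketch. It is purely hypothetical: the abstract does not describe the OmniLearned input encoding, and `log_standardize` is an invented helper, not part of any released code. It shows that standardizing log-energies makes an input feature invariant to a global energy rescaling, so a ~1000× scale mismatch can be absorbed entirely by preprocessing. That is why an ablation over such choices would help separate preprocessing effects from transferred inductive bias.

```python
import math


def log_standardize(energies):
    """Map positive energies to zero-mean, unit-variance log-energies.

    Multiplying every energy by a constant shifts all log-energies by the
    same amount, which the mean subtraction removes. The output is therefore
    unchanged (up to floating-point error) under a global rescaling.
    """
    if not energies or any(e <= 0 for e in energies):
        raise ValueError("energies must be a non-empty list of positive values")
    logs = [math.log(e) for e in energies]
    mean = sum(logs) / len(logs)
    std = math.sqrt(sum((x - mean) ** 2 for x in logs) / len(logs))
    if std == 0:
        return [0.0] * len(logs)  # all energies identical
    return [(x - mean) / std for x in logs]


# Hypothetical particle energies in GeV, and the same values scaled by 1000.
low_scale = [0.5, 1.0, 2.0, 4.0]
high_scale = [1000.0 * e for e in low_scale]

features_low = log_standardize(low_scale)
features_high = log_standardize(high_scale)
```

Here the statistics are computed over the list passed in. Whether normalization is applied per event or with dataset-level statistics is itself a design choice that such an ablation would need to state.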

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities; the central claim rests on the unstated premise that pre-training induces transferable inductive biases.

pith-pipeline@v0.9.0 · 5509 in / 1117 out tokens · 39309 ms · 2026-05-10T14:42:11.920363+00:00 · methodology
