Generative models on phase space
Recognition: 3 theorem links · Lean theorems
Pith reviewed 2026-05-13 20:49 UTC · model grok-4.3
The pith
Generative models for particle physics data stay exactly on the physical phase space manifold at every sampling step.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Generative models can be constructed so that every step of the sampling trajectory lies on the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame, thereby satisfying physical priors such as energy and momentum conservation exactly rather than approximately.
What carries the argument
The manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame, which acts as the exact constraint surface that the generative process never leaves.
If this is right
- Diffusion models begin the reverse process from the uniform distribution on the phase space manifold.
- The models reproduce both few-particle and many-particle distributions containing multiple singularity structures.
- Exact constraint satisfaction improves reliability and interpretability of generated events compared with models that learn constraints only approximately.
- The approach supports future interpretability studies on simulated jet data.
Where Pith is reading between the lines
- The uniform starting distribution on phase space could serve as a reference point for measuring how physical structures emerge in other constrained generative tasks.
- Exact manifold confinement might be adapted to other domains that require strict conservation laws, such as molecular conformation sampling.
- The method could reduce post-generation corrections in Monte Carlo event generators by eliminating unphysical samples at the source.
Load-bearing premise
The exact manifold constraint can be maintained throughout training and sampling without preventing the model from accurately reproducing target distributions that contain various singularity structures.
What would settle it
Generating samples from the trained model and checking (i) whether total four-momentum is exactly conserved and (ii) whether the distributions of pairwise angles or energies match a known target jet distribution with collinear singularities.
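The conservation half of this test is mechanical to state. The RAMBO algorithm, on which the paper builds its q-space construction, already yields massless momenta whose total four-momentum is exact by construction; a minimal numpy sketch of the standard algorithm (an illustration, not the paper's code) shows the kind of exactness check generated samples would have to pass:

```python
import numpy as np

def rambo(n_particles, sqrt_s, rng):
    """Sample massless momenta uniformly on N-particle phase space (RAMBO).

    Returns an (N, 4) array of (E, px, py, pz) whose sum is
    (sqrt_s, 0, 0, 0) up to floating-point rounding.
    """
    u = rng.random((n_particles, 4))
    c = 2.0 * u[:, 0] - 1.0            # cos(theta), uniform on [-1, 1]
    phi = 2.0 * np.pi * u[:, 1]
    e = -np.log(u[:, 2] * u[:, 3])     # energies of isotropic massless momenta
    s = np.sqrt(1.0 - c * c)
    q = np.stack([e, e * s * np.cos(phi), e * s * np.sin(phi), e * c], axis=1)

    # Conformal boost + rescale into the center-of-momentum frame.
    Q = q.sum(axis=0)
    M = np.sqrt(Q[0] ** 2 - Q[1] ** 2 - Q[2] ** 2 - Q[3] ** 2)
    b = -Q[1:] / M                     # boost vector (gamma * beta)
    gamma = Q[0] / M
    a = 1.0 / (1.0 + gamma)
    x = sqrt_s / M                     # overall rescaling
    bq = q[:, 1:] @ b
    p0 = x * (gamma * q[:, 0] + bq)
    p_vec = x * (q[:, 1:] + np.outer(q[:, 0] + a * bq, b))
    return np.concatenate([p0[:, None], p_vec], axis=1)

p = rambo(5, 100.0, np.random.default_rng(0))
total = p.sum(axis=0)                                    # ~ (100, 0, 0, 0)
masses_sq = p[:, 0] ** 2 - (p[:, 1:] ** 2).sum(axis=1)   # all ~ 0 (massless)
```

The corresponding check on a trained model would additionally compare distributions of pairwise angles or energies against the target: exact conservation alone does not establish that the singular structure is reproduced.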
Original abstract
Deep generative models such as diffusion and flow matching are powerful machine learning tools capable of learning and sampling from high-dimensional distributions. They are particularly useful when the training data appears to be concentrated on a submanifold of the data embedding space. For high-energy physics data, consisting of collections of relativistic energy-momentum 4-vectors, this submanifold can enforce extremely strong physically-motivated priors, such as energy and momentum conservation. If these constraints are learned only approximately, rather than exactly, this can inhibit the interpretability and reliability of such generative models. To remedy this deficiency, we introduce generative models which are, by construction, confined at every step of their sampling trajectory to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame. In the case of diffusion models, the "pure noise" forward process endpoint corresponds to the uniform distribution on phase space, which provides a clear starting point from which to identify how correlations among the particles emerge during the reverse (de-noising) process. We demonstrate that our models are able to learn both few-particle and many-particle distributions with various singularity structures, paving the way for future interpretability studies using generative models trained on simulated jet data.
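The reverse (de-noising) dynamics the abstract refers to are, in the paper's own description (see the q-space excerpt quoted under reference [2] in the reference graph), an unadjusted Langevin update Q_{t+1} = Q_t + γ_t ∇log p_ref(Q_t) + √(2γ_t) Z_t carried out in an auxiliary q-space. A minimal numpy sketch of that update loop follows; the toy standard-normal score stands in for the learned score network, and the names `score_fn` and `gammas` are illustrative rather than taken from the paper:

```python
import numpy as np

def langevin_sample(score_fn, q0, gammas, rng):
    """Unadjusted Langevin dynamics:
    Q_{t+1} = Q_t + g_t * score(Q_t) + sqrt(2 g_t) * Z_t."""
    q = np.asarray(q0, dtype=float).copy()
    for g in gammas:
        z = rng.standard_normal(q.shape)   # isotropic Gaussian noise Z_t
        q = q + g * score_fn(q) + np.sqrt(2.0 * g) * z
    return q

# Toy target: standard normal, whose score is -q. With a small fixed step,
# long chains settle near the target (stationary variance 1/(1 - g/2)).
samples = langevin_sample(lambda q: -q, np.zeros(4000),
                          [0.01] * 2000, np.random.default_rng(1))
```

In the paper's construction each intermediate Q_t is additionally mapped back through RAMBO to a point P_t on N-particle phase space, which is what keeps every step of the trajectory on the manifold.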
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces diffusion and flow-matching generative models that are confined by construction to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame. The forward process terminates at the uniform measure on this manifold, and the models are claimed to learn and sample from target distributions containing soft and collinear singularities for both few- and many-particle cases.
Significance. If the central claim holds, the work provides a meaningful advance in constrained generative modeling for high-energy physics by enforcing exact physical priors (energy-momentum conservation and on-shell conditions) without learned approximations. The parameter-free geometric construction and the explicit uniform-phase-space starting point are clear strengths that could support future interpretability analyses of correlation emergence.
major comments (1)
- [Abstract] The claim of successful demonstrations on distributions with various singularity structures is stated without quantitative metrics, error analysis, or comparisons of singular exponents between model and target, leaving the expressivity of the exactly constrained reverse process unverified.
minor comments (1)
- Notation for the phase-space manifold and the precise definition of the forward-process endpoint could be made more explicit to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for their positive evaluation of the work and for highlighting a useful point about the abstract. We address the comment below and will revise the manuscript to incorporate quantitative elements where appropriate.
Point-by-point responses
- Referee: [Abstract] The claim of successful demonstrations on distributions with various singularity structures is stated without quantitative metrics, error analysis, or comparisons of singular exponents between model and target, leaving the expressivity of the exactly constrained reverse process unverified.
  Authors: We agree that the abstract, as a high-level summary, does not include explicit numerical metrics. The main text already presents quantitative evidence for the model's performance, including direct comparisons of generated and target distributions for both few- and many-particle cases, error analyses on energy-momentum conservation residuals, and visual/quantitative assessments of how singular structures (soft and collinear) are reproduced. In the revised version we will update the abstract to reference these results more explicitly, for example by noting the level of agreement achieved on singular exponents and the scale of the error metrics shown in the figures. This change will better convey the expressivity of the exactly constrained reverse process without altering the technical claims.
  Revision: yes
Circularity Check
Direct geometric construction using known physical priors; no circularity
Full rationale
The paper's central claim is a direct geometric construction that enforces the massless N-particle Lorentz-invariant phase space manifold (center-of-momentum frame) exactly at every sampling step by construction, drawing on standard physical priors such as energy-momentum conservation and Lorentz invariance. This does not reduce to any fitted parameter, self-referential definition, or load-bearing self-citation chain; the forward process endpoint is the uniform measure on that manifold, and the reverse process is constrained to remain on it without deriving the constraint from the target distribution itself. No steps match the enumerated circularity patterns, and the approach remains self-contained against external benchmarks like known phase-space measures.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The phase space of massless N-particle systems in the center-of-momentum frame is a well-defined manifold on which uniform distributions and sampling trajectories can be defined.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · echoes
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  "generative models which are, by construction, confined at every step of their sampling trajectory to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame"
- IndisputableMonolith/Cost · Jcost_unit0 · echoes
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  "the 'pure noise' forward process endpoint corresponds to the uniform distribution on phase space"
- IndisputableMonolith/Foundation/reality_from_one_distinction · reality_from_one_distinction · echoes
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  "RAMBO algorithm ... trades the constraints of phase space for a non-uniform distribution in an auxiliary space, q-space"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] "Start from a point P0 in phase space, and map it to a point Q0(P0, b, x) in q-space as described in Sec. II C. There are many possibilities for choosing this transformation (b, x), which we describe further below"
- [2] "Implement the Langevin dynamics in q-space, Q_{t+1} = Q_t + γ_t ∇log p_ref(Q_t) + √(2γ_t) Z_t, t = 0, 1, . . . , T, (10) where Z_t ∼ N(0, I_{3N×3N}) is isotropic Gaussian noise in R^{3N}, γ_t is a fixed noise schedule, and ∇log p_ref is given in Eq. (9). At any timestep t, RAMBO gives a unique, well-defined mapping Q_t → P_t where P_t lives on N-particle phase space, and thus we can map ..."
- [3] "3-particle distributions: Our diffusion model score network consists of a 4-layer multilayer perceptron (MLP) with SiLU activations and hidden width 256, using default PyTorch initializations for the inner layers but initializing the last-layer parameters to zero. Diffusion time is encoded through sinusoidal embeddings with dimension 64, appended to the 9-dime..."
- [4] "no augmentation or transformation (i.e. interpreting p-space vectors directly as q-space vectors)"
- [5] "a 2-fold augmentation N_mult = 2"
- [6] "a larger augmentation N_mult = 10; 20" · FIG. 16: Training data in q-space for the different data augmentation strategies in Fig. 15. [Figure residue; recoverable content: muon-decay q-space density, Cases 1-4 vs. the main-text p_ref(Q), plus a table of per-observable errors for N_mult = 1 vs. N_mult = 2: E1 0.73×10^-3 vs. 2.4×10^-3; E2 0.35×10^-3 vs. 1.2×10^-3; E3 1.6×10^-3 vs. 5.7×10^-3; ln p... (truncated)]
- [7] "'FID inv' denotes a Fréchet distance computed on Lorentz-invariant features, while 'FIDAE" ... "a continuous transformation where each p-space point is boosted and rescaled by a different conformal transformation (b, x). All models were trained with the same total q-space training set size of 500,000, with all other hyperparameters the same as described above. The energy distributions are clearly a poor match to the target, with all but the default st..."
- [8] "Higher-dimensional distributions: a. Network. For distributions with N ≥ 3 particles, we replace the MLP score network with a Point Edge Transformer (PET) as described in [25] due to the architecture's track record for jet tasks and its ability to generalize beyond what it was originally designed for [66, 67]. The PET is a transformer that treats each q-space eve..."
- [9] HEP ML Community, A Living Review of Machine Learning for Particle Physics, https://iml-wg.github.io/HEPML-LivingReview/
- [10] C. Fefferman, S. Mitter, and H. Narayanan, Testing the manifold hypothesis, Journal of the American Mathematical Society 29, 983 (2013), arXiv:1310.0425
- [11] Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 1798 (2014), arXiv:1206.5538 [cs.LG]
- [12] P. P. Brahma, D. O. Wu, and Y. She, Why deep learning works: A manifold disentanglement perspective, IEEE Transactions on Neural Networks and Learning Systems 27, 1997 (2016)
- [13] J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, Deep Unsupervised Learning using Nonequilibrium Thermodynamics, ICML (2015), arXiv:1503.03585 [cs.LG]
- [14] Y. Song and S. Ermon, Generative Modeling by Estimating Gradients of the Data Distribution, NeurIPS (2019), arXiv:1907.05600 [cs.LG]
- [15] J. Ho, A. Jain, and P. Abbeel, Denoising Diffusion Probabilistic Models, NeurIPS (2020), arXiv:2006.11239 [cs.LG]
- [16] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-Based Generative Modeling through Stochastic Differential Equations, ICLR (2021), arXiv:2011.13456 [cs.LG]
- [17] Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, and M. Nickel, Flow Matching for Generative Modeling, ICLR (2023), arXiv:2210.02747 [cs.LG]
- [18] Y. Lipman, M. Havasi, P. Holderrieth, N. Shaul, M. Le, B. Karrer, R. T. Q. Chen, D. Lopez-Paz, H. Ben-Hamu, and I. Gat, Flow matching guide and code, (2024), arXiv:2412.06264 [cs.LG]
- [19] A. Tong, K. Fatras, N. Malkin, G. Huguet, Y. Zhang, J. Rector-Brooks, G. Wolf, and Y. Bengio, Improving and generalizing flow-based generative models with minibatch optimal transport, Transactions on Machine Learning Research (2024), arXiv:2302.00482 [cs.LG]
- [20]
- [21]
- [22]
- [23]
- [24]
- [25] E. Buhmann, C. Ewen, D. A. Faroughy, T. Golling, G. Kasieczka, M. Leigh, G. Quétant, J. A. Raine, D. Sengupta, and D. Shih, EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion, (2023), arXiv:2310.00049 [hep-ph]
- [26]
- [27] D. Sengupta, M. Leigh, J. A. Raine, S. Klein, and T. Golling, Improving new physics searches with diffusion models for event observables and jet constituents, JHEP 04, 109, arXiv:2312.10130 [physics.data-an]
- [28] G. Quétant, J. A. Raine, M. Leigh, D. Sengupta, and T. Golling, Generating variable length full events from partons, Phys. Rev. D 110, 076023 (2024), arXiv:2406.13074 [hep-ph]
- [29]
- [30] V. Mikuni and B. Nachman, Method to simultaneously facilitate all jet physics tasks, Phys. Rev. D 111, 054015 (2025), arXiv:2502.14652 [hep-ph]
- [31]
- [32]
- [33]
- [34]
- [35] A. Bogatskiy, T. Hoffman, D. W. Miller, and J. T. Offermann, PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics, (2022), arXiv:2211.00454 [hep-ph]
- [36]
- [37] A. Bogatskiy, T. Hoffman, D. W. Miller, J. T. Offermann, and X. Liu, Explainable equivariant neural networks for particle physics: PELICAN, JHEP 03, 113, arXiv:2307.16506 [hep-ph]
- [38] J. Spinner, V. Bresó, P. de Haan, T. Plehn, J. Thaler, and J. Brehmer, Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics, in 38th Conference on Neural Information Processing Systems (2024), arXiv:2405.14806 [physics.data-an]
- [39] J. Spinner, L. Favaro, P. Lippmann, S. Pitz, G. Gerhartz, T. Plehn, and F. A. Hamprecht, Lorentz Local Canonicalization: How to Make Any Network Lorentz-Equivariant, (2025), arXiv:2505.20280 [stat.ML]
- [40]
- [41]
- [42]
- [43] V. De Bortoli, E. Mathieu, M. Hutchinson, J. Thornton, Y. Whye Teh, and A. Doucet, Riemannian Score-Based Generative Modelling, Advances in Neural Information Processing Systems 35, 2406 (2022), arXiv:2202.02763 [cs.LG]
- [44] V. Kawasaki-Borruat, C. Grotehans, P. Vandergheynst, and A. Gosztolai, Diffusion processes on implicit manifolds, (2026), arXiv:2604.07213 [cs.LG]
- [45]
- [46] B. Nachman and R. Winterhalder, Elsa: enhanced latent spaces for improved collider simulations, Eur. Phys. J. C 83, 843 (2023), arXiv:2305.07696 [hep-ph]
- [47]
- [48]
- [49]
- [50] A. Hyvärinen and P. Dayan, Estimation of non-normalized statistical models by score matching, Journal of Machine Learning Research 6 (2005)
- [51]
- [52]
- [53] M. Rosenblatt, Remarks on a Multivariate Transformation, Annals of Mathematical Statistics 23, 470 (1952)
- [54] J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H. S. Shao, T. Stelzer, P. Torrielli, and M. Zaro, The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations, JHEP 07, 079, arXiv:1405.0301 [hep-ph]
- [55]
- [56] E. Farhi, A QCD Test for Jets, Phys. Rev. Lett. 39, 1587 (1977)
- [57] R. K. Ellis, W. J. Stirling, and B. R. Webber, QCD and collider physics, Vol. 8 (Cambridge University Press, 2011)
- [58] S. J. Parke and T. R. Taylor, An Amplitude for n Gluon Scattering, Phys. Rev. Lett. 56, 2459 (1986)
- [59] F. A. Berends and W. T. Giele, Recursive Calculations for Processes with n Gluons, Nucl. Phys. B 306, 759 (1988)
- [60] P. D. Draggiotis, A. van Hameren, and R. Kleiss, SARGE: An Algorithm for generating QCD antennas, Phys. Lett. B 483, 124 (2000), arXiv:hep-ph/0004047
- [61]
- [62] R. Roscher, B. Bohn, M. F. Duarte, and J. Garcke, Explainable machine learning for scientific insights and discoveries, IEEE Access 8, 42200 (2020), arXiv:1905.08883
- [63] M. Krenn, R. Pollice, S. Y. Guo, M. Aldeghi, A. Cervera-Lierta, P. Friederich, G. dos Passos Gomes, F. Häse, A. Jinich, A. Nigam, Z. Yao, and A. Aspuru-Guzik, On scientific understanding with artificial intelligence, Nature Reviews Physics 4, 761 (2022), arXiv:2204.01467
- [64] R. Gambhir, M. LeBlanc, and Y. Zhou, The Pareto Frontier of Resilient Jet Tagging, in 39th Annual Conference on Neural Information Processing Systems: Includes Machine Learning and the Physical Sciences (ML4PS) (2025), arXiv:2509.19431 [hep-ph]
- [65] F. Cagnetta, L. Petrini, U. M. Tomasini, A. Favero, and M. Wyart, How deep neural networks learn compositional data: The random hierarchy model, Physical Review X 14, 031001 (2024), arXiv:2307.02129
- [66] A. Sclocchi, A. Favero, and M. Wyart, A phase transition in diffusion models reveals the hierarchical nature of data, Proceedings of the National Academy of Sciences 122, e2408799121 (2025), arXiv:2402.16991
- [67] A. Sclocchi, A. Favero, N. I. Levi, and M. Wyart, Probing the latent hierarchical structure of data via diffusion models, Journal of Statistical Mechanics: Theory and Experiment, 084005 (2025), arXiv:2410.13770
- [68] A. Favero, A. Sclocchi, F. Cagnetta, P. Frossard, and M. Wyart, How compositional generalization and creativity improve as diffusion models are trained, in Proceedings of the 42nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 267 (PMLR, 2025), pp. 16286–16306, arXiv:2502.12089
- [69] Y. Han, A. Han, W. Huang, C. Lu, and D. Zou, Can diffusion models learn hidden inter-feature rules behind images?, in Proceedings of the 42nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 267 (PMLR, 2025), pp. 21704–21732, arXiv:2502.04725
- [70]
- [71] N. Levi and Y. Oz, The universal statistical structure and scaling laws of chaos and turbulence, (2023), arXiv:2311.01358 [cond-mat.stat-mech]
- [72]
- [73] V. Breso-Pla, K. Greif, V. Mikuni, B. Nachman, T. Plehn, T. Wamorkar, and D. Whiteson, Explicit or Implicit? Encoding Physics at the Precision Frontier, (2026), arXiv:2603.08802 [hep-ph]
- [74]
- [75] I. Elsharkawy, V. Mikuni, W. Bhimji, and B. Nachman, OmniMol: Transferring Particle Physics Knowledge to Molecular Dynamics with Point-Edge Transformers, (2026), arXiv:2601.10791 [physics.chem-ph]
- [76] P. Vincent, A connection between score matching and denoising autoencoders, Neural Computation 23, 1661 (2011)
- [77] L. N. Smith and N. Topin, Super-convergence: Very fast training of neural networks using large learning rates, in Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, Vol. 11006 (SPIE, 2019), pp. 369–386, arXiv:1708.07120 [cs.LG]