pith. machine review for the scientific record.

arxiv: 2604.02415 · v2 · submitted 2026-04-02 · ✦ hep-ph · cs.AI

Recognition: 3 theorem links


Generative models on phase space

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 20:49 UTC · model grok-4.3

classification ✦ hep-ph cs.AI
keywords generative models · diffusion models · phase space · particle physics · Lorentz invariance · high-energy physics · jet data · machine learning

The pith

Generative models for particle physics data stay exactly on the physical phase space manifold at every sampling step.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces diffusion and flow matching models that generate collections of particle four-momenta while remaining confined by construction to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame. This exact confinement enforces energy and momentum conservation throughout the entire generative trajectory rather than learning the constraints only approximately. For diffusion models the forward process ends at the uniform distribution over this manifold, providing a clear baseline from which particle correlations develop in the reverse process. The authors show that the models can learn both few-particle and many-particle distributions containing various singularity structures.
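The uniform distribution that anchors the forward process is the same one produced by the classic RAMBO algorithm, which the paper builds on (see the reference graph below). As a concrete illustration, here is a minimal, self-contained sketch of RAMBO-style uniform sampling on massless N-particle phase space in the center-of-momentum frame; variable names and the plain-Python style are ours, not the paper's.

```python
import math
import random

def rambo(n_particles, sqrt_s, rng=random):
    """Draw massless four-momenta (E, px, py, pz) distributed uniformly on
    N-particle Lorentz-invariant phase space in the center-of-momentum
    frame, following the RAMBO algorithm (Kleiss, Stirling, Ellis, 1986)."""
    # Step 1: unconstrained massless momenta with isotropic directions.
    q = []
    for _ in range(n_particles):
        c = 2.0 * rng.random() - 1.0                  # cos(theta), uniform
        s = math.sqrt(1.0 - c * c)
        phi = 2.0 * math.pi * rng.random()
        # Energy weight; 1 - random() lies in (0, 1], so the log is safe.
        e = -math.log((1.0 - rng.random()) * (1.0 - rng.random()))
        q.append((e, e * s * math.cos(phi), e * s * math.sin(phi), e * c))

    # Step 2: boost to the rest frame of the total momentum and rescale,
    # so the event sums exactly to (sqrt_s, 0, 0, 0).
    Q = [sum(v[i] for v in q) for i in range(4)]
    M = math.sqrt(Q[0] ** 2 - Q[1] ** 2 - Q[2] ** 2 - Q[3] ** 2)
    b = [-Q[i] / M for i in (1, 2, 3)]                # boost vector
    gamma = Q[0] / M
    a = 1.0 / (1.0 + gamma)
    x = sqrt_s / M                                    # overall rescaling

    p = []
    for e, *v in q:
        bq = sum(bi * vi for bi, vi in zip(b, v))     # b · q_i
        p.append((
            x * (gamma * e + bq),
            *(x * (vi + bi * (e + a * bq)) for vi, bi in zip(v, b)),
        ))
    return p
```

Because the boost and rescaling are exact, conservation and masslessness hold to floating-point precision for every sample, which is precisely the "by construction" property the paper's generative trajectories maintain at every step.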

Core claim

Generative models can be constructed so that every step of the sampling trajectory lies on the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame, thereby satisfying physical priors such as energy and momentum conservation exactly rather than approximately.

What carries the argument

The manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame, which acts as the exact constraint surface that the generative process never leaves.
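In standard notation (the paper's own symbols may differ), this constraint surface can be written as:

```latex
\mathcal{M}_N(\sqrt{s}) \;=\;
\Bigl\{\, (p_1,\dots,p_N) \in (\mathbb{R}^{1,3})^N \;\Big|\;
   p_i^2 = 0,\ p_i^0 > 0 \ \forall i, \quad
   \sum_{i=1}^{N} p_i^\mu = (\sqrt{s},\,\vec{0}\,)^\mu \,\Bigr\}.
```

Each massless on-shell four-momentum contributes three degrees of freedom, and the four conservation conditions remove four, so the manifold has dimension 3N − 4; the generative process is confined to this surface at every step.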

If this is right

  • Diffusion models begin the reverse process from the uniform distribution on the phase space manifold.
  • The models reproduce distributions for both small and large numbers of particles that include multiple singularity structures.
  • Exact constraint satisfaction improves reliability and interpretability of generated events compared with models that learn constraints only approximately.
  • The approach supports future interpretability studies on simulated jet data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The uniform starting distribution on phase space could serve as a reference point for measuring how physical structures emerge in other constrained generative tasks.
  • Exact manifold confinement might be adapted to other domains that require strict conservation laws, such as molecular conformation sampling.
  • The method could reduce post-generation corrections in Monte Carlo event generators by eliminating unphysical samples at the source.

Load-bearing premise

The exact manifold constraint can be maintained throughout training and sampling without preventing the model from accurately reproducing target distributions that contain various singularity structures.

What would settle it

Generate samples from the trained model and check (i) whether total four-momentum is conserved exactly, to floating-point precision, and (ii) whether the distributions of pairwise angles or energies match, or deviate from, a known target jet distribution with collinear singularities.
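That check can be phrased as a small routine. This is a sketch under assumptions: model output is arranged as per-event lists of four-momenta, and the normalization by the median single-particle energy follows the convention quoted for Figure 14. For an exactly confined model the residuals should vanish to floating-point precision.

```python
import statistics

def conservation_violation(events, sqrt_s):
    """Per-event four-momentum violation, normalized by the median
    single-particle energy (the normalization quoted for Fig. 14).
    `events`: list of events, each a list of (E, px, py, pz) tuples,
    generated in the center-of-momentum frame with total energy sqrt_s."""
    violations = []
    for event in events:
        total = [sum(p[i] for p in event) for i in range(4)]
        # Residual against the COM-frame target (sqrt_s, 0, 0, 0).
        residual = [total[0] - sqrt_s] + total[1:]
        norm = statistics.median(p[0] for p in event)
        violations.append(max(abs(r) for r in residual) / norm)
    return violations
```

A companion test would histogram pairwise angles or energies against the known target: exact conservation combined with a mismatched angular distribution would undercut the expressivity claim, while matching distributions would support it.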

Figures

Figures reproduced from arXiv: 2604.02415 by Andrew J. Larkoski, Ibrahim Elsharkawy, Noam Levi, Yonatan Kahn, Zachary Bogorad.

Figure 1. view at source ↗
Figure 2. Distributions of the logarithm of the theoretical Dalitz plot PDF for the true muon decay distribution and … view at source ↗
Figure 3. Energy distributions for the muon decay matrix element. view at source ↗
Figure 4. Distributions of the Rosenblatt transformation parameters … view at source ↗
Figure 5. Angular distributions for the muon decay matrix element. view at source ↗
Figure 6. Dalitz plots of 500,000 samples from the … view at source ↗
Figure 7. As in Fig. 1, but for the … view at source ↗
Figure 8. Single-particle energy distributions comparing diffusion model samples to the ground truth distribution of … view at source ↗
Figure 9. Single-particle angular distributions comparing diffusion model samples to the ground truth distribution of … view at source ↗
Figure 10. view at source ↗
Figure 11. view at source ↗
Figure 12. Distributions of … view at source ↗
Figure 13. Comparison of flow matching models in … view at source ↗
Figure 14. Per-event energy and momentum violation normalized by the median single-particle energy … view at source ↗
Figure 15. Energy (left) and angular (center and right) distributions of particle 3 for varying data augmentation … view at source ↗
Figure 16. Training data in … view at source ↗
Figure 17. Distributions of the logarithm of the theoretical Dalitz plot PDF for the true muon decay distribution and … view at source ↗
Figure 18. Rosenblatt transformations as in Fig. 4, but for diffusion models trained with … view at source ↗
Figure 19. Beginning and end of the forward process for these three different schedules and datasets. The shorter schedule does not fully equilibrate in q-space, while the longer schedule (for both training set choices) is a … view at source ↗
Figure 20. Reverse trajectory for the … view at source ↗
Figure 21. Snapshots from the reverse process in … view at source ↗
Figure 22. Forward process start and end distributions in … view at source ↗
read the original abstract

Deep generative models such as diffusion and flow matching are powerful machine learning tools capable of learning and sampling from high-dimensional distributions. They are particularly useful when the training data appears to be concentrated on a submanifold of the data embedding space. For high-energy physics data, consisting of collections of relativistic energy-momentum 4-vectors, this submanifold can enforce extremely strong physically-motivated priors, such as energy and momentum conservation. If these constraints are learned only approximately, rather than exactly, this can inhibit the interpretability and reliability of such generative models. To remedy this deficiency, we introduce generative models which are, by construction, confined at every step of their sampling trajectory to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame. In the case of diffusion models, the "pure noise" forward process endpoint corresponds to the uniform distribution on phase space, which provides a clear starting point from which to identify how correlations among the particles emerge during the reverse (de-noising) process. We demonstrate that our models are able to learn both few-particle and many-particle distributions with various singularity structures, paving the way for future interpretability studies using generative models trained on simulated jet data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper introduces diffusion and flow-matching generative models that are confined by construction to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame. The forward process terminates at the uniform measure on this manifold, and the models are claimed to learn and sample from target distributions containing soft and collinear singularities for both few- and many-particle cases.

Significance. If the central claim holds, the work provides a meaningful advance in constrained generative modeling for high-energy physics by enforcing exact physical priors (energy-momentum conservation and on-shell conditions) without learned approximations. The parameter-free geometric construction and the explicit uniform-phase-space starting point are clear strengths that could support future interpretability analyses of correlation emergence.

major comments (1)
  1. [Abstract] The claim of successful demonstrations on distributions with various singularity structures is stated without quantitative metrics, error analysis, or comparisons of singular exponents between model and target, leaving the expressivity of the exactly constrained reverse process unverified.
minor comments (1)
  1. Notation for the phase-space manifold and the precise definition of the forward-process endpoint could be made more explicit to aid reproducibility.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their positive evaluation of the work and for highlighting a useful point about the abstract. We address the comment below and will revise the manuscript to incorporate quantitative elements where appropriate.

read point-by-point responses
  1. Referee: [Abstract] The claim of successful demonstrations on distributions with various singularity structures is stated without quantitative metrics, error analysis, or comparisons of singular exponents between model and target, leaving the expressivity of the exactly constrained reverse process unverified.

    Authors: We agree that the abstract, as a high-level summary, does not include explicit numerical metrics. The main text already presents quantitative evidence for the model's performance, including direct comparisons of generated and target distributions for both few- and many-particle cases, error analyses on energy-momentum conservation residuals, and visual/quantitative assessments of how singular structures (soft and collinear) are reproduced. In the revised version we will update the abstract to reference these results more explicitly, for example by noting the level of agreement achieved on singular exponents and the scale of the error metrics shown in the figures. This change will better convey the expressivity of the exactly constrained reverse process without altering the technical claims. revision: yes

Circularity Check

0 steps flagged

Direct geometric construction using known physical priors; no circularity

full rationale

The paper's central claim is a direct geometric construction that enforces the massless N-particle Lorentz-invariant phase space manifold (center-of-momentum frame) exactly at every sampling step by construction, drawing on standard physical priors such as energy-momentum conservation and Lorentz invariance. This does not reduce to any fitted parameter, self-referential definition, or load-bearing self-citation chain; the forward process endpoint is the uniform measure on that manifold, and the reverse process is constrained to remain on it without deriving the constraint from the target distribution itself. No steps match the enumerated circularity patterns, and the approach remains self-contained against external benchmarks like known phase-space measures.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the standard physical definition of massless N-particle phase space in the center-of-momentum frame; no free parameters, new entities, or ad-hoc axioms are introduced beyond domain-standard assumptions.

axioms (1)
  • Domain assumption: The phase space of massless N-particle systems in the center-of-momentum frame is a well-defined manifold on which uniform distributions and sampling trajectories can be defined.
    This is the geometric prior used to confine the generative process at every step.

pith-pipeline@v0.9.0 · 5518 in / 1209 out tokens · 48339 ms · 2026-05-13T20:49:42.540639+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/AlexanderDuality.lean — alexander_duality_circle_linking (echoes)

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    "generative models which are, by construction, confined at every step of their sampling trajectory to the manifold of massless N-particle Lorentz-invariant phase space in the center-of-momentum frame"

  • IndisputableMonolith/Cost — Jcost_unit0 (echoes)

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    "the 'pure noise' forward process endpoint corresponds to the uniform distribution on phase space"

  • IndisputableMonolith/Foundation/reality_from_one_distinction — reality_from_one_distinction (echoes)

    ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

    "RAMBO algorithm ... trades the constraints of phase space for a non-uniform distribution in an auxiliary space, q-space"

What do these tags mean?
echoes
The paper passage shares a mathematical shape or conceptual pattern with the theorem, but is not a direct formal dependency.
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

77 extracted references · 77 canonical work pages · 10 internal anchors

  1. [1] "Start from a point P0 in phase space, and map it to a point Q0(P0, b, x) in q-space as described in Sec. II C. There are many possibilities for choosing this transformation (b, x), which we describe further below."

  2. [2] "…'copies' the phase space submanifold N_mult times into q-space; using a different (b, x) for each point P to continuously 'fill out'…" · "Implement the Langevin dynamics in q-space, Q_{t+1} = Q_t + γ_t ∇ log p_ref(Q_t) + √(2 γ_t) Z_t, t = 0, 1, …, T, where Z_t ∼ N(0, I_{3N×3N}) is isotropic Gaussian noise in R^{3N}, γ_t is a fixed noise schedule, and ∇ log p_ref is given in Eq. (9). At any timestep t, RAMBO gives a unique, well-defined mapping Q_t → P_t where P_t lives on N-particle phase space, and thus we can map …"

  3. [3] "Our diffusion model score network consists of a 4-layer multilayer perceptron (MLP) with SiLU activations and hidden width 256, using default PyTorch initializations for the inner layers but initializing the last-layer parameters to zero. Diffusion time is encoded through sinusoidal embeddings with dimension 64, appended to the 9-dimensional input for 3-particle q-space …"

  4. [4] "no augmentation or transformation (i.e. interpreting p-space vectors directly as q-space vectors)"

  5. [5] "a 2-fold augmentation N_mult = 2"

  6. [6] "a larger augmentation N_mult = 10 … FIG. 16: Training data in q-space for the different data augmentation strategies in Fig. 15. Observable / N_mult = 1 / N_mult = 2: E1 0.73×10⁻³, 2.4×10⁻³; E2 0.35×10⁻³, 1.2×10⁻³; E3 1.6×10⁻³, 5.7×10⁻³; ln p…"

  7. [7] "'FID inv' denotes a Fréchet distance computed on Lorentz-invariant features, while 'FID AE' …" · "a continuous transformation where each p-space point is boosted and rescaled by a different conformal transformation (b, x). All models were trained with the same total q-space training set size of 500,000, with all other hyperparameters the same as described above. The energy distributions are clearly a poor match to the target, with all but the default st…"

  8. [8] "Higher-dimensional distributions. Network. For distributions with N ≥ 3 particles, we replace the MLP score network with a Point Edge Transformer (PET) as described in [25] due to the architecture's track record for jet tasks and its ability to generalize beyond what it was originally designed for [66, 67]. The PET is a transformer that treats each q-space eve…"

  9. [9] HEP ML Community, A Living Review of Machine Learning for Particle Physics, https://iml-wg.github.io/HEPML-LivingReview/

  10. [10] C. Fefferman, S. Mitter, and H. Narayanan, Testing the manifold hypothesis, Journal of the American Mathematical Society 29, 983 (2013), arXiv:1310.0425

  11. [11] Y. Bengio, A. Courville, and P. Vincent, Representation learning: A review and new perspectives, IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 1798 (2014), arXiv:1206.5538 [cs.LG]

  12. [12] P. P. Brahma, D. O. Wu, and Y. She, Why deep learning works: A manifold disentanglement perspective, IEEE Transactions on Neural Networks and Learning Systems 27, 1997 (2016)

  13. [13] J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli, Deep Unsupervised Learning using Nonequilibrium Thermodynamics, ICML (2015), arXiv:1503.03585 [cs.LG]

  14. [14] Y. Song and S. Ermon, Generative Modeling by Estimating Gradients of the Data Distribution, NeurIPS (2019), arXiv:1907.05600 [cs.LG]

  15. [15] J. Ho, A. Jain, and P. Abbeel, Denoising Diffusion Probabilistic Models, NeurIPS (2020), arXiv:2006.11239 [cs.LG]

  16. [16] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, Score-Based Generative Modeling through Stochastic Differential Equations, ICLR (2021), arXiv:2011.13456 [cs.LG]

  17. [17] Y. Lipman, R. T. Q. Chen, H. Ben-Hamu, and M. Nickel, Flow Matching for Generative Modeling, ICLR (2023), arXiv:2210.02747 [cs.LG]

  18. [18] Y. Lipman, M. Havasi, P. Holderrieth, N. Shaul, M. Le, B. Karrer, R. T. Q. Chen, D. Lopez-Paz, H. Ben-Hamu, and I. Gat, Flow Matching Guide and Code, (2024), arXiv:2412.06264 [cs.LG]

  19. [19] A. Tong, K. Fatras, N. Malkin, G. Huguet, Y. Zhang, J. Rector-Brooks, G. Wolf, and Y. Bengio, Improving and generalizing flow-based generative models with minibatch optimal transport, Transactions on Machine Learning Research (2024), arXiv:2302.00482 [cs.LG]

  20. [20] C.-H. Lai, Y. Song, D. Kim, Y. Mitsufuji, and S. Ermon, The principles of diffusion models, (2025), arXiv:2510.21890 [cs.LG]

  21. [21] M. Leigh, D. Sengupta, G. Quétant, J. A. Raine, K. Zoch, and T. Golling, PC-JeDi: Diffusion for particle cloud generation in high energy physics, SciPost Phys. 16, 018 (2024), arXiv:2303.05376 [hep-ph]

  22. [22] V. Mikuni, B. Nachman, and M. Pettee, Fast point cloud generation with diffusion models in high energy physics, Phys. Rev. D 108, 036025 (2023), arXiv:2304.01266 [hep-ph]

  23. [23] A. Butter, N. Huetsch, S. P. Schweitzer, T. Plehn, P. Sorrenson, and J. Spinner, Jet Diffusion versus JetGPT – Modern Networks for the LHC, SciPost Phys. Core 8, 026 (2023), arXiv:2305.10475 [hep-ph]

  24. [24] M. Leigh, D. Sengupta, J. A. Raine, G. Quétant, and T. Golling, PC-Droid: Faster diffusion and improved quality for particle cloud generation, Phys. Rev. D 109, 012010 (2023), arXiv:2307.06836 [hep-ex]

  25. [25] E. Buhmann, C. Ewen, D. A. Faroughy, T. Golling, G. Kasieczka, M. Leigh, G. Quétant, J. A. Raine, D. Sengupta, and D. Shih, EPiC-ly Fast Particle Cloud Generation with Flow-Matching and Diffusion, (2023), arXiv:2310.00049 [hep-ph]

  26. [26] A. Butter, T. Jezo, M. Klasen, M. Kuschick, S. Palacios Schweitzer, and T. Plehn, Kicking it off(-shell) with direct diffusion, SciPost Phys. Core 7, 064 (2024), arXiv:2311.17175 [hep-ph]

  27. [27] D. Sengupta, M. Leigh, J. A. Raine, S. Klein, and T. Golling, Improving new physics searches with diffusion models for event observables and jet constituents, JHEP 04, 109, arXiv:2312.10130 [physics.data-an]

  28. [28] G. Quétant, J. A. Raine, M. Leigh, D. Sengupta, and T. Golling, Generating variable length full events from partons, Phys. Rev. D 110, 076023 (2024), arXiv:2406.13074 [hep-ph]

  29. [29] C. Jiang, S. Qian, and H. Qu, Choose your diffusion: Efficient and flexible ways to accelerate the diffusion model in fast high energy physics simulation, SciPost Phys. 18, 195 (2025), arXiv:2401.13162 [physics.ins-det]

  30. [30] V. Mikuni and B. Nachman, Method to simultaneously facilitate all jet physics tasks, Phys. Rev. D 111, 054015 (2025), arXiv:2502.14652 [hep-ph]

  31. [31] E. Dreyer, E. Gross, D. Kobylianskii, V. Mikuni, and B. Nachman, Conditional deep generative models for simultaneous simulation and reconstruction of entire events, Phys. Rev. D 113, 032005 (2026), arXiv:2503.19981 [hep-ex]

  32. [32] D. A. Faroughy, M. Opper, and C. Ojeda, Multimodal Generative Flows for LHC Jets, in 39th Annual Conference on Neural Information Processing Systems: Includes Machine Learning and the Physical Sciences (ML4PS) (2025), arXiv:2509.01736 [hep-ph]

  33. [33] W. Bhimji, C. Harris, V. Mikuni, and B. Nachman, OmniLearned: A Multi-Task Foundation Model for Jet Physics, (2025), arXiv:2510.24066 [hep-ph]

  34. [34] S. Gong, Q. Meng, J. Zhang, H. Qu, C. Li, S. Qian, W. Du, Z.-M. Ma, and T.-Y. Liu, An efficient Lorentz equivariant graph neural network for jet tagging, JHEP 07, 030, arXiv:2201.08187 [hep-ph]

  35. [35] A. Bogatskiy, T. Hoffman, D. W. Miller, and J. T. Offermann, PELICAN: Permutation Equivariant and Lorentz Invariant or Covariant Aggregator Network for Particle Physics, (2022), arXiv:2211.00454 [hep-ph]

  36. [36] Z. Hao, R. Kansal, J. Duarte, and N. Chernyavskaya, Lorentz group equivariant autoencoders, Eur. Phys. J. C 83, 485 (2023), arXiv:2212.07347 [hep-ex]

  37. [37] A. Bogatskiy, T. Hoffman, D. W. Miller, J. T. Offermann, and X. Liu, Explainable equivariant neural networks for particle physics: PELICAN, JHEP 03, 113, arXiv:2307.16506 [hep-ph]

  38. [38] J. Spinner, V. Bresó, P. de Haan, T. Plehn, J. Thaler, and J. Brehmer, Lorentz-Equivariant Geometric Algebra Transformers for High-Energy Physics, in 38th Conference on Neural Information Processing Systems (2024), arXiv:2405.14806 [physics.data-an]

  39. [39] J. Spinner, L. Favaro, P. Lippmann, S. Pitz, G. Gerhartz, T. Plehn, and F. A. Hamprecht, Lorentz Local Canonicalization: How to Make Any Network Lorentz-Equivariant, (2025), arXiv:2505.20280 [stat.ML]

  40. [40] P. Hebbar, T. Madula, V. Mikuni, B. Nachman, N. Outmezguine, and I. Savoray, SEAL – A Symmetry EncourAging Loss for High Energy Physics, (2025), arXiv:2511.01982 [hep-ph]

  41. [41] Y. Xu, Z. Liu, M. Tegmark, and T. Jaakkola, Poisson flow generative models, Advances in Neural Information Processing Systems 35, 16782 (2022), arXiv:2209.11178 [cs.LG]

  42. [42] Y. Xu, Z. Liu, Y. Tian, S. Tong, M. Tegmark, and T. Jaakkola, PFGM++: Unlocking the potential of physics-inspired generative models, in International Conference on Machine Learning (PMLR, 2023), pp. 38566–38591, arXiv:2302.04265 [cs.LG]

  43. [43] V. De Bortoli, E. Mathieu, M. Hutchinson, J. Thornton, Y. Whye Teh, and A. Doucet, Riemannian Score-Based Generative Modelling, Advances in Neural Information Processing Systems 35, 2406 (2022), arXiv:2202.02763 [cs.LG]

  44. [44] V. Kawasaki-Borruat, C. Grotehans, P. Vandergheynst, and A. Gosztolai, Diffusion processes on implicit manifolds, (2026), arXiv:2604.07213 [cs.LG]

  45. [45] R. Kleiss, W. J. Stirling, and S. D. Ellis, A New Monte Carlo Treatment of Multiparticle Phase Space at High-energies, Comput. Phys. Commun. 40, 359 (1986)

  46. [46] B. Nachman and R. Winterhalder, Elsa: enhanced latent spaces for improved collider simulations, Eur. Phys. J. C 83, 843 (2023), arXiv:2305.07696 [hep-ph]

  47. [47] T. Heimel, O. Mattelaer, T. Plehn, and R. Winterhalder, Differentiable MadNIS-Lite, SciPost Phys. 18, 017 (2025), arXiv:2408.01486 [hep-ph]

  48. [48] P. T. Komiske, E. M. Metodiev, and J. Thaler, The Hidden Geometry of Particle Collisions, JHEP 07, 006, arXiv:2004.04159 [hep-ph]

  49. [49] Y.-P. Hsieh, A. Kavis, P. Rolland, and V. Cevher, Mirrored Langevin dynamics, Advances in Neural Information Processing Systems 31 (2018), arXiv:1802.10174 [cs.LG]

  50. [50] A. Hyvärinen and P. Dayan, Estimation of non-normalized statistical models by score matching, Journal of Machine Learning Research 6 (2005)

  51. [51] Z. Wang, P. Wang, K. Liu, P. Wang, Y. Fu, C.-T. Lu, C. C. Aggarwal, J. Pei, and Y. Zhou, A comprehensive survey on data augmentation, (2025), arXiv:2405.09591 [cs.LG]

  52. [52] J. Batson, C. G. Haaf, Y. Kahn, and D. A. Roberts, Topological Obstructions to Autoencoding, JHEP 04, 280, arXiv:2102.08380 [hep-ph]

  53. [53] M. Rosenblatt, Remarks on a Multivariate Transformation, Annals of Mathematical Statistics 23, 470 (1952)

  54. [54] J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H. S. Shao, T. Stelzer, P. Torrielli, and M. Zaro, The automated computation of tree-level and next-to-leading order differential cross sections, and their matching to parton shower simulations, JHEP 07, 079, arXiv:1405.0301 [hep-ph]

  55. [55] S. Brandt, C. Peyrou, R. Sosnowski, and A. Wroblewski, The Principal axis of jets. An Attempt to analyze high-energy collisions as two-body processes, Phys. Lett. 12, 57 (1964)

  56. [56] E. Farhi, A QCD Test for Jets, Phys. Rev. Lett. 39, 1587 (1977)

  57. [57] R. K. Ellis, W. J. Stirling, and B. R. Webber, QCD and Collider Physics, Vol. 8 (Cambridge University Press, 2011)

  58. [58] S. J. Parke and T. R. Taylor, An Amplitude for n Gluon Scattering, Phys. Rev. Lett. 56, 2459 (1986)

  59. [59] F. A. Berends and W. T. Giele, Recursive Calculations for Processes with n Gluons, Nucl. Phys. B 306, 759 (1988)

  60. [60] P. D. Draggiotis, A. van Hameren, and R. Kleiss, SARGE: An Algorithm for generating QCD antennas, Phys. Lett. B 483, 124 (2000), arXiv:hep-ph/0004047

  61. [61] A. J. Larkoski and E. M. Metodiev, A Theory of Quark vs. Gluon Discrimination, JHEP 10, 014, arXiv:1906.01639 [hep-ph]

  62. [62] R. Roscher, B. Bohn, M. F. Duarte, and J. Garcke, Explainable machine learning for scientific insights and discoveries, IEEE Access 8, 42200 (2020), arXiv:1905.08883

  63. [63] M. Krenn, R. Pollice, S. Y. Guo, M. Aldeghi, A. Cervera-Lierta, P. Friederich, G. dos Passos Gomes, F. Häse, A. Jinich, A. Nigam, Z. Yao, and A. Aspuru-Guzik, On scientific understanding with artificial intelligence, Nature Reviews Physics 4, 761 (2022), arXiv:2204.01467

  64. [64] R. Gambhir, M. LeBlanc, and Y. Zhou, The Pareto Frontier of Resilient Jet Tagging, in 39th Annual Conference on Neural Information Processing Systems: Includes Machine Learning and the Physical Sciences (ML4PS) (2025), arXiv:2509.19431 [hep-ph]

  65. [65] F. Cagnetta, L. Petrini, U. M. Tomasini, A. Favero, and M. Wyart, How deep neural networks learn compositional data: The random hierarchy model, Physical Review X 14, 031001 (2024), arXiv:2307.02129

  66. [66] A. Sclocchi, A. Favero, and M. Wyart, A phase transition in diffusion models reveals the hierarchical nature of data, Proceedings of the National Academy of Sciences 122, e2408799121 (2025), arXiv:2402.16991

  67. [67] A. Sclocchi, A. Favero, N. I. Levi, and M. Wyart, Probing the latent hierarchical structure of data via diffusion models, Journal of Statistical Mechanics: Theory and Experiment, 084005 (2025), arXiv:2410.13770

  68. [68] A. Favero, A. Sclocchi, F. Cagnetta, P. Frossard, and M. Wyart, How compositional generalization and creativity improve as diffusion models are trained, in Proceedings of the 42nd International Conference on Machine Learning, PMLR Vol. 267 (2025), pp. 16286–16306, arXiv:2502.12089

  69. [69] Y. Han, A. Han, W. Huang, C. Lu, and D. Zou, Can diffusion models learn hidden inter-feature rules behind images?, in Proceedings of the 42nd International Conference on Machine Learning, PMLR Vol. 267 (2025), pp. 21704–21732, arXiv:2502.04725

  70. [70] N. I. Levi and Y. Oz, The underlying universal statistical structure of natural datasets, in Forty-second International Conference on Machine Learning (2025), arXiv:2306.14975

  71. [71] N. Levi and Y. Oz, The universal statistical structure and scaling laws of chaos and turbulence, (2023), arXiv:2311.01358 [cond-mat.stat-mech]

  72. [72] P. T. Komiske, R. Mastandrea, E. M. Metodiev, P. Naik, and J. Thaler, Exploring the Space of Jets with CMS Open Data, Phys. Rev. D 101, 034009 (2020), arXiv:1908.08542 [hep-ph]

  73. [73] V. Breso-Pla, K. Greif, V. Mikuni, B. Nachman, T. Plehn, T. Wamorkar, and D. Whiteson, Explicit or Implicit? Encoding Physics at the Precision Frontier, (2026), arXiv:2603.08802 [hep-ph]

  74. [74] V. Mikuni, I. Elsharkawy, and B. Nachman, OmniCosmos: Transferring Particle Physics Knowledge Across the Cosmos, (2025), arXiv:2512.24422 [astro-ph.CO]

  75. [75] I. Elsharkawy, V. Mikuni, W. Bhimji, and B. Nachman, OmniMol: Transferring Particle Physics Knowledge to Molecular Dynamics with Point-Edge Transformers, (2026), arXiv:2601.10791 [physics.chem-ph]

  76. [76] P. Vincent, A connection between score matching and denoising autoencoders, Neural Computation 23, 1661 (2011)

  77. [77] L. N. Smith and N. Topin, Super-convergence: Very fast training of neural networks using large learning rates, in Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, Vol. 11006 (SPIE, 2019), pp. 369–386, arXiv:1708.07120 [cs.LG]