The Latent Information Geometry of Jet Classification
Pith reviewed 2026-05-15 16:33 UTC · model grok-4.3
The pith
Latent spaces of jet classification networks carry curvature and nonmetricities that reflect physical distinctions between quarks and gluons.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that learned latent geometries in jet classifiers can be analyzed using curvature and nonmetricities from information geometry, and demonstrate this by extracting physical insights from networks performing quark-gluon discrimination and fat jet tagging.
What carries the argument
Curvature and nonmetricities in the latent information geometry of decoder and classifier networks.
If this is right
- The geometry reveals which jet features drive the classification decisions in quark-gluon tagging.
- Nonmetricities measure how far the learned representations depart from the metric-compatible (Levi-Civita) structure of standard Riemannian geometry.
- These tools can be used to compare different network architectures based on their induced geometries.
- Insights from the latent space can guide improvements in jet tagging performance.
Where Pith is reading between the lines
- This approach might extend to multi-class problems or other collider observables beyond jets.
- Correlations between curvature and specific kinematic variables could lead to new theoretical models.
- Testing on simulated data with known physics could validate the geometric interpretations.
Load-bearing premise
The latent representations learned by the networks form a geometry that information-geometry tools can describe meaningfully, and that description yields genuine physical insight.
What would settle it
Observing no correlation between the computed curvature in latent space and the known physical properties distinguishing quark from gluon jets would challenge the claim.
Original abstract
Latent representations are an important theme in modern machine learning. Any network training with the notion of locality introduces a latent geometry which we can analyze with the help of differential geometry, specifically information geometry. We introduce the main concepts needed to analyze learned latent geometries, specifically curvature and nonmetricities, and show how they can be used for decoder and classifier geometries. We then apply our new methods to understand the physics behind binary quark-gluon classification and three-fold fat jet tagging.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces concepts from information geometry—specifically curvature and nonmetricities—to analyze latent representations induced by locality-aware neural networks. It shows how these quantities can be applied to decoder and classifier geometries and then uses them to extract physical insights from binary quark-gluon classification and three-class fat-jet tagging tasks.
Significance. If the geometric measures are shown to correlate robustly with established jet observables (energy flow, angular correlations, parton-shower variables), the framework could supply a principled, differential-geometric route to model interpretability in high-energy physics. This would be a genuine advance over post-hoc saliency methods, provided the mapping from latent geometry to physics is made quantitative and reproducible.
major comments (2)
- [§4.2] §4.2 (quark-gluon results): the reported curvature values are stated to distinguish quark from gluon jets, yet no baseline comparison to a linear classifier or to standard jet-shape observables (e.g., girth, jet mass) is provided; without this control it is unclear whether the geometric signal exceeds what is already captured by conventional discriminants.
- [§5] §5 (fat-jet tagging): the nonmetricity tensor is claimed to encode three-prong substructure, but the paper does not report a quantitative test (e.g., correlation with N-subjettiness ratios or a permutation test on the latent coordinates); the physical interpretation therefore rests on qualitative visualization alone.
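The baseline control requested in the first major comment can be sketched as a direct AUC comparison between a curvature-based score and a conventional observable. The block below is a minimal illustration only: the arrays `curv_*` and `girth_*` are synthetic placeholders, not values from the paper, and the pairwise (Mann-Whitney) AUC estimator is an assumed choice of metric.

```python
import numpy as np

def auc(scores_signal, scores_background):
    """Probability that a random signal score exceeds a random
    background score, with ties counted as one half."""
    s = np.asarray(scores_signal, dtype=float)[:, None]
    b = np.asarray(scores_background, dtype=float)[None, :]
    return float(np.mean((s > b) + 0.5 * (s == b)))

# Synthetic placeholders (assumption): per-jet scalar scores for each class.
rng = np.random.default_rng(0)
curv_gluon = rng.normal(1.0, 1.0, size=1000)   # stand-in for latent curvature
curv_quark = rng.normal(0.0, 1.0, size=1000)
girth_gluon = rng.normal(0.8, 1.0, size=1000)  # stand-in for jet girth
girth_quark = rng.normal(0.0, 1.0, size=1000)

auc_curvature = auc(curv_gluon, curv_quark)
auc_girth = auc(girth_gluon, girth_quark)
print(f"curvature AUC: {auc_curvature:.3f}, girth AUC: {auc_girth:.3f}")
```

On real data, the geometric signal would add something only if the curvature AUC exceeds the girth and jet-mass AUCs by more than their statistical uncertainty.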
minor comments (2)
- [§2] Notation for the Fisher metric and its pull-back to the latent space is introduced without an explicit equation reference; adding a short appendix deriving the discrete estimator from network activations would improve reproducibility.
- [Figure 3] Figure 3 (latent-space embeddings): axis labels and color scales are missing; the reader cannot judge the dynamic range of the plotted curvature.
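To make the first minor comment concrete, here is a minimal sketch of one way the Fisher metric of a classifier could be pulled back to latent coordinates, using g_ij(z) = Σ_c p_c(z) ∂_i log p_c(z) ∂_j log p_c(z). Everything in it is an assumption for illustration: the linear-softmax head `class_probs`, the weights `W` and `b`, and the finite-difference derivative are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - np.max(logits))  # shift for numerical stability
    return e / e.sum()

def class_probs(z, W, b):
    """Hypothetical classifier head: linear map on the latent z, then softmax."""
    return softmax(W @ z + b)

def fisher_metric(z, W, b, eps=1e-5):
    """Pulled-back Fisher metric g_ij(z) = sum_c p_c d_i(log p_c) d_j(log p_c),
    with latent-space derivatives estimated by central finite differences."""
    d = z.shape[0]
    p = class_probs(z, W, b)
    grad_logp = np.zeros((p.shape[0], d))
    for i in range(d):
        step = np.zeros(d)
        step[i] = eps
        lp_plus = np.log(class_probs(z + step, W, b))
        lp_minus = np.log(class_probs(z - step, W, b))
        grad_logp[:, i] = (lp_plus - lp_minus) / (2 * eps)
    return np.einsum("c,ci,cj->ij", p, grad_logp, grad_logp)

# Toy setup (assumption): 2 classes, 3 latent dimensions, random weights.
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
b = rng.normal(size=2)
z = rng.normal(size=3)
g = fisher_metric(z, W, b)
```

For a C-class classifier this metric has rank at most C − 1, so a binary classifier yields a degenerate metric on any latent space with more than one dimension. An appendix of the kind the referee asks for would need to say how that degeneracy is handled.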
Simulated Author's Rebuttal
We are grateful to the referee for their insightful comments, which have helped us identify areas where the manuscript can be improved. We address the major comments point by point below.
- Referee: [§4.2] §4.2 (quark-gluon results): the reported curvature values are stated to distinguish quark from gluon jets, yet no baseline comparison to a linear classifier or to standard jet-shape observables (e.g., girth, jet mass) is provided; without this control it is unclear whether the geometric signal exceeds what is already captured by conventional discriminants.
  Authors: We agree that baseline comparisons would strengthen the results. In the revised manuscript, we will add a comparison of the curvature discrimination to that of a linear classifier trained on the same latent features, as well as to standard jet-shape observables such as girth and jet mass. This will clarify the extent to which the geometric signal provides information beyond conventional discriminants.
  Revision: yes
- Referee: [§5] §5 (fat-jet tagging): the nonmetricity tensor is claimed to encode three-prong substructure, but the paper does not report a quantitative test (e.g., correlation with N-subjettiness ratios or a permutation test on the latent coordinates); the physical interpretation therefore rests on qualitative visualization alone.
  Authors: We acknowledge that the physical interpretation would benefit from quantitative validation. In the revision, we will report correlations between the nonmetricity tensor components and N-subjettiness ratios, along with a permutation test on the latent coordinates to assess the significance of the observed three-prong substructure encoding.
  Revision: yes
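The quantitative check promised in the second response could take roughly the following form: a permutation test on the Pearson correlation between a nonmetricity-derived scalar and an N-subjettiness ratio. This is a sketch under stated assumptions. The arrays `geom` and `tau32` are synthetic stand-ins, the function name `permutation_pvalue` is hypothetical, and the paper may well use a different statistic.

```python
import numpy as np

def permutation_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation test for the Pearson correlation of x and y.
    Shuffling y breaks any pairing with x, which samples the null hypothesis."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(x, y)[0, 1]
    count = 0
    for _ in range(n_perm):
        permuted = rng.permutation(y)
        if abs(np.corrcoef(x, permuted)[0, 1]) >= abs(observed):
            count += 1
    # The add-one correction keeps the estimated p-value strictly positive.
    return observed, (count + 1) / (n_perm + 1)

# Synthetic stand-ins (assumption), not data from the paper.
rng = np.random.default_rng(1)
geom = rng.normal(size=500)                 # per-jet nonmetricity scalar
tau32 = 0.5 * geom + rng.normal(size=500)   # per-jet N-subjettiness ratio
r, p_value = permutation_pvalue(geom, tau32)
print(f"Pearson r = {r:.3f}, permutation p-value = {p_value:.4f}")
```

Because the add-one correction bounds the smallest reportable p-value at 1/(n_perm + 1), the number of permutations should be chosen with the target significance level in mind.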
Circularity Check
No significant circularity detected
The paper introduces standard information-geometry tools (curvature, nonmetricities) to analyze latent representations induced by locality-aware networks for jet classification tasks. No load-bearing step reduces a claimed prediction or uniqueness result to a fitted parameter, self-citation chain, or definitional tautology within the same data; the geometric quantities are imported from established differential geometry rather than constructed from the network outputs themselves. The central claim therefore remains independent of its own fitted values and does not exhibit any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  We introduce the main concepts needed to analyze learned latent geometries, specifically curvature and nonmetricities, and show how they can be used for decoder and classifier geometries.
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean: alpha_pin_under_high_calibration
  Relation between the paper passage and the cited Recognition theorem: unclear
  The nonmetricity tensor for the (±1)-connection is the ACT... $C_1 = C_{ijk} C^{ijk}$, $C_2 = \tau^i \tau_i$, $C_3 = \tilde{C}_{ijk} \tilde{C}^{ijk}$, $C_4 = (C_{ijk} C^{ijk} - \tau^i \tau_i)/4$
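The scalar invariants in the quoted passage can be evaluated numerically once a metric and a totally symmetric cubic tensor are available. The sketch below rests on several assumptions that the excerpt does not confirm: a hypothetical linear-softmax classifier head with weights `W` and `b`, the cubic tensor taken as C_ijk = Σ_c p_c ∂_i log p_c ∂_j log p_c ∂_k log p_c, and τ_i taken as the trace g^{jk} C_ijk. The invariant C_3 is omitted because the definition of C̃ is not given in the excerpt.

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())  # shift for numerical stability
    return e / e.sum()

def geometry(z, W, b):
    """Fisher metric g_ij and cubic tensor C_ijk for a hypothetical
    linear-softmax head, using the analytic score d(log p_c)/dz_i."""
    p = softmax(W @ z + b)
    score = W - p @ W  # score[c, i] = W[c, i] - sum_k p[k] * W[k, i]
    g = np.einsum("c,ci,cj->ij", p, score, score)
    C = np.einsum("c,ci,cj,ck->ijk", p, score, score, score)
    return g, C

def invariants(g, C):
    """C1 = C_ijk C^ijk, C2 = tau^i tau_i, C4 = (C1 - C2) / 4,
    with indices raised by the inverse metric.
    Assumption: tau_i = g^{jk} C_ijk (trace of the cubic tensor)."""
    g_inv = np.linalg.inv(g)
    C_up = np.einsum("abc,ia,jb,kc->ijk", C, g_inv, g_inv, g_inv)
    C1 = np.einsum("ijk,ijk->", C, C_up)
    tau = np.einsum("jk,ijk->i", g_inv, C)
    C2 = tau @ g_inv @ tau
    return C1, C2, (C1 - C2) / 4

# Toy setup (assumption): 3 classes and a 2-dimensional latent space,
# so that the metric is generically invertible.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))
b = rng.normal(size=3)
z = rng.normal(size=2)
g, C = geometry(z, W, b)
C1, C2, C4 = invariants(g, C)
```

Because every index is contracted with the metric, these quantities should not change under a linear reparametrization of the latent coordinates. That property makes a convenient sanity check for any implementation.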
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.