Geodesics in the Deep Linear Network
Pith reviewed 2026-05-18 16:45 UTC · model grok-4.3
The pith
Geodesics between full-rank matrices in the deep linear network geometry solve a system of ODEs, with explicit solutions in a special case.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We derive a general system of ODEs and associated explicit solutions in a special case for geodesics between full rank matrices in the deep linear network geometry. In the process, we find horizontal straight lines in the invariant balanced manifold that remain geodesics under Riemannian submersion.
What carries the argument
The invariant balanced manifold together with its Riemannian submersion to the quotient, which preserves horizontal straight lines as geodesics.
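The mechanism can be illustrated in a simpler, well-known quotient geometry. The sketch below is not the deep linear network metric: it assumes the Bures–Wasserstein setting, where the map A -> A A^T from matrices with the Frobenius metric to positive definite matrices is a Riemannian submersion. A velocity of the form H A with H symmetric is horizontal, and when I + H is positive definite the straight line A + t H A projects to a geodesic whose length equals the Bures–Wasserstein distance between its endpoints. All function names here are hypothetical.

```python
import numpy as np

def psd_sqrt(S):
    """Symmetric positive semidefinite square root via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bw_distance(P, Q):
    """Bures-Wasserstein distance between positive semidefinite P and Q."""
    rP = psd_sqrt(P)
    cross = psd_sqrt(rP @ Q @ rP)
    d2 = np.trace(P) + np.trace(Q) - 2.0 * np.trace(cross)
    return np.sqrt(max(d2, 0.0))

def horizontal_line_demo(n=3, seed=0):
    """Compare the upstairs length of a horizontal line with the distance downstairs."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n)) + 3.0 * np.eye(n)  # point in the total space
    S = rng.standard_normal((n, n))
    S = S + S.T
    H = 0.1 * S / np.linalg.norm(S, 2)  # symmetric, spectral norm 0.1, so I + H is PD
    X = H @ A                           # horizontal velocity: A.T @ X is symmetric
    B = A + X                           # endpoint of the straight line at t = 1
    upstairs_length = np.linalg.norm(X, "fro")
    downstairs_distance = bw_distance(A @ A.T, B @ B.T)
    return upstairs_length, downstairs_distance, A, X
```

Calling `horizontal_line_demo()` returns two lengths that should agree up to floating-point error, which is the submersion statement in this stand-in geometry. The paper's claim is the analogous statement for the balanced manifold and its quotient.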
If this is right
- Geodesics are recovered by integrating the derived system of ODEs.
- In a special case, the geodesics admit closed-form expressions.
- Horizontal straight lines inside the balanced manifold project to geodesics of the quotient, which are locally (not necessarily globally) length-minimizing.
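For orientation, the generic shape of such a system on any Riemannian manifold is recalled here. This is standard differential geometry, not the paper's Theorem 2.1: the local coordinates x^i, the momenta p_i, and the metric components g_{ij} are notation assumed for this sketch, and the paper's specific equations are not reproduced.

```latex
% Second-order (Lagrangian) form of the geodesic equation
\ddot{x}^k + \Gamma^k_{ij}(x)\,\dot{x}^i \dot{x}^j = 0,
\qquad
\Gamma^k_{ij} = \tfrac{1}{2}\, g^{kl}\left(\partial_i g_{jl} + \partial_j g_{il} - \partial_l g_{ij}\right).

% Equivalent first-order (Hamiltonian) form
H(x,p) = \tfrac{1}{2}\, g^{ij}(x)\, p_i p_j,
\qquad
\dot{x}^i = \frac{\partial H}{\partial p_i} = g^{ij}(x)\, p_j,
\qquad
\dot{p}_i = -\frac{\partial H}{\partial x^i} = -\tfrac{1}{2}\, \partial_i g^{jk}(x)\, p_j p_k.
```

The first-order form is usually the more convenient one to hand to a numerical integrator, since it needs only the inverse metric and its derivatives.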
Where Pith is reading between the lines
- The same submersion technique might identify tractable geodesics in other quotient geometries that arise when analyzing neural-network parameter spaces.
- If the geodesics align with training trajectories, they could furnish analytic benchmarks for convergence rates in linear-network optimization.
- The ODE system offers a starting point for numerical algorithms that compute distances or shortest paths inside the deep linear network geometry.
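A minimal sketch of what such a numerical algorithm could look like, using a stand-in metric rather than the deep linear network metric. It assumes the trace metric restricted to positive diagonal matrices, sum_i dx_i^2 / x_i^2, whose geodesic equation is x'' = (x')^2 / x entrywise with closed-form solution x(t) = a (b/a)^t. The initial velocity is known exactly here; for a general system it would have to be found by shooting or a boundary-value solver. Function names are hypothetical.

```python
import numpy as np

def geodesic_rhs(state):
    """First-order form of x'' = (x')^2 / x, applied entrywise."""
    x, v = state
    return np.array([v, v * v / x])

def integrate_geodesic(a, b, steps=200):
    """Classical RK4 from t = 0 to t = 1, starting at a with the velocity aimed at b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    state = np.array([a, a * np.log(b / a)])  # rows: position, velocity
    h = 1.0 / steps
    for _ in range(steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(state + 0.5 * h * k1)
        k3 = geodesic_rhs(state + 0.5 * h * k2)
        k4 = geodesic_rhs(state + h * k3)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state[0]

def trace_metric_distance(a, b):
    """Geodesic distance between positive diagonals a and b in the stand-in metric."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.linalg.norm(np.log(b / a)))
```

Integrating from a = (1, 2, 0.5) toward b = (3, 1, 4) should land on b up to the RK4 discretization error, which gives a simple correctness check before swapping in a more complicated right-hand side.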
Load-bearing premise
The deep linear network parameter space carries a Riemannian metric under which the balanced manifold is invariant and the projection to the quotient is a Riemannian submersion that preserves the horizontal geodesics.
What would settle it
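A direct computation or numerical integration showing that the projection of a horizontal straight line in the balanced manifold fails to satisfy the geodesic equation of the quotient metric, or that it is strictly longer than some other curve joining two sufficiently close points on it. A longer length between distant endpoints alone would not settle the matter, since geodesics need only be locally length-minimizing.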
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript derives a general system of ODEs governing geodesics between full-rank matrices on the parameter space of deep linear networks equipped with a Riemannian metric. It obtains explicit solutions in a special case and shows that the balanced manifold is invariant, with horizontal straight lines descending to geodesics under the associated Riemannian submersion, verified by direct computation of the horizontal lift and vanishing of the second fundamental form along those curves.
Significance. If the derivations hold, the work supplies explicit geodesic equations and solutions in this geometry, which is useful for analyzing optimization paths in deep linear networks. The strength lies in the direct, parameter-free verification of invariance and submersion properties via computation of the horizontal lift and second fundamental form, providing reproducible and falsifiable geometric claims without reliance on fitted parameters or post-hoc adjustments.
minor comments (3)
- [§3] The transition from the general ODE system to the special-case explicit solutions would benefit from an explicit statement of the parameter restrictions that reduce the system, to make the specialization step fully transparent.
- [§4.2] The verification that the second fundamental form vanishes could include a brief remark on the coordinate chart used for the computation, to aid readers in reproducing the direct calculation.
- [Figure 2] The caption should reference the theorem establishing invariance of the balanced manifold to connect the illustration directly to the main result.
Simulated Author's Rebuttal
We thank the referee for the positive summary and recommendation of minor revision. The referee's description accurately reflects the manuscript's contributions on the geodesic ODEs, explicit solutions, invariance of the balanced manifold, and verification via horizontal lifts and the second fundamental form.
Circularity Check
No significant circularity; derivation is self-contained
The paper derives the general ODE system for geodesics directly from the Riemannian metric on the deep linear network parameter space, the invariance of the balanced manifold, and the Riemannian submersion property. Horizontal straight lines are shown to remain geodesics by explicit computation of the horizontal lift and verification that the second fundamental form vanishes along those curves. The special-case explicit solutions follow from solving the resulting ODEs under the stated assumptions. No steps reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations; the argument relies on standard differential geometry applied to the given manifold structure and is independent of its target results.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The space of deep linear networks is equipped with a Riemannian metric making the balanced manifold invariant and the natural projection a Riemannian submersion.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Theorem 2.1 (DLN Geodesic Equations) and Theorem 2.3 (Explicit Formulas for DLN Geodesics) via Hamiltonian coordinates and horizontal straight lines on M
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Lemma 2.2 (Characterization of Straight Lines on M) and Riemannian submersion ϕ preserving horizontal geodesics
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1]
- [2] Sanjeev Arora, Nadav Cohen, and Elad Hazan, On the optimization of deep networks: Implicit acceleration by overparameterization, International Conference on Machine Learning, PMLR, 2018, pp. 244–253
- [3] Bubacarr Bah, Holger Rauhut, Ulrich Terstiege, and Michael Westdickenberg, Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers, Information and Inference: A Journal of the IMA 11 (2022), no. 1, 307–353
- [4] Rajendra Bhatia, Positive definite matrices, Princeton University Press, 2009
- [5] Rajendra Bhatia, Tanvi Jain, and Yongdo Lim, On the Bures–Wasserstein distance between positive definite matrices, Expositiones Mathematicae 37 (2019), no. 2, 165–191
- [6] Nadav Cohen, Govind Menon, and Zsolt Veraszto, Deep linear networks for matrix completion—an infinite depth limit, SIAM Journal on Applied Dynamical Systems 22 (2023), no. 4, 3208–3232
- [7] David R. Hilbert and S. Cohn-Vossen, Geometry and the Imagination, Chelsea Publishing, 1952
- [8] Narendra Karmarkar, Riemannian geometry underlying interior-point methods for linear programming, Contemp. Math. 114 (1990), 51–75. 14 ALAN CHEN
- [9]
- [10] Govind Menon and Tianmin Yu, An entropy formula for the deep linear network, 2025
- [11] Tianmin Yu, Riemannian Langevin equation and its applications in random matrix theory and Gibbs sampling problems, Doctoral dissertation, Brown University, 2024.
  Appendix A. Informal Note on Equation 2.11. The geodesic in Equation 2.11 also appears in another geometry [4] when the endpoints are positive definite diagonal matrices. In particular, we endow $P_d$ w...
- [12] Interestingly, $g^N$ as $N \to \infty$ coincides with the trace metric under these assumptions. Consider a variant of $A_{N,\cdot}$ with an additional $\frac{1}{N}$ factor in front so that an $N \to \infty$ limit exists. Then, indeed, if $P$ is diagonal and commutes with $Z$,
  $$A_{N,P}(Z) = \frac{1}{N}\sum_{p=1}^{N} (P^2)^{\frac{N-p}{N}}\, Z\, (P^2)^{\frac{p-1}{N}} = (P^2)^{\frac{N-1}{N}} Z. \tag{A.3}$$
  The limit as $N \to \infty$ is just $P^2 Z$. Thus, $A_{\infty,P}^{-1} = P^{-2}Z$ and we conclude...
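The identity in (A.3) can be checked numerically. The sketch below assumes the rescaled operator exactly as written in the excerpt (with the extra 1/N factor), a diagonal P passed as a vector of its diagonal entries, and a Z chosen to commute with P. The function names are hypothetical.

```python
import numpy as np

def rescaled_operator(p_diag, Z, N):
    """(1/N) * sum_{p=1}^{N} (P^2)^((N-p)/N) Z (P^2)^((p-1)/N) for diagonal P."""
    p2 = np.asarray(p_diag, dtype=float) ** 2
    Z = np.asarray(Z, dtype=float)
    total = np.zeros_like(Z)
    for p in range(1, N + 1):
        left = np.diag(p2 ** ((N - p) / N))
        right = np.diag(p2 ** ((p - 1) / N))
        total += left @ Z @ right
    return total / N

def closed_form(p_diag, Z, N):
    """Right-hand side of (A.3): (P^2)^((N-1)/N) Z."""
    p2 = np.asarray(p_diag, dtype=float) ** 2
    return np.diag(p2 ** ((N - 1) / N)) @ np.asarray(Z, dtype=float)

# P = diag(2, 2, 3) commutes with any Z that is block diagonal
# with a 2x2 block followed by a 1x1 block.
p_diag = np.array([2.0, 2.0, 3.0])
Z = np.array([[1.0, 0.5, 0.0],
              [-0.3, 2.0, 0.0],
              [0.0, 0.0, 1.5]])
limit = np.diag(p_diag ** 2) @ Z  # P^2 Z, the stated N -> infinity limit
```

For any fixed N the two functions should agree to rounding error, and for large N both should approach `limit`. If Z does not commute with P, the closed form no longer applies and the two sides generally differ.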