Geodesics in the Deep Linear Network

Alan Chen

arXiv: 2510.07324 · v2 · submitted 2025-09-18 · 🧮 math.DG · cs.LG · math.DS

Pith reviewed 2026-05-18 16:45 UTC · model grok-4.3

classification 🧮 math.DG · cs.LG · math.DS

keywords geodesics · deep linear networks · Riemannian submersion · balanced manifold · ordinary differential equations · full rank matrices · differential geometry

The pith

Geodesics between full rank matrices in deep linear networks follow from a system of ODEs with explicit solutions in special cases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives a general system of ordinary differential equations that describe geodesics connecting full rank matrices in the geometry of deep linear networks. It supplies explicit solutions for these equations in one special case. The central discovery is that certain straight lines inside an invariant balanced manifold are horizontal and continue to qualify as geodesics after the manifold is projected by Riemannian submersion. A reader would care because the result supplies concrete paths in a geometry that appears when linear networks are analyzed with differential-geometric tools.

Core claim

We derive a general system of ODEs and associated explicit solutions in a special case for geodesics between full rank matrices in the deep linear network geometry. In the process, we find horizontal straight lines in the invariant balanced manifold that remain geodesics under Riemannian submersion.

What carries the argument

The invariant balanced manifold together with its Riemannian submersion to the quotient, which preserves horizontal straight lines as geodesics.
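
As a toy illustration of what length preservation under the submersion means, the sketch below treats the scalar case (width 1, depth N). Everything in it is an assumption made for illustration and is not taken from the paper: the end-to-end weight is W = w^N with all N factors equal to w > 0 on the balanced manifold, the metric upstairs is Euclidean, and the metric downstairs is assumed to be g_W(Z, Z) = Z^2 / (N W^(2(N-1)/N)). Under these assumptions the projection of a straight line upstairs should have the same length as the line itself.

```python
import math

def projected_length(w0, w1, depth, steps=20000):
    """Length of W(t) = w(t)**depth under the ASSUMED scalar metric
    g_W(Z, Z) = Z**2 / (depth * W**(2 * (depth - 1) / depth)),
    where w(t) = (1 - t) * w0 + t * w1 is the straight line upstairs."""
    total = 0.0
    dt = 1.0 / steps
    for k in range(steps):
        t = (k + 0.5) * dt                          # midpoint rule
        w = (1.0 - t) * w0 + t * w1
        W = w ** depth
        dW = depth * w ** (depth - 1) * (w1 - w0)   # dW/dt by the chain rule
        g = 1.0 / (depth * W ** (2.0 * (depth - 1) / depth))
        total += math.sqrt(g * dW * dW) * dt
    return total

def upstairs_length(w0, w1, depth):
    """Euclidean length of the line t -> (w(t), ..., w(t)) in R^depth."""
    return math.sqrt(depth) * abs(w1 - w0)

if __name__ == "__main__":
    print(projected_length(0.5, 2.0, 3), upstairs_length(0.5, 2.0, 3))
```

In this scalar setting the fibers are discrete, so every tangent direction is horizontal and the check is close to a sanity test; the matrix case treated in the paper is where horizontality becomes a real constraint.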

If this is right

Geodesics are recovered by integrating the derived system of ODEs.
Special cases of the geodesics admit closed-form explicit expressions.
Horizontal straight lines inside the balanced manifold remain geodesics, and hence locally length-minimizing, after projection.
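
The first two consequences can be sketched in the scalar toy case. The code below is not the paper's ODE system: it assumes a width-1 network of depth N with the metric g_W = 1 / (N W^(2(N-1)/N)) on W > 0, for which the geodesic equation reduces to W'' = ((N-1) / (N W)) (W')^2. It integrates that equation with classical RK4 and compares the result with the closed form W(t) = ((1-t) W0^(1/N) + t W1^(1/N))^N, the projection of a straight line in the factors. The function names are hypothetical.

```python
def geodesic_rhs(W, V, depth):
    # First-order form of W'' = (depth - 1) / (depth * W) * (W')**2,
    # the geodesic equation of the ASSUMED scalar metric.
    return V, (depth - 1) / (depth * W) * V * V

def shoot(W0, V0, depth, steps=2000):
    """Integrate from t = 0 to t = 1 with classical RK4 and return W(1)."""
    h = 1.0 / steps
    W, V = W0, V0
    for _ in range(steps):
        k1W, k1V = geodesic_rhs(W, V, depth)
        k2W, k2V = geodesic_rhs(W + 0.5 * h * k1W, V + 0.5 * h * k1V, depth)
        k3W, k3V = geodesic_rhs(W + 0.5 * h * k2W, V + 0.5 * h * k2V, depth)
        k4W, k4V = geodesic_rhs(W + h * k3W, V + h * k3V, depth)
        W += h * (k1W + 2 * k2W + 2 * k3W + k4W) / 6.0
        V += h * (k1V + 2 * k2V + 2 * k3V + k4V) / 6.0
    return W

def closed_form(W0, W1, depth, t):
    """Projection of the straight line between the depth-th roots."""
    r0, r1 = W0 ** (1.0 / depth), W1 ** (1.0 / depth)
    return ((1.0 - t) * r0 + t * r1) ** depth

def initial_velocity(W0, W1, depth):
    """Derivative of closed_form at t = 0."""
    r0, r1 = W0 ** (1.0 / depth), W1 ** (1.0 / depth)
    return depth * r0 ** (depth - 1) * (r1 - r0)

if __name__ == "__main__":
    V0 = initial_velocity(1.0, 8.0, 3)
    print(shoot(1.0, V0, 3), closed_form(1.0, 8.0, 3, 1.0))
```

Shooting from W0 with the velocity of the closed-form curve should land on W1 up to integration error, which is the scalar analogue of recovering a geodesic either by integration or from an explicit formula.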

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same submersion technique might identify tractable geodesics in other quotient geometries that arise when analyzing neural-network parameter spaces.
If the geodesics align with training trajectories, they could furnish analytic benchmarks for convergence rates in linear-network optimization.
The ODE system offers a starting point for numerical algorithms that compute distances or shortest paths inside the deep linear network geometry.

Load-bearing premise

The deep linear network parameter space carries a Riemannian metric under which the balanced manifold is invariant and the projection to the quotient is a Riemannian submersion that preserves the horizontal geodesics.

What would settle it

A direct calculation or numerical integration showing that the projection of a horizontal straight line in the balanced manifold fails to satisfy the geodesic equations, or that a short segment of it is longer than some nearby curve between the same endpoints.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript derives a general system of ODEs governing geodesics between full-rank matrices on the parameter space of deep linear networks equipped with a Riemannian metric. It obtains explicit solutions in a special case and shows that the balanced manifold is invariant, with horizontal straight lines descending to geodesics under the associated Riemannian submersion, verified by direct computation of the horizontal lift and vanishing of the second fundamental form along those curves.

Significance. If the derivations hold, the work supplies explicit geodesic equations and solutions in this geometry, which is useful for analyzing optimization paths in deep linear networks. The strength lies in the direct, parameter-free verification of invariance and submersion properties via computation of the horizontal lift and second fundamental form, providing reproducible and falsifiable geometric claims without reliance on fitted parameters or post-hoc adjustments.

minor comments (3)

[§3] The transition from the general ODE system to the special-case explicit solutions would benefit from an explicit statement of the parameter restrictions that reduce the system, to make the specialization step fully transparent.
[§4.2] The verification that the second fundamental form vanishes could include a brief remark on the coordinate chart used for the computation, to aid readers in reproducing the direct calculation.
[Figure 2] The caption should reference the theorem establishing invariance of the balanced manifold to connect the illustration directly to the main result.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary and recommendation of minor revision. The referee's description accurately reflects the manuscript's contributions on the geodesic ODEs, explicit solutions, invariance of the balanced manifold, and verification via horizontal lifts and the second fundamental form.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper derives the general ODE system for geodesics directly from the Riemannian metric on the deep linear network parameter space, the invariance of the balanced manifold, and the Riemannian submersion property. Horizontal straight lines are shown to remain geodesics by explicit computation of the horizontal lift and verification that the second fundamental form vanishes along those curves. The special-case explicit solutions follow from solving the resulting ODEs under the stated assumptions. No steps reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations; the argument relies on standard differential geometry applied to the given manifold structure and is independent of its target results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the existence of a Riemannian metric on the deep linear network space, invariance of the balanced manifold, and the submersion property; these are standard domain assumptions rather than new postulates.

axioms (1)

domain assumption: The space of deep linear networks is equipped with a Riemannian metric making the balanced manifold invariant and the natural projection a Riemannian submersion.
Invoked to guarantee that horizontal straight lines remain geodesics after submersion.

pith-pipeline@v0.9.0 · 5548 in / 1234 out tokens · 33729 ms · 2026-05-18T16:45:54.271197+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Theorem 2.1 (DLN Geodesic Equations) and Theorem 2.3 (Explicit Formulas for DLN Geodesics) via Hamiltonian coordinates and horizontal straight lines on M

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Lemma 2.2 (Characterization of Straight Lines on M) and Riemannian submersion ϕ preserving horizontal geodesics

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages

[1]

Sanjeev Arora, Nadav Cohen, Noah Golowich, and Wei Hu, A convergence analysis of gradient descent for deep linear neural networks, arXiv preprint arXiv:1810.02281 (2018)

work page arXiv 2018
[2]

Sanjeev Arora, Nadav Cohen, and Elad Hazan, On the optimization of deep networks: Implicit acceleration by overparameterization, International Conference on Machine Learning, PMLR, 2018, pp. 244–253

work page 2018
[3]

Bubacarr Bah, Holger Rauhut, Ulrich Terstiege, and Michael Westdickenberg, Learning deep linear neural networks: Riemannian gradient flows and convergence to global minimizers, Information and Inference: A Journal of the IMA 11 (2022), no. 1, 307–353

work page 2022
[4]

Rajendra Bhatia, Positive definite matrices, Princeton University Press, 2009

work page 2009
[5]

Rajendra Bhatia, Tanvi Jain, and Yongdo Lim, On the Bures–Wasserstein distance between positive definite matrices, Expositiones Mathematicae 37 (2019), no. 2, 165–191

work page 2019
[6]

Nadav Cohen, Govind Menon, and Zsolt Veraszto, Deep linear networks for matrix completion—an infinite depth limit, SIAM Journal on Applied Dynamical Systems 22 (2023), no. 4, 3208–3232

work page 2023
[7]

David R. Hilbert and S. Cohn-Vossen, Geometry and the Imagination, Chelsea Publishing, 1952

work page 1952
[8]

Narendra Karmarkar, Riemannian geometry underlying interior-point methods for linear programming, Contemp. Math. 114 (1990), 51–75

work page 1990
[9]

Govind Menon, The geometry of the deep linear network, arXiv preprint arXiv:2411.09004 (2024)

work page arXiv 2024
[10]

Govind Menon and Tianmin Yu, An entropy formula for the deep linear network, 2025

work page 2025
[11]

Tianmin Yu, Riemannian Langevin equation and its applications in random matrix theory and Gibbs sampling problems, Doctoral dissertation, Brown University, 2024.

Appendix A. Informal Note on Equation 2.11. The geodesic in Equation 2.11 also appears in another geometry [4] when the endpoints are positive definite diagonal matrices. In particular, we endow P d w...

work page 2024
[12]

Interestingly, $g^N$ as $N \to \infty$ coincides with the trace metric under these assumptions. Consider a variant of $A_{N,\cdot}$ with an additional $\frac{1}{N}$ factor in front so that an $N \to \infty$ limit exists. Then, indeed, if $P$ is diagonal and commutes with $Z$,

$$A_{N,P}(Z) = \frac{1}{N} \sum_{p=1}^{N} (P^2)^{\frac{N-p}{N}} \, Z \, (P^2)^{\frac{p-1}{N}} = (P^2)^{\frac{N-1}{N}} Z. \tag{A.3}$$

The limit as $N \to \infty$ is just $P^2 Z$. Thus, $A_{\infty,P}^{-1}(Z) = P^{-2} Z$ and we conclude...

work page
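
The identity (A.3) quoted in entry [12] can be checked numerically. The sketch below is a pure-Python check under the stated hypotheses: P is diagonal with positive entries and Z commutes with P. It uses the fact that for P = diag(p) the matrix (P^2)^a Z (P^2)^b has entries p_i^(2a) Z_ij p_j^(2b). The function names `averaged_operator` and `closed_form_a3` are hypothetical and do not come from the paper.

```python
def averaged_operator(p, Z, N):
    """(1/N) * sum_{k=1..N} (P^2)^((N-k)/N) Z (P^2)^((k-1)/N) for P = diag(p), p > 0."""
    d = len(p)
    out = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(d):
            # For diagonal P the operator rescales each entry Z[i][j] by a weight.
            weight = sum(
                (p[i] ** 2) ** ((N - k) / N) * (p[j] ** 2) ** ((k - 1) / N)
                for k in range(1, N + 1)
            ) / N
            out[i][j] = weight * Z[i][j]
    return out

def closed_form_a3(p, Z, N):
    """Right-hand side of (A.3): (P^2)^((N-1)/N) Z for P = diag(p)."""
    d = len(p)
    return [[(p[i] ** 2) ** ((N - 1) / N) * Z[i][j] for j in range(d)] for i in range(d)]

if __name__ == "__main__":
    p = [2.0, 2.0, 0.5]                  # P = diag(2, 2, 0.5)
    Z = [[1.0, 3.0, 0.0],                # block structure matches the repeated
         [-1.0, 2.0, 0.0],               # eigenvalue of P, so Z commutes with P
         [0.0, 0.0, 4.0]]
    print(averaged_operator(p, Z, 5))
    print(closed_form_a3(p, Z, 5))
```

With a Z that has a nonzero entry linking the eigenvalue 2 block to the eigenvalue 0.5 block, the two functions disagree, which shows where the commutation hypothesis is used. For large N both outputs approach the entries of P^2 Z, matching the stated limit.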
