
arxiv: 2509.16113 · v2 · submitted 2025-09-19 · 🧮 math.OC

A generalized canonical metric for optimization on the indefinite Stiefel manifold

Pith reviewed 2026-05-18 15:24 UTC · model grok-4.3

classification 🧮 math.OC
keywords indefinite Stiefel manifold · Riemannian optimization · generalized canonical metric · quasi-geodesic · retraction · Riemannian gradient descent

The pith

A generalized canonical metric simplifies Riemannian optimization on the indefinite Stiefel manifold.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a new Riemannian metric, termed the generalized canonical metric, for the indefinite Stiefel manifold. This metric allows computation of the Riemannian gradient without repeatedly solving Lyapunov matrix equations, unlike the previous family of metrics. The authors also derive a quasi-geodesic curve and a retraction based on it to support manifold optimization algorithms such as Riemannian gradient descent. Numerical tests show the new method performs competitively with lower computational cost per step.

Core claim

The central claim is that the generalized canonical metric equips the indefinite Stiefel manifold with a Riemannian structure in which the gradient of the objective function has a closed-form expression, so no Lyapunov equation has to be solved at each iteration. The same structure still permits the construction of a quasi-geodesic and an associated retraction, which are used to extend Euclidean optimization methods to the manifold.
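To give a sense of the step the new metric is meant to avoid, here is a minimal sketch of a dense Lyapunov solve. It assumes a generic equation of the form M W + W M = C with M symmetric positive definite; the exact equation from the prior work is not reproduced on this page, and the function name `solve_lyapunov_sym` and the test matrices are purely illustrative.

```python
import numpy as np

def solve_lyapunov_sym(M, C):
    """Solve M @ W + W @ M = C for W, assuming M is symmetric positive definite."""
    # Diagonalize M = Q diag(lam) Q^T; the equation then decouples entrywise.
    lam, Q = np.linalg.eigh(M)
    C_hat = Q.T @ C @ Q
    # In the eigenbasis: W_hat[i, j] * (lam[i] + lam[j]) = C_hat[i, j].
    W_hat = C_hat / (lam[:, None] + lam[None, :])
    return Q @ W_hat @ Q.T

# Hypothetical test data (not taken from the paper).
rng = np.random.default_rng(0)
k = 4
B = rng.standard_normal((k, k))
M = B.T @ B + k * np.eye(k)      # symmetric positive definite by construction
C = rng.standard_normal((k, k))
C = C + C.T                      # symmetric right-hand side

W = solve_lyapunov_sym(M, C)
residual = np.linalg.norm(M @ W + W @ M - C)
print(f"Lyapunov residual: {residual:.2e}")
```

The eigendecomposition costs a cubic number of operations in the size of M. That is the kind of per-gradient work a closed-form gradient formula would remove, although the actual savings depend on the problem dimensions.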

What carries the argument

The generalized canonical metric, a Riemannian metric on the indefinite Stiefel manifold chosen so that the orthogonal projection and gradient formula become simpler and cheaper to evaluate.

If this is right

  • The Riemannian gradient descent algorithm on this manifold becomes cheaper to run because each gradient step avoids a matrix equation solve.
  • Quasi-geodesics provide a new way to move along the manifold that respects the new geometry.
  • Retractions derived from the quasi-geodesic enable practical implementation of the optimization method.
  • Performance comparisons indicate the new approach is at least as effective as the prior Lyapunov-based method.
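
The role a retraction plays in the points above can be illustrated with a small sketch. It is not the paper's quasi-geodesic retraction: it swaps in a Cayley-type update, which is known to preserve constraints of the form XᵀAX = J whenever the generator Ω satisfies that AΩ is skew-symmetric. The constraint form XᵀAX = J, the matrices `A`, `J`, `T`, the cost `f`, and the helper names are all assumptions made for illustration.

```python
import numpy as np

def cayley_step(X, A, S, t):
    """Return (I - t/2 * Omega)^{-1} (I + t/2 * Omega) X with Omega = A^{-1} S.

    S must be skew-symmetric, so that A @ Omega is skew and X^T A X is preserved.
    """
    n = A.shape[0]
    Omega = np.linalg.solve(A, S)
    I = np.eye(n)
    return np.linalg.solve(I - 0.5 * t * Omega, (I + 0.5 * t * Omega) @ X)

def descent_generator(X, A, G):
    """Skew-symmetric S for which the curve t -> cayley_step(X, A, S, t)
    has a non-positive initial slope of the cost whose Euclidean gradient is G."""
    M = np.linalg.solve(A, G) @ X.T          # A^{-1} G X^T
    return -(M - M.T)

# Hypothetical feasible point: X^T A X = J with an indefinite A.
A = np.diag([2.0, 1.0, 3.0, -1.0, -2.0])
J = np.diag([1.0, -1.0])
X = np.zeros((5, 2))
X[0, 0] = 1.0 / np.sqrt(2.0)
X[3, 1] = 1.0

# Illustrative cost: squared distance to a fixed random target T.
rng = np.random.default_rng(1)
T = rng.standard_normal((5, 2))
f = lambda Y: 0.5 * np.linalg.norm(Y - T) ** 2
G = X - T                                    # Euclidean gradient of f at X
S = descent_generator(X, A, G)

# Simple backtracking: halve the step until the cost decreases.
t = 1.0
for _ in range(40):
    Y = cayley_step(X, A, S, t)
    if f(Y) < f(X):
        break
    t *= 0.5

feasibility_error = np.linalg.norm(Y.T @ A @ Y - J)
print(f"step {t:.3g}, cost {f(X):.4f} -> {f(Y):.4f}, feasibility error {feasibility_error:.2e}")
```

The point of the sketch is only that a curve-based update keeps the iterate on the constraint set without any re-projection. Which curve is cheapest and best behaved under the new metric is precisely what the paper's quasi-geodesic construction addresses.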

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the metric generalizes well, it could reduce costs for larger-scale problems in scientific computing modeled on this manifold.
  • The construction might suggest similar canonical metrics for related matrix manifolds with indefinite constraints.
  • Testable extension: apply the method to specific problems like orthogonal Procrustes with indefinite signatures and measure wall-clock time savings.

Load-bearing premise

That the proposed generalized canonical metric is positive definite and compatible with the manifold structure so that it truly defines a Riemannian metric.

What would settle it

A numerical example or algebraic counterexample in which the new metric fails to produce a positive definite inner product on the tangent space or in which the resulting optimization algorithm diverges while the previous method converges.
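
A numerical check of this kind could look roughly as follows. The sketch assumes the manifold is given by XᵀAX = J, builds a Frobenius-orthonormal basis of the tangent space at a point, and inspects the smallest eigenvalue of the Gram matrix of a candidate inner product. The paper's metric is not reproduced here; the two candidates below (the Frobenius inner product and the indefinite form tr(ξᵀAη)) are stand-ins chosen so that one passes and one fails, and every name is hypothetical.

```python
import numpy as np

def tangent_basis(X, A, tol=1e-10):
    """Frobenius-orthonormal basis of {xi : X^T A xi + xi^T A X = 0}."""
    n, k = X.shape
    columns = []
    for idx in range(n * k):
        E = np.zeros(n * k)
        E[idx] = 1.0
        C = X.T @ A @ E.reshape(n, k)
        columns.append((C + C.T).ravel())    # image of one unit matrix
    L = np.array(columns).T                  # linearized constraint, shape (k*k, n*k)
    _, s, Vt = np.linalg.svd(L)
    rank = int(np.sum(s > tol))
    return [v.reshape(n, k) for v in Vt[rank:]]   # rows spanning the null space

def min_gram_eigenvalue(metric, X, A):
    """Smallest eigenvalue of the Gram matrix of `metric` on the tangent space at X."""
    basis = tangent_basis(X, A)
    gram = np.array([[metric(X, u, v) for v in basis] for u in basis])
    gram = 0.5 * (gram + gram.T)             # remove rounding asymmetry
    return np.linalg.eigvalsh(gram).min()

# Hypothetical feasible point with X^T A X = diag(1, -1).
A = np.diag([2.0, 1.0, 3.0, -1.0, -2.0])
X = np.zeros((5, 2))
X[0, 0] = 1.0 / np.sqrt(2.0)
X[3, 1] = 1.0

frobenius = lambda X, u, v: np.trace(u.T @ v)          # positive definite
indefinite = lambda X, u, v: np.trace(u.T @ A @ v)     # not a Riemannian metric

dim = len(tangent_basis(X, A))
min_frob = min_gram_eigenvalue(frobenius, X, A)
min_indef = min_gram_eigenvalue(indefinite, X, A)
print(dim, min_frob, min_indef)
```

A strictly positive smallest eigenvalue at every sampled point is consistent with positive definiteness but does not prove it. A single negative or zero value, as for the second candidate here, is a counterexample.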

Original abstract

Various tasks in scientific computing can be modeled as an optimization problem on the indefinite Stiefel manifold. We address this using the Riemannian approach, which basically consists of equipping the feasible set with a Riemannian metric, preparing geometric tools such as orthogonal projections, formulae for Riemannian gradient, retraction and then extending an unconstrained optimization algorithm on the Euclidean space to the established manifold. The choice for the metric undoubtedly has a great influence on the method. In the previous work [D.V. Tiep and N.T. Son, A Riemannian gradient descent method for optimization on the indefinite Stiefel manifold, arXiv:2410.22068v2[math.OC]], a tractable metric, which is indeed a family of Riemannian metrics defined by a symmetric positive-definite matrix depending on the contact point, has been used. In general, it requires solving a Lyapunov matrix equation every time when the gradient of the cost function is needed, which might significantly contribute to the computational cost. To address this issue, we propose a new Riemannian metric for the indefinite Stiefel manifold. Furthermore, we construct the associated geometric structure, including a so-called quasi-geodesic and propose a retraction based on this curve. We then numerically verify the performance of the Riemannian gradient descent method associated with the new geometry and compare it with the previous work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a generalized canonical Riemannian metric on the indefinite Stiefel manifold as a computationally cheaper alternative to the Lyapunov-equation-based family from the authors' prior work. It develops the associated geometric objects, including a quasi-geodesic and a retraction based on this curve, derives the Riemannian gradient under the new metric, and numerically compares the performance of the resulting Riemannian gradient descent algorithm.

Significance. If the new metric defines a valid positive-definite structure and the derived tools are correct, the work could reduce per-iteration cost in Riemannian optimization on this manifold while preserving convergence behavior. The numerical experiments provide preliminary evidence of practical gains. The explicit construction of the quasi-geodesic adds a useful geometric contribution, but the absence of a definiteness verification limits the strength of the central claim.

major comments (1)
  1. [Metric definition section] The section defining the generalized canonical metric does not establish positive definiteness of g_X(ξ,η) for nonzero tangent vectors ξ. The manuscript derives the cheaper Riemannian gradient formula under the assumption that the metric is valid, yet supplies no eigenvalue analysis, lower bound, or explicit check (in contrast to the prior Lyapunov construction that enforced this via a positive-definite matrix solution). This property is load-bearing for the gradient, quasi-geodesic, and retraction.
minor comments (2)
  1. [Abstract and Introduction] The abstract and introduction refer to the previous Lyapunov-based family but could include one sentence recalling its explicit positive-definiteness guarantee for context.
  2. [Metric definition section] Notation for the operator A(X) in the metric definition should be cross-referenced to any earlier equation that defines it.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and for highlighting the need to verify positive definiteness of the proposed metric. This is a substantive point that strengthens the foundation of the work, and we address it directly below.

Point-by-point responses
  1. Referee: [Metric definition section] The section defining the generalized canonical metric does not establish positive definiteness of g_X(ξ,η) for nonzero tangent vectors ξ. The manuscript derives the cheaper Riemannian gradient formula under the assumption that the metric is valid, yet supplies no eigenvalue analysis, lower bound, or explicit check (in contrast to the prior Lyapunov construction that enforced this via a positive-definite matrix solution). This property is load-bearing for the gradient, quasi-geodesic, and retraction.

    Authors: We agree that the original manuscript does not contain an explicit proof or eigenvalue bound establishing positive definiteness of the generalized canonical metric, in contrast to the Lyapunov-based construction in our prior work. This is a valid observation. In the revised version we will add a dedicated paragraph (or short lemma) in the metric definition section that proves g_X(ξ,ξ) ≥ c‖ξ‖_F² for some c>0 and all nonzero tangent vectors ξ. The argument proceeds by writing the metric as a perturbation of the standard Frobenius inner product whose perturbation term is controlled by the signature matrix and the defining relation XᵀJX = I_p; we then obtain a uniform lower bound on the eigenvalues of the associated self-adjoint operator. A brief numerical verification for small (p,n) will also be included for illustration. With this addition the derivations of the Riemannian gradient, quasi-geodesic, and retraction rest on a rigorously verified Riemannian structure.

    revision: yes

Circularity Check

1 step flagged

Minor self-citation to authors' prior Lyapunov metric work; new generalized canonical metric presented as independent proposal.

specific steps
  1. self citation load bearing [Abstract]
    "In the previous work [D.V. Tiep and N.T. Son, A Riemannian gradient descent method for optimization on the indefinite Stiefel manifold, arXiv:2410.22068v2[math.OC]], a tractable metric, which is indeed a family of Riemannian metrics defined by a symmetric positive-definite matrix depending on the contact point, has been used. In general, it requires solving a Lyapunov matrix equation every time when the gradient of the cost function is needed, which might significantly contribute to the computational cost. To address this issue, we propose a new Riemannian metric for the indefinite Stiefel man"

    The citation is to prior work by overlapping authors (Tiep and Son) describing the old metric's cost. However, because the current paper explicitly proposes a distinct new metric to replace it, the self-citation functions only as background motivation and does not make the new metric or its gradient formula equivalent to the prior construction by definition.

full rationale

The paper cites its own earlier arXiv:2410.22068 for the computational drawback of the Lyapunov-based family but introduces the new metric and associated quasi-geodesic/retraction as a fresh construction. No equation reduces the claimed cheaper gradient or validity to a fitted parameter or assumption taken from the cited paper. The derivation chain for the new geometry stands on its own definitions and is not forced by self-citation. This is a normal minor self-reference that does not affect the central independent content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the assumption that the indefinite Stiefel manifold admits a valid Riemannian metric whose gradient formula avoids Lyapunov solves; no free parameters, additional axioms, or invented entities are stated in the abstract.

axioms (1)
  • domain assumption: The generalized canonical metric is a genuine (positive-definite) Riemannian metric on the indefinite Stiefel manifold.
    This is the modeling choice that replaces the earlier Lyapunov-based family, which the abstract describes as defined by a symmetric positive-definite matrix depending on the contact point.

pith-pipeline@v0.9.0 · 5778 in / 1258 out tokens · 41759 ms · 2026-05-18T15:24:32.243069+00:00 · methodology



Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages · 1 internal anchor

  1. Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton, NJ (2008)
  2. Adler, R., Dedieu, J.P., Margulies, J., Martens, M., Shub, M.: Newton’s method on Riemannian manifolds and a geometric model for human spine. IMA J. Numer. Anal., 22, 359–390 (2002)
  3. Bartels, R., Stewart, G.: Solution of the equation AX + XB = C. Comm. ACM, 15(9), 820–826 (1972)
  4. Barzilai, J., Borwein, J.M.: Two-point step size gradient methods. IMA J. Numer. Anal., 8(1), 141–148 (1988)
  5. Bendokat, T., Zimmermann, R.: The real symplectic Stiefel and Grassmann manifolds: metrics, geodesics and applications, https://arxiv.org/abs/2108.12447 (2021)
  6. Boumal, N.: An Introduction to Optimization on Smooth Manifolds. Cambridge University Press (2023)
  7. Boumal, N., Mishra, B., Absil, P.A., Sepulchre, R.: Manopt, a Matlab toolbox for optimization on manifolds. J. Mach. Learn. Res., 15, 1455–1459 (2014)
  8. Deimling, K.: Ordinary Differential Equations in Banach Spaces. Springer Berlin, Heidelberg (1977)
  9. Edelman, A., Arias, T., Smith, S.: The geometry of algorithms with orthogonality constraints. SIAM J. Matrix Anal. Appl., 20(2), 303–353 (1998)
  10. Gabay, D.: Minimizing a differentiable function over a differential manifold. J. Optim. Theory Appl., 37(2), 177–219 (1982)
  11. Gao, B., Son, N.T., Absil, P.A., Stykel, T.: Geometry of the symplectic Stiefel manifold endowed with the Euclidean metric. In: F. Nielsen, F. Barbaresco (eds.) Geometric Science of Information: GSI 2021, Lecture Notes in Computer Science, vol. 12829, 789–796. Springer Nature, Cham, Switzerland (2021)
  12. Gao, B., Son, N.T., Absil, P.A., Stykel, T.: Riemannian optimization on the symplectic Stiefel manifold. SIAM J. Optim., 31(2), 1546–1575 (2021)
  13. Gao, B., Son, N.T., Stykel, T.: Symplectic Stiefel manifold: tractable metrics, second-order geometry and Newton’s methods, https://arxiv.org/abs/2406.14299 (2024)
  14. Güttel, S., Nakatsukasa, Y.: Scaled and squared subdiagonal Padé approximation for the matrix exponential. SIAM J. Matrix Anal. Appl., 37(1), 145–170 (2016)
  15. Hardoon, D.R., Szedmak, S., Shawe-Taylor, J.: Canonical correlation analysis: An overview with application to learning methods. Neural Comput., 16(12), 2639–2664 (2004)
  16. He, X., Niyogi, P.: Locality preserving projections. In: S. Thrun, L. Saul, B. Schölkopf (eds.) Advances in Neural Information Processing Systems, vol. 16. MIT Press (2003)
  17. Horn, R., Johnson, C.: Topics in Matrix Analysis. Cambridge University Press, Cambridge, UK (1991)
  18. Hotelling, H.: Relations between two sets of variates. In: S. Kotz, N.L. Johnson (eds.) Breakthroughs in Statistics, Springer Ser. Statist., 162–190. Springer, New York, NY (1992)
  19. Hu, J., Liu, X., Wen, Z.W., Yuan, Y.X.: A brief introduction to manifold optimization. J. Oper. Res. Soc. China, 18, 199–248 (2020)
  20. Iannazzo, B., Porcelli, M.: The Riemannian Barzilai–Borwein method with nonmonotone line search and the matrix geometric mean computation. IMA J. Numer. Anal., 38, 495–517 (2018)
  21. Jurdjevic, V., Markina, I., Silva Leite, F.: Extremal curves on Stiefel and Grassmann manifolds. J. Geom. Anal., 30, 3948–3978 (2020)
  22. Kovač-Striko, J., Veselić, K.: Trace minimization and definiteness of symmetric pencils. Linear Algebra Appl., 216, 139–158 (1995)
  23. Krakowski, K.A., Machado, L., Silva Leite, F., Batista, J.: A modified Casteljau algorithm to solve interpolation problems on Stiefel manifolds. J. Comput. Appl. Math., 311, 84–99 (2017)
  24. Mishra, B., Sepulchre, R.: Riemannian preconditioning. SIAM J. Optim., 26(1), 635–660 (2016)
  25. Nishimori, Y., Akaho, S.: Learning algorithms utilizing quasi-geodesic flows on the Stiefel manifold. Neurocomputing, 67, 106–135 (2005)
  26. Sameh, A.H., Wisniewski, J.A.: A trace minimization algorithm for the generalized eigenvalue problem. SIAM J. Numer. Anal., 19(6), 1243–1259 (1982)
  27. Sato, H.: Riemannian Optimization and Its Applications. SpringerBriefs Electr. Comput. Eng. Springer (2021)
  28. Sato, H., Aihara, K.: Cholesky QR-based retraction on the generalized Stiefel manifold. Comput. Optim. Appl., 72, 293–308 (2019)
  29. Shustin, B., Avron, H.: Riemannian optimization with a preconditioning scheme on the generalized Stiefel manifold. J. Comput. Appl. Math., 423, 114953 (2023)
  30. Smith, S.T.: Optimization techniques on Riemannian manifolds. Fields Inst. Commun., 3, 113–136 (1994)
  31. Tiep, D.V., Son, N.T.: A Riemannian gradient descent method for optimization on the indefinite Stiefel manifold. https://doi.org/10.48550/arXiv.2410.22068 (2025)
  32. Udriste, C.: Convex Functions and Optimization Methods on Riemannian Manifolds. Mathematics and Its Applications. Springer (1994)
  33. Wang, X., Deng, K., Peng, Z., Yan, C.: New vector transport operators extending a Riemannian CG algorithm to generalized Stiefel manifold with low-rank applications. J. Comput. Appl. Math., 451, 116024 (2024)
  34. Wilcox, R.M.: Exponential operators and parameter differentiation in quantum physics. J. Math. Phys., 8(4), 962–982 (1967)
  35. Yger, F., Berar, M., Gasso, G., Rakotomamonjy, A.: Adaptive canonical correlation analysis based on matrix manifolds. In: J. Langford, J. Pineau (eds.) ICML ’12 Proceedings of the 29th International Conference on International Conference on Machine Learning, 1071–1078. Omnipress, New York, NY, USA (2012)
  36. Zhang, H., Hager, W.W.: A nonmonotone line search technique and its application to unconstrained optimization. SIAM J. Optim., 14(4), 1043–1056 (2004)