arxiv: 2410.22068 · v3 · pith:HKF5WRBMnew · submitted 2024-10-29 · 🧮 math.OC

A Riemannian gradient descent method for optimization on the indefinite Stiefel manifold

Pith reviewed 2026-05-23 18:42 UTC · model grok-4.3

classification 🧮 math.OC
keywords indefinite Stiefel manifold · Riemannian optimization · gradient descent · Cayley retraction · quadratic matrix constraint · eigenvalue problems · Procrustes problem

The pith

A Riemannian gradient descent method on the indefinite Stiefel manifold X^T A X = J converges globally to critical points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper equips the set of matrices satisfying the quadratic constraint X^T A X = J with a Riemannian metric and uses it to define a gradient descent algorithm. The method relies on a retraction constructed via the Cayley transform and comes with a proof of global convergence for smooth objective functions. This construction recovers the known orthogonal and generalized Stiefel cases while opening the same geometric treatment to other quadratic constraints that previously lacked it. Readers would care because the framework directly yields algorithms for trace-minimization eigenvalue problems and matrix least-squares tasks that arise in applications.

Core claim

The authors show that the indefinite Stiefel manifold defined by X^T A X = J admits a Riemannian metric, that the associated tangent-space projections and gradients can be written explicitly, that the Cayley transform supplies a retraction, and that the resulting Riemannian gradient descent iteration is globally convergent; the same objects specialize to the classical orthogonal and generalized Stiefel manifolds and furnish solvers for several previously untreated constrained matrix problems.

What carries the argument

The indefinite Stiefel manifold equipped with its induced Riemannian metric and the Cayley-transform retraction that maps tangent vectors to feasible points while preserving the quadratic constraint.
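
A minimal numerical sketch of a Cayley-type retraction that preserves X^T A X = J may help make this concrete. It is not taken from the paper: the formula for the skew-symmetric matrix S, the helper names make_problem, to_tangent and cayley_retract, and the test matrices are all assumptions chosen for illustration. The idea is that for skew-symmetric S the matrix W = S A satisfies W^T A + A W = 0, so its Cayley transform leaves the form defined by A unchanged, and S is picked so that W X equals the tangent vector ξ.

```python
import numpy as np

def make_problem(n=6, k=3):
    """Illustrative setup (hypothetical): symmetric indefinite A and a
    feasible X with X^T A X = J. Defaults assume k - 1 <= n // 2 - 1 so the
    selected eigenvalues have the intended signs."""
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    spectrum = np.concatenate([np.linspace(1.0, 2.0, n // 2),
                               -np.linspace(1.0, 2.0, n - n // 2)])
    A = Q @ np.diag(spectrum) @ Q.T
    A = 0.5 * (A + A.T)                      # enforce exact symmetry
    lam, U = np.linalg.eigh(A)               # eigenvalues in ascending order
    idx = [0] + list(range(n - k + 1, n))    # one negative, k-1 positive
    X = U[:, idx] / np.sqrt(np.abs(lam[idx]))
    J = np.diag(np.sign(lam[idx]))           # diagonal of +-1, so J^2 = I
    return A, X, J

def to_tangent(X, A, J, Z):
    """Map Z to a tangent vector xi, i.e. X^T A xi + xi^T A X = 0.
    This is one convenient projection, not necessarily the paper's."""
    B = X.T @ A @ Z
    return Z - X @ J @ (B + B.T) / 2

def cayley_retract(X, A, J, xi, t=1.0):
    """Cayley-type retraction sketch; xi must be tangent at X.
    Assumes I - (t/2) W is invertible, which holds for small t * xi."""
    Omega = X.T @ A @ xi                     # skew-symmetric for tangent xi
    S = xi @ J @ X.T - X @ J @ xi.T - X @ J @ Omega @ J @ X.T
    W = S @ A                                # satisfies W X = xi
    I = np.eye(X.shape[0])
    return np.linalg.solve(I - 0.5 * t * W, (I + 0.5 * t * W) @ X)

A, X, J = make_problem()
Z = np.random.default_rng(1).standard_normal(X.shape)
xi = to_tangent(X, A, J, Z)
xi *= 0.1 / np.linalg.norm(xi)               # keep the step small
Y = cayley_retract(X, A, J, xi)
print("constraint residual:", np.linalg.norm(Y.T @ A @ Y - J))
```

With A = I and J = I the matrix S above reduces to ξ X^T - X ξ^T - X (X^T ξ) X^T, which is the familiar Cayley construction on the orthogonal Stiefel manifold.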

If this is right

  • The same algorithm solves the linear-response eigenvalue problem by trace minimization.
  • It produces solutions to the matrix least-squares formulation of the generalized Procrustes problem under the given quadratic constraint.
  • Global convergence holds for any smooth objective, not merely the trace objectives treated in the examples.
  • When A is the identity and J = I the method reduces to standard Riemannian gradient descent on the orthogonal Stiefel manifold.
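
The trace-minimization use case in the list above can be illustrated with a small feasible descent loop. This is a sketch under stated assumptions, not the paper's algorithm: it uses the Euclidean gradient with a skew-symmetric generator S = G X^T A - A X G^T instead of the Riemannian gradient for the paper's metric, a plain Armijo backtracking rule, and randomly generated test matrices. The names cayley_curve and trace_descent are hypothetical. Each step moves along a Cayley curve, so the constraint X^T A X = J is kept up to rounding error.

```python
import numpy as np

def cayley_curve(X, A, S, t):
    """Y(t) = (I + t/2 S A)^{-1} (I - t/2 S A) X. For skew-symmetric S this
    keeps Y^T A Y = X^T A X whenever the linear system is solvable."""
    W = S @ A
    I = np.eye(A.shape[0])
    return np.linalg.solve(I + 0.5 * t * W, (I - 0.5 * t * W) @ X)

def trace_descent(X, A, M, iters=200, t0=1.0, c=1e-4, tol=1e-12):
    """Minimize f(X) = trace(X^T M X) while staying on X^T A X = J."""
    f = lambda Y: float(np.trace(Y.T @ M @ Y))
    history = [f(X)]
    for _ in range(iters):
        G = 2.0 * M @ X                      # Euclidean gradient of f
        AXG = A @ X @ G.T
        S = AXG.T - AXG                      # G X^T A - A X G^T, skew-symmetric
        slope = 0.5 * np.sum(S * S)          # equals -d/dt f(Y(t)) at t = 0
        if slope < tol:
            break
        t = t0
        for _trial in range(60):             # Armijo backtracking
            Y = cayley_curve(X, A, S, t)
            if f(Y) <= history[-1] - c * t * slope:
                break
            t *= 0.5
        else:
            break                            # no acceptable step found
        X = Y
        history.append(f(X))
    return X, history

# Hypothetical test data: indefinite A, feasible start X0, SPD matrix M.
rng = np.random.default_rng(0)
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag([1.0, 1.5, 2.0, -1.0, -1.5, -2.0]) @ Q.T
A = 0.5 * (A + A.T)
lam, U = np.linalg.eigh(A)
idx = [0, 4, 5]                              # one negative, two positive
X0 = U[:, idx] / np.sqrt(np.abs(lam[idx]))
J = np.diag(np.sign(lam[idx]))
B = rng.standard_normal((n, n))
M = B @ B.T + np.eye(n)

X_opt, hist = trace_descent(X0, A, M)
print("objective: start", hist[0], "end", hist[-1])
print("constraint residual:", np.linalg.norm(X_opt.T @ A @ X_opt - J))
```

The choice of S makes the initial slope along the curve equal to minus half the squared Frobenius norm of S, so the curve is a descent direction unless S vanishes; the quantity `slope` therefore doubles as a stationarity measure in this sketch.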

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The retraction could be swapped for other first-order retractions to trade geometric fidelity for computational speed in large-scale instances.
  • The same manifold geometry might support second-order methods such as Riemannian trust-region algorithms without altering the constraint handling.
  • Because the metric is induced by a fixed matrix A, preconditioning strategies that adapt to the spectrum of A could improve practical iteration counts.

Load-bearing premise

The level set X^T A X = J is a smooth manifold that carries a Riemannian metric for which the Cayley retraction is well-defined.

What would settle it

A numerical run of the algorithm on any instance where the iterates fail to approach a point with vanishing Riemannian gradient would contradict the global-convergence theorem.

Figures

Figures reproduced from arXiv: 2410.22068 by Dinh Van Tiep, Nguyen Thanh Son.

Figure 6 [image: figures/full_fig_p019_6.png]

Original abstract

We consider the optimization problem with a generally quadratic matrix constraint of the form $X^TAX = J$, where $A$ is a given nonsingular, symmetric $n\times n$ matrix and $J$ is a given $k\times k$ symmetric matrix, with $k\leq n$, satisfying $J^2 = I_k$. Since the feasible set constitutes a differentiable manifold, called the indefinite Stiefel manifold, we approach this problem within the framework of Riemannian optimization. Namely, we first equip the manifold with a Riemannian metric and construct the associated geometric structure, then propose a retraction based on the Cayley transform, and finally suggest a Riemannian gradient descent method using the attained materials, whose global convergence is guaranteed. Our results not only cover the known cases, the orthogonal and generalized Stiefel manifolds, but also provide a Riemannian optimization solution for other constrained problems which has not been investigated. As applications, we consider, via trace minimization, several eigenvalue problems of symmetric positive definite matrix pencils, including the linear response eigenvalue problem, and a matrix least square problem, a general framework for the Procrustes problem and constrained matrix equations. The presented numerical results justify the theoretical findings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper considers optimization subject to the quadratic constraint X^T A X = J (A symmetric nonsingular, J symmetric with J^2 = I_k) and treats the feasible set as the indefinite Stiefel manifold. It equips this manifold with a Riemannian metric, constructs associated geometric objects, proposes a Cayley-transform retraction, and develops a Riemannian gradient descent algorithm whose global convergence is asserted. The framework is claimed to recover the orthogonal and generalized Stiefel cases and is applied to trace-minimization eigenvalue problems (including linear-response eigenproblems) and matrix least-squares/Procrustes problems, with numerical illustrations.

Significance. If the retraction lands exactly on the manifold and the convergence argument holds, the work supplies the first systematic Riemannian treatment of a broader family of quadratic constraints that includes previously studied Stiefel manifolds as special cases. The explicit global-convergence guarantee and the concrete applications to symmetric-definite pencils constitute the main potential contributions.

major comments (2)
  1. [Retraction construction] Retraction section (exact location not numbered in the provided abstract but central to §3–4): the algebraic verification that the proposed Cayley retraction R_X(ξ) satisfies (R_X(ξ))^T A R_X(ξ) = J exactly, rather than only to first order, is not shown for general indefinite A. The cancellation that holds when A = I does not automatically extend when A possesses negative eigenvalues; this identity is load-bearing for every subsequent Riemannian step and for the global-convergence claim.
  2. [Convergence analysis] Convergence theorem (likely §4): the global-convergence proof assumes that every iterate remains on the manifold and that the retraction is well-defined and smooth in a neighborhood of the tangent space. If the retraction identity fails for indefinite A, both the manifold membership and the descent property used in the proof are compromised.
minor comments (2)
  1. [Geometric structure] Notation for the Riemannian metric and the projection onto the tangent space should be introduced with explicit formulas (e.g., the inner product <·,·>_X) before the retraction is defined.
  2. [Special cases] The statement that the results “cover the known cases” would be strengthened by a short corollary explicitly recovering the standard Stiefel retraction when A = I.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The two major comments both concern the exact manifold membership of the proposed Cayley retraction for general indefinite A. We address them point by point below and will strengthen the manuscript accordingly.

Point-by-point responses
  1. Referee: [Retraction construction] Retraction section (exact location not numbered in the provided abstract but central to §3–4): the algebraic verification that the proposed Cayley retraction R_X(ξ) satisfies (R_X(ξ))^T A R_X(ξ) = J exactly, rather than only to first order, is not shown for general indefinite A. The cancellation that holds when A = I does not automatically extend when A possesses negative eigenvalues; this identity is load-bearing for every subsequent Riemannian step and for the global-convergence claim.

    Authors: We agree that an explicit algebraic verification for arbitrary symmetric nonsingular A (including those with negative eigenvalues) was not written out in sufficient detail. In the revised manuscript we will insert a direct computation showing that the Cayley retraction, defined via the appropriate skew-symmetric matrix with respect to the A-inner product, satisfies the quadratic constraint exactly. The argument proceeds by verifying that the retraction formula preserves the defining relation X^T A X = J through cancellation that relies only on the skew-symmetry with respect to A and the property J^2 = I_k; the sign pattern of A does not enter the cancellation. revision: yes

  2. Referee: [Convergence analysis] Convergence theorem (likely §4): the global-convergence proof assumes that every iterate remains on the manifold and that the retraction is well-defined and smooth in a neighborhood of the tangent space. If the retraction identity fails for indefinite A, both the manifold membership and the descent property used in the proof are compromised.

    Authors: Once the exact retraction identity is established (as outlined above), every iterate lies on the manifold by construction, so the descent property and the global-convergence argument remain valid. We will add a short remark immediately after the retraction theorem that explicitly invokes the newly supplied identity to justify manifold membership at every step, thereby closing the logical gap noted by the referee. revision: yes
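
The cancellation at issue in the first response can be written out in a generic form. The derivation below is a sketch under assumptions that are not taken from the paper: W is any n×n matrix with W^T A + A W = 0 (for instance W = S A with S skew-symmetric), I - W/2 is invertible, and Q denotes the Cayley transform of W. The paper's own retraction formula may differ in detail.

```latex
% Assumptions (not from the paper): A = A^T, \; W^T A + A W = 0,
% and I - \tfrac12 W invertible.
\[
  Q = \bigl(I - \tfrac12 W\bigr)^{-1}\bigl(I + \tfrac12 W\bigr),
  \qquad
  \bigl(I \pm \tfrac12 W\bigr)^{T} A = A\,\bigl(I \mp \tfrac12 W\bigr).
\]
% The second identity gives (I - W/2)^{-T} A = A (I + W/2)^{-1}, hence
\begin{align*}
  Q^{T} A\, Q
  &= \bigl(I + \tfrac12 W\bigr)^{T}\bigl(I - \tfrac12 W\bigr)^{-T} A\,
     \bigl(I - \tfrac12 W\bigr)^{-1}\bigl(I + \tfrac12 W\bigr) \\
  &= \bigl(I + \tfrac12 W\bigr)^{T} A\,
     \bigl(I + \tfrac12 W\bigr)^{-1}\bigl(I - \tfrac12 W\bigr)^{-1}
     \bigl(I + \tfrac12 W\bigr) \\
  &= A\,\bigl(I - \tfrac12 W\bigr)\bigl(I + \tfrac12 W\bigr)^{-1}
     \bigl(I - \tfrac12 W\bigr)^{-1}\bigl(I + \tfrac12 W\bigr)
   = A,
\end{align*}
% since all four factors are rational functions of W and commute. Therefore
\[
  (Q X)^{T} A\,(Q X) = X^{T} A X = J .
\]
```

Only the symmetry of A and the relation W^T A + A W = 0 are used in this generic step, so the signs of the eigenvalues of A play no role. What this sketch does not cover is the invertibility of I - W/2, which is automatic when W is skew-symmetric in the Euclidean sense but has to be assumed or ensured by the step size when A is indefinite.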

Circularity Check

0 steps flagged

No significant circularity; derivation chain is self-contained

Full rationale

The paper defines the indefinite Stiefel manifold from the quadratic constraint X^T A X = J with A nonsingular symmetric, then equips it with a Riemannian metric, derives the associated geometric objects, proposes a Cayley retraction, and analyzes RGD convergence, all presented as direct constructions. No quoted step reduces a claimed prediction or uniqueness result to a fitted parameter or self-citation by definition. The coverage of orthogonal/generalized Stiefel cases is an extension rather than a load-bearing premise that collapses the new result. The central claims remain independent of the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities; the central claim rests on the premise, not elaborated there, that a suitable Riemannian metric and retraction exist on the constraint set.

pith-pipeline@v0.9.0 · 5739 in / 1135 out tokens · 24126 ms · 2026-05-23T18:42:39.149989+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Tag meanings:

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A generalized canonical metric for optimization on the indefinite Stiefel manifold

    math.OC · 2025-09 · unverdicted · novelty 5.0

    A generalized canonical Riemannian metric is defined on the indefinite Stiefel manifold, with an associated quasi-geodesic and retraction, for use in Riemannian gradient descent.