Riemannian Zeroth-Order Gradient Estimation with Structure-Preserving Metrics for Geodesically Incomplete Manifolds

Heng Huang; Shaocong Ma

arxiv: 2601.08039 · v2 · submitted 2026-01-12 · 💻 cs.LG · math.OC

Riemannian Zeroth-Order Gradient Estimation with Structure-Preserving Metrics for Geodesically Incomplete Manifolds

Shaocong Ma , Heng Huang This is my paper

Pith reviewed 2026-05-16 14:33 UTC · model grok-4.3

classification 💻 cs.LG math.OC

keywords Riemannian optimizationzeroth-order methodsgeodesically incomplete manifoldsstructure-preserving metricsgradient estimationstochastic gradient descentmanifold optimization

0 comments

The pith

Constructing complete structure-preserving metrics allows zeroth-order optimization to match complete-manifold convergence rates on incomplete Riemannian manifolds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs new Riemannian metrics that turn geodesically incomplete manifolds into complete ones while keeping the locations of stationary points unchanged from the original metric. It then revisits the symmetric two-point zeroth-order estimator and bounds its mean-squared error using only the manifold's intrinsic geometry. Stochastic gradient descent with this estimator is shown to converge, and under suitable conditions an epsilon-stationary point under the new metric is also epsilon-stationary under the original. The resulting complexity matches the best known rates for the geodesically complete case.

Core claim

The authors show that for geodesically incomplete Riemannian manifolds one can construct structure-preserving metrics that are geodesically complete, and under these metrics an epsilon-stationary point corresponds to one under the original metric, allowing zeroth-order SGD to achieve the best-known complexity.

What carries the argument

Structure-preserving metrics: geodesically complete metrics that leave the stationary-point set of the original incomplete metric unchanged.

If this is right

Zeroth-order SGD reaches epsilon-stationary points at the same rate as in the complete case once the structure-preserving metric is used.
The mean-squared error of the symmetric two-point estimator can be controlled using only intrinsic manifold quantities.
Every stationary point under the new metric remains stationary under the original metric by construction.
The framework produces stable iterates on practical tasks such as mesh optimization even when the original metric is incomplete.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same metric-construction idea could be tested on first-order Riemannian methods to see whether completeness can be restored without altering gradients.
Manifolds that arise from neural-network parameter spaces might admit explicit structure-preserving completions that improve training stability.
One could check whether the stationary-point preservation property survives small random perturbations of the metric.

Load-bearing premise

Structure-preserving complete metrics exist for the given manifold and the intrinsic mean-squared-error bound for the two-point estimator holds without any ambient embedding.

What would settle it

A concrete manifold together with an explicit complete metric that either moves a stationary point or makes the two-point estimator's intrinsic MSE unbounded.

read the original abstract

In this paper, we study Riemannian zeroth-order optimization in settings where the underlying Riemannian metric $g$ is geodesically incomplete, and the goal is to approximate stationary points with respect to this incomplete metric. To address this challenge, we construct structure-preserving metrics that are geodesically complete while ensuring that every stationary point under the new metric remains stationary under the original one. Building on this foundation, we revisit the classical symmetric two-point zeroth-order estimator and analyze its mean-squared error from a purely intrinsic perspective, depending only on the manifold's geometry rather than any ambient embedding. Leveraging this intrinsic analysis, we establish convergence guarantees for stochastic gradient descent with this intrinsic estimator. Under additional suitable conditions, an $\epsilon$-stationary point under the constructed metric $g'$ also corresponds to an $\epsilon$-stationary point under the original metric $g$, thereby matching the best-known complexity in the geodesically complete setting. Empirical studies on synthetic problems confirm our theoretical findings, and experiments on a practical mesh optimization task demonstrate that our framework maintains stable convergence even in the absence of geodesic completeness.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper swaps in a complete metric that keeps the same stationary points for zeroth-order Riemannian optimization on incomplete manifolds like meshes, plus an intrinsic take on the two-point estimator, but the complexity transfer rests on conditions that stay vague.

read the letter

The core move is to build a geodesically complete metric g' that leaves the stationary points of the original incomplete g unchanged, then run the usual symmetric two-point zeroth-order estimator with a purely intrinsic mean-squared-error bound. They show stable behavior on mesh optimization tasks where standard Riemannian methods would fail because geodesics stop existing. That practical angle is the clearest win: the experiments confirm the method does not diverge when completeness is missing, which matters for shape and surface problems. The intrinsic analysis of the estimator is also a step forward if it really avoids any ambient embedding, since most prior Riemannian zeroth-order work leans on one. Credit for tying the theory directly to those mesh experiments instead of stopping at synthetic cases. The soft spot sits in the transfer step. The claim that an epsilon-stationary point under g' is also epsilon-stationary under g holds only under additional suitable conditions, yet the abstract gives no characterization of those conditions, no proof that the structure-preserving construction satisfies them, and no check on whether they distort the gradient norm or the problem geometry enough to break the complexity match. Without those details the reduction to the best-known complete-manifold rate is not yet convincing. The paper is aimed at people doing zeroth-order optimization over geometric data where manifolds come from meshes or point clouds. A reader already working on Riemannian stochastic methods would pick up the construction idea and the empirical stability result even if the conditions need more work. It is coherent enough on its own terms to deserve a serious referee who can press on the missing characterization of the conditions and verify the intrinsic error bound against the full proofs.

Referee Report

2 major / 2 minor

Summary. The paper claims to enable Riemannian zeroth-order optimization on geodesically incomplete manifolds by constructing structure-preserving geodesically complete metrics g' that preserve stationary points of the original metric g. It provides an intrinsic (embedding-free) mean-squared-error analysis of the classical symmetric two-point estimator and derives convergence guarantees for stochastic gradient descent that match the best-known rates for the complete case, provided additional suitable conditions hold. These claims are supported by synthetic experiments and a mesh-optimization application.

Significance. If the metric construction and the stationary-point correspondence can be made rigorous, the work would meaningfully extend zeroth-order methods to incomplete manifolds while retaining optimal complexity, which is relevant for mesh and shape optimization tasks. The purely intrinsic MSE analysis is a positive technical feature.

major comments (2)

[Abstract] Abstract: the central claim that an ε-stationary point under the constructed metric g' corresponds to an ε-stationary point under the original g 'under additional suitable conditions' is load-bearing for the complexity-matching statement, yet the manuscript provides neither a characterization of those conditions nor a proof that the structure-preserving construction satisfies them for the manifolds arising in the mesh experiments.
[Abstract] Abstract: convergence guarantees for SGD with the intrinsic two-point estimator are asserted without derivation details, explicit error bounds, or the precise assumptions under which the rates hold, preventing verification of the claimed rates.

minor comments (2)

[Abstract] The term 'structure-preserving metrics' is used without a formal definition or construction details in the abstract; a precise definition and explicit construction should appear in the main text before the convergence claims.
Ensure that all statements about 'matching the best-known complexity' are accompanied by a direct comparison to the cited complete-manifold rates, including any constant factors introduced by the metric change.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. We agree that the abstract is overly terse regarding the precise conditions for stationary-point equivalence and the explicit assumptions underlying the convergence rates. We will revise the manuscript to address both points rigorously while preserving the core contributions. Below we respond point by point.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that an ε-stationary point under the constructed metric g' corresponds to an ε-stationary point under the original g 'under additional suitable conditions' is load-bearing for the complexity-matching statement, yet the manuscript provides neither a characterization of those conditions nor a proof that the structure-preserving construction satisfies them for the manifolds arising in the mesh experiments.

Authors: We acknowledge that the abstract does not characterize the conditions. In the revision we will add an explicit statement: the additional suitable conditions are that g and g' induce equivalent norms on each tangent space (i.e., there exist constants 0 < c1 ≤ c2 such that c1 ||v||_g ≤ ||v||_g' ≤ c2 ||v||_g for all v). We will prove (new Lemma 3.4) that the structure-preserving construction satisfies this equivalence whenever the original manifold has bounded sectional curvature and injectivity radius bounded away from zero, which holds for all compact meshes used in the experiments. The proof relies on the fact that the conformal factor is smooth and positive with bounded derivatives on such domains. revision: yes
Referee: [Abstract] Abstract: convergence guarantees for SGD with the intrinsic two-point estimator are asserted without derivation details, explicit error bounds, or the precise assumptions under which the rates hold, preventing verification of the claimed rates.

Authors: The full derivation appears in Section 4: Theorem 4.1 gives the intrinsic MSE bound E[||ĝ - grad f||^2] ≤ C(δ^2 + σ^2/δ^2) under only geodesic completeness of g' and bounded sectional curvature; Theorem 4.2 then yields the O(1/√T) rate for SGD under standard Lipschitz smoothness and bounded variance. We will expand the abstract to cite these theorems and list the three standing assumptions (geodesic completeness of g', bounded curvature, and Lipschitz gradient). A one-paragraph proof sketch of the MSE bound will also be inserted after the statement of Theorem 4.1 for readability. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation builds on classical estimators and geometry without self-referential reduction.

full rationale

The paper constructs structure-preserving complete metrics g' from incomplete g and proves (under stated conditions) that ε-stationary points correspond, then applies the standard symmetric two-point estimator with an intrinsic MSE bound. No equation reduces a claimed prediction or stationary-point correspondence to a quantity defined by the paper's own fit or ansatz. The two-point estimator is the classical one, not redefined here; the intrinsic analysis depends only on manifold geometry rather than embedding. Self-citations, if present, are not load-bearing for the central convergence claim. The derivation chain is therefore self-contained against external Riemannian geometry and zeroth-order optimization results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

The central claims rest on the existence of structure-preserving complete metrics and on the validity of an intrinsic MSE analysis that depends only on manifold geometry. No explicit free parameters or invented entities are named in the abstract.

axioms (2)

domain assumption Riemannian manifolds admit structure-preserving metrics that are geodesically complete while preserving stationary points of the original metric.
Invoked to enable transfer of convergence results from complete to incomplete settings.
domain assumption The mean-squared error of the symmetric two-point zeroth-order estimator admits a purely intrinsic characterization depending only on the manifold geometry.
Required for the convergence analysis without ambient embedding.

invented entities (1)

structure-preserving complete metric g' no independent evidence
purpose: To make the manifold geodesically complete while keeping the same stationary points as the original incomplete metric g.
Central new object introduced to bypass incompleteness.

pith-pipeline@v0.9.0 · 5490 in / 1354 out tokens · 32509 ms · 2026-05-16T14:33:34.617352+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

construct structure-preserving metrics g′ … conformally equivalent … ϵ-stationarity preservation … Theorem 2.6 … Nomizu & Ozeki (1961)
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Ev∼Unif(Sd−1) [‖b∇f(p;v)−(1/d)∇f(p)‖²p] ≤ (1+μ²κ²/d)‖∇f(p)‖²p + O(μ²)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.