pith. sign in

arxiv: 2605.22736 · v1 · pith:K52NNVZNnew · submitted 2026-05-21 · 🧮 math.OC · cs.LG· cs.NA· math.DG· math.NA

Optimization over the intersection of manifolds

Pith reviewed 2026-05-22 04:00 UTC · model grok-4.3

classification 🧮 math.OC cs.LGcs.NAmath.DGmath.NA
keywords manifold optimizationintersection of manifoldsintrinsic transversalityclean intersectionretractionconvergence ratefirst-order stationaritysparse optimization
0
0 comments X

The pith

The paper proves that clean intersection and intrinsic transversality are equivalent for two manifolds, enabling a tractable tangent space projection and a convergent geometric optimization method.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Optimization over the intersection of two manifolds is difficult because the geometry of the feasible set is coupled. The authors show that two standard regularity conditions on the pair of manifolds are in fact identical. This equivalence makes the orthogonal projection onto the tangent space of the intersection computable in a direct way. They then design an algorithm that keeps every iterate on one manifold, using one direction to approach the second manifold and a second orthogonal direction to reduce the objective function. Under the regularity condition the method supplies explicit rates for both feasibility and optimality measures while guaranteeing that limit points are first-order stationary.

Core claim

We prove that the regularities clean intersection and intrinsic transversality are equivalent, which yields a tractable projection onto the tangent space of the intersection. Therefore we propose a geometric method that employs a retraction on only one manifold and updates the iterate along two orthogonal directions. The iterates stay on one manifold, and the two directions are responsible for asymptotically approaching the other manifold and decreasing the objective function, respectively. Under intrinsic transversality we derive the convergence rate for both the feasibility and optimality measures, and show that every accumulation point is first-order stationary.

What carries the argument

The equivalence of clean intersection and intrinsic transversality, which supplies a tractable projection onto the tangent space of the intersection and supports the two orthogonal update directions.

If this is right

  • The projection onto the tangent space of the intersection becomes explicitly computable.
  • The algorithm produces iterates whose distance to the second manifold and whose objective decrease both admit explicit convergence rates.
  • Every limit point of the generated sequence satisfies the first-order stationarity condition on the intersection.
  • The same geometric splitting works on standard sparse and low-rank problems that can be cast as manifold intersection tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same equivalence may let analysts replace ad-hoc penalty or projection steps with the two-direction update in other manifold-constrained problems.
  • The orthogonal-direction splitting suggests a natural way to incorporate second-order information once a retraction and vector transport are chosen.
  • Success on hyperbolic embedding approximation indicates the method can be tested directly on graph-layout tasks without additional manifold-specific code.

Load-bearing premise

The manifolds satisfy the intrinsic transversality condition, equivalently that their intersection is clean.

What would settle it

A concrete pair of manifolds whose intersection is clean yet the tangent-space projection onto the intersection fails to be well-defined, or a run of the algorithm in which the feasibility measure does not converge at the stated rate.

read the original abstract

Optimization over the intersection of two manifolds arises in a broad range of applications, but is hindered by the coupled geometry of the feasible region. In this paper, we prove that the regularities -- clean intersection and intrinsic transversality -- are equivalent, which yields a tractable projection onto the tangent space of the intersection. Therefore, we propose a geometric method that employs a retraction on only one manifold and updates the iterate along two orthogonal directions. Specifically, the iterates stay on one manifold, and the two directions are responsible for asymptotically approaching the other manifold and decreasing the objective function, respectively. Under intrinsic transversality, we derive the convergence rate for both the feasibility and optimality measures, and show that every accumulation point is first-order stationary. Numerical experiments on problems stemming from sparse and low-rank optimization, including fitting spherical data, approximating hyperbolic embeddings on real data, and computing compressed modes, demonstrate the effectiveness of the proposed method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proves that clean intersection and intrinsic transversality are equivalent for two manifolds, which allows a tractable projection onto the tangent space of the intersection. Based on this, it develops a geometric optimization algorithm that uses a retraction on only one manifold and performs updates in two orthogonal directions: one for feasibility with respect to the second manifold and one for objective decrease. Convergence rates for feasibility and optimality measures are derived under the intrinsic transversality condition, every accumulation point is shown to be first-order stationary, and numerical experiments on sparse/low-rank optimization problems, spherical data fitting, hyperbolic embeddings, and compressed modes are presented to demonstrate effectiveness.

Significance. If the equivalence and convergence results hold, the work advances manifold-constrained optimization by reducing the need for retractions on both manifolds and providing explicit rates under a verifiable regularity condition. The equivalence theorem and the orthogonal-direction splitting are notable strengths, as is the application to concrete problems in sparse and low-rank recovery. The numerical validation on real data adds practical value.

major comments (2)
  1. [§3] §3 (Algorithm 1 and surrounding text): The claimed tractability of the projection onto T(M1 ∩ M2) is central to the orthogonal splitting of the update directions, yet the manuscript supplies only an existence argument from the equivalence rather than an explicit, closed-form or efficiently computable expression (e.g., via normal-space decomposition). Without this, the per-iteration cost and the orthogonality guarantee cannot be verified from the given description.
  2. [Theorem 4.3] Theorem 4.3 (accumulation-point stationarity): The proof invokes intrinsic transversality to ensure the limit point satisfies the first-order condition on the intersection, but the argument does not address the case where the transversality constant degenerates along the sequence; a quantitative bound on the constant would be needed to make the stationarity claim robust.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'clean intersection' appears without a one-sentence reminder of its definition or a pointer to the section where it is introduced.
  2. [Numerical experiments] Numerical section: Tables reporting iteration counts and final objective values would benefit from reporting the number of random initializations and standard deviations to allow assessment of variability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and constructive comments. We address each major comment below and outline the revisions we will make to strengthen the manuscript.

read point-by-point responses
  1. Referee: [§3] §3 (Algorithm 1 and surrounding text): The claimed tractability of the projection onto T(M1 ∩ M2) is central to the orthogonal splitting of the update directions, yet the manuscript supplies only an existence argument from the equivalence rather than an explicit, closed-form or efficiently computable expression (e.g., via normal-space decomposition). Without this, the per-iteration cost and the orthogonality guarantee cannot be verified from the given description.

    Authors: We agree that greater explicitness would improve verifiability. The equivalence of clean intersection and intrinsic transversality established in Theorem 3.1 directly yields an explicit formula: the tangent space T(M1 ∩ M2) is the orthogonal complement of N_{M1} + N_{M2}. Consequently, the orthogonal projection onto this tangent space is obtained by solving a small linear least-squares problem whose matrix is assembled from orthonormal bases of the two normal spaces (whose dimensions equal the respective codimensions). When these codimensions are modest—as is the case in all numerical examples in the paper—the solve is inexpensive and can be performed with standard dense linear-algebra routines. We will add a dedicated paragraph and a short algorithmic box in §3 that spells out this construction, states its O(d^3) cost for normal-space dimension d, and verifies that the two update directions remain orthogonal by construction. revision: yes

  2. Referee: [Theorem 4.3] Theorem 4.3 (accumulation-point stationarity): The proof invokes intrinsic transversality to ensure the limit point satisfies the first-order condition on the intersection, but the argument does not address the case where the transversality constant degenerates along the sequence; a quantitative bound on the constant would be needed to make the stationarity claim robust.

    Authors: We thank the referee for highlighting this subtlety. Intrinsic transversality is a pointwise condition: at any point x in M1 ∩ M2 there exists δ(x) > 0 such that the angle between the normal spaces satisfies the stated lower bound. In the proof we pass to the limit and invoke the condition at the accumulation point x*, where δ(x*) > 0 by hypothesis. Nevertheless, to preclude any possibility that δ(x_k) → 0 along the sequence, we will strengthen the statement of Theorem 4.3 by adding the standing assumption that the transversality constant is uniformly bounded away from zero in a neighborhood of x*. This is a standard, mild regularity hypothesis in the manifold-optimization literature and will be accompanied by a brief remark explaining why it is satisfied in the concrete applications considered. The proof will be updated to cite the uniform bound explicitly when controlling the limiting first-order condition. revision: yes

Circularity Check

0 steps flagged

No circularity: equivalence proof and algorithm derivation are self-contained

full rationale

The paper establishes the equivalence of clean intersection and intrinsic transversality via a mathematical proof, which directly supports the existence of a tractable tangent-space projection for the intersection. This equivalence is then used to construct the update directions and derive convergence rates under the stated assumption. No quoted step reduces a claimed prediction or result to a fitted parameter, self-definition, or unverified self-citation chain; the derivation relies on standard manifold geometry and the internal proof rather than renaming known results or smuggling ansatzes. The central claims remain independent of the algorithm's own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper relies on standard assumptions from Riemannian geometry and manifold optimization without introducing new free parameters or invented entities.

axioms (1)
  • domain assumption The two manifolds are smooth Riemannian manifolds whose intersection satisfies the regularity conditions of clean intersection or intrinsic transversality.
    This is the key regularity invoked to define the tangent space projection and derive convergence.

pith-pipeline@v0.9.0 · 5690 in / 1275 out tokens · 59157 ms · 2026-05-22T04:00:30.033375+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages · 1 internal anchor

  1. [1]

    Ablin and G

    [1]P. Ablin and G. Peyré,Fast and accurate optimization on the orthogonal manifold without retraction, in International Conference on Artificial Intelligence and Statistics, PMLR, 2022, pp. 5636–5657. [2]P. Ablin, S. V ary, B. Gao, and P.-A. Absil,Infeasible deterministic, stochastic, and variance-reduction algorithms for optimization under orthogonality ...

  2. [2]

    Andersson and M

    [4]F. Andersson and M. Carlsson,Alternating projections on nontangential manifolds, Con- structive Approximation, 38 (2013), pp. 489–525. [5]R. Andreani, E. G. Birgin, J. M. Martínez, and M. L. Schuverdt,On augmented Lagrangian methods with general lower-level constraints, SIAM Journal on Optimization, 18 (2008), pp. 1286–1309. [6]R. Andreani, K. R. Couto...

  3. [3]

    Boumal, P.-A

    [13]N. Boumal, P.-A. Absil, and C. Cartis,Global rates of convergence for nonconvex opti- mization on manifolds, IMA Journal of Numerical Analysis, 39 (2019), pp. 1–33. [14]N. Boumal, B. Mishra, P.-A. Absil, and R. Sepulchre,Manopt, a Matlab toolbox for op- timization on manifolds, The Journal of Machine Learning Research, 15 (2014), pp. 1455–

  4. [4]

    Retractions by Alternating Projections

    [15]S. Budzinskiy,Quasioptimal alternating projections and their use in low-rank approximation of matrices and tensors, Numerische Mathematik, 157 (2025), pp. 1491–1535. [16]T. P. Cason, P.-A. Absil, and P. V an Dooren,Iterative methods for low rank approx- imation of graph similarity matrices, Linear Algebra and its Applications, 438 (2013), pp. 1863–188...

  5. [5]

    [19]S. Chen, S. Ma, A. Man-Cho So, and T. Zhang,Proximal gradient method for nonsmooth optimization over the Stiefel manifold, SIAM Journal on Optimization, 30 (2020), pp. 210–

  6. [6]

    [20]M. Chu, N. Del Buono, L. Lopez, and T. Politi,On the low-rank approximation of data on the unit sphere, SIAM Journal on Matrix Analysis and Applications, 27 (2005), pp. 46–60. [21]D. Drusvyatskiy, A. D. Ioffe, and A. S. Lewis,Transversality and alternating projections for nonconvex sets, Foundations of Computational Mathematics, 15 (2015), pp. 1637–16...

  7. [7]

    [28]A. D. Ioffe,Transversality in variational analysis, Journal of Optimization Theory and Ap- plications, 174 (2017), pp. 343–366. [29]P. Ja w anpuria, M. Meghw anshi, and B. Mishra,Low-rank approximations of hyperbolic embeddings, in 2019 IEEE 58th conference on decision and control, IEEE, 2019, pp. 7159–

  8. [8]

    [30]X. Jia, C. Kanzow, P. Mehlitz, and G. W achsmuth,An augmented Lagrangian method for optimization problems with structured geometric constraints, Mathematical Program- ming, 199 (2023), pp. 1365–1415. [31]Z. Lai and A. Yoshise,Riemannian interior point methods for constrained optimization on manifolds, Journal of Optimization Theory and Applications, 2...

  9. [9]

    Levin, J

    [33]E. Levin, J. Kileel, and N. Boumal,The effect of smooth parametrizations on nonconvex optimization landscapes, Mathematical Programming, 209 (2025), pp. 63–111. [34]A. S. Lewis, D. R. Luke, and J. Malick,Local linear convergence for alternating and averaged nonconvex projections, Foundations of Computational Mathematics, 9 (2009), pp. 485–513. [35]A. ...

  10. [10]

    [37]X. Li, N. Xiu, and S. Zhou,Matrix optimization over low-rank spectral sets: stationary points and local and global minimizers, Journal of Optimization Theory and Applications, 184 (2020), pp. 895–930. [38]C. Liu and N. Boumal,Simple algorithms for optimization on Riemannian manifolds with constraints, Applied Mathematics & Optimization, 82 (2020), pp....

  11. [11]

    [40]G. A. Miller,WordNet: a lexical database for English, Communications of the ACM, 38 (1995), pp. 39–41. [41]V. Neumann,Functional operators, The Geometry of Orthogonal Spaces, (1950). [42]M. Nickel and D. Kiela,Poincaré embeddings for learning hierarchical representations, Advances in neural information processing systems, 30 (2017). [43]J. Nocedal and...

  12. [12]

    Noll and A

    [45]D. Noll and A. Rondepierre,On local convergence of the method of alternating projections, Foundations of Computational Mathematics, 16 (2016), pp. 425–455. [46]M. Obara, T. Okuno, and A. Takeda,Sequential quadratic optimization for nonlinear op- timization problems on Riemannian manifolds, SIAM Journal on Optimization, 32 (2022), pp. 822–853. [47]G. O...

  13. [13]

    Ozolin , š, R

    [48]V. Ozolin , š, R. Lai, R. Caflisch, and S. Osher,Compressed modes for variational prob- lems in mathematics and physics, Proceedings of the National Academy of Sciences, 110 (2013), pp. 18368–18373. [49]R. Peng, C. Zhu, B. Gao, X. W ang, and Y.-x. Yuan,Normalized tensor train decompo- sition, arXiv preprint arXiv:2511.04369, (2025). [50]R. T. Rockafel...

  14. [14]

    26Y. YANG, B. GAO, AND Y.-X. YUAN [52]J. B. Rosen,The gradient projection method for nonlinear programming. Part I. Linear con- straints, Journal of the Society for Industrial and Applied Mathematics, 8 (1960), pp. 181–

  15. [15]

    [53]J. B. Rosen,The gradient projection method for nonlinear programming. Part II. Nonlin- ear constraints, Journal of the Society for Industrial and Applied Mathematics, 9 (1961), pp. 514–532. [54]S. Schechtman, D. Tiapkin, M. Muehlebach, and E. Moulines,Orthogonal directions constrained gradient method: from non-linear equality constraints to Stiefel ma...

  16. [16]

    V andereycken,Low-rank matrix completion by Riemannian optimization, SIAM Journal on Optimization, 23 (2013), pp

    [58]B. V andereycken,Low-rank matrix completion by Riemannian optimization, SIAM Journal on Optimization, 23 (2013), pp. 1214–1236. [59]S. V ary, P. Ablin, B. Gao, and P.-A. Absil,Optimization without retraction on the random generalized Stiefel manifold, in Proceedings of the 41st International Conference on Machine Learning, vol. 235, PMLR, 2024, pp. 49...