
arxiv: 2510.21033 · v2 · submitted 2025-10-23 · 🧮 math.OC · cs.LG · math.DG

Iso-Riemannian Optimization on Learned Data Manifolds

Pith reviewed 2026-05-18 04:01 UTC · model grok-4.3

classification 🧮 math.OC · cs.LG · math.DG
keywords iso-Riemannian optimization · learned data manifolds · iso-connection · convex optimization · Riemannian gradient descent · barycentre computation · pullback manifold · first-order methods

The pith

Iso-convexity from the iso-connection lets Euclidean convex functions be optimized with convergence guarantees on learned data manifolds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

High-dimensional data typically lies on low-dimensional structures that can be endowed with a Riemannian metric learned from finite samples. Standard Riemannian optimization lacks convergence guarantees in this setting because Euclidean convex functions are generally not geodesically convex and their gradient fields are not monotone with respect to the Levi-Civita connection. The paper defines iso-convexity, iso-monotonicity, and iso-Lipschitz continuity via the iso-connection, which are compatible with the Euclidean properties on the pullback manifold. These notions support an iso-Riemannian descent algorithm whose convergence is proved for barycentre computation and for minimization of Euclidean convex objectives. Experiments on synthetic data and MNIST show that the resulting barycentres are interpretable and that the method yields efficient solutions to inverse problems in high dimensions.

Core claim

The central claim is that the iso-connection induces iso-convexity, iso-monotonicity, and iso-Lipschitz continuity that reconcile learned Riemannian geometry with Euclidean convexity. Under these conditions an iso-Riemannian descent scheme converges to minimizers of Euclidean convex functions on the pullback manifold, even though the same functions need not be geodesically convex with respect to the Levi-Civita connection; the same assumptions guarantee convergence for iso-Riemannian barycentre computation.
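
The gap between Euclidean and geodesic convexity can be checked numerically on the one-dimensional example named in the Figure 2 caption. The sketch below is illustrative only and is not taken from the paper's code. It assumes that the pullback geodesic between x and y is the straight line in φ-coordinates mapped back through φ⁻¹, and the helper names (`pullback_geodesic`, `midpoint_gap`, `straight_line`) and the endpoints 1.5 and 3.0 are hypothetical choices made for this illustration.

```python
import math

# Assumed 1D setting, taken from the Figure 2 caption:
# diffeomorphism phi(x) = sinh(x + 1) and objective f(x) = x^2 / 2.

def phi(x: float) -> float:
    return math.sinh(x + 1.0)

def phi_inv(u: float) -> float:
    return math.asinh(u) - 1.0

def pullback_geodesic(x: float, y: float, t: float) -> float:
    """Assumed geodesic: a straight line in phi-coordinates, mapped back."""
    return phi_inv((1.0 - t) * phi(x) + t * phi(y))

def straight_line(x: float, y: float, t: float) -> float:
    """Euclidean segment, used only as a point of comparison."""
    return (1.0 - t) * x + t * y

def f(x: float) -> float:
    return 0.5 * x * x

def midpoint_gap(curve, x: float, y: float) -> float:
    """f at the curve midpoint minus the mean of the endpoint values.

    A positive value means midpoint convexity fails along that curve.
    """
    return f(curve(x, y, 0.5)) - 0.5 * (f(x) + f(y))

# Hypothetical endpoints chosen for the illustration.
x, y = 1.5, 3.0
geodesic_gap = midpoint_gap(pullback_geodesic, x, y)
euclidean_gap = midpoint_gap(straight_line, x, y)
print(f"gap along pullback geodesic: {geodesic_gap:.4f}")
print(f"gap along straight line:     {euclidean_gap:.4f}")
```

For these endpoints the gap along the assumed pullback geodesic is positive while the gap along the straight line is negative, so the same f is convex in the Euclidean sense but violates midpoint convexity along the geodesic.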

What carries the argument

The iso-connection on the learned pullback manifold, which replaces the Levi-Civita connection and thereby defines iso-geodesics and parallel transport that preserve compatibility with Euclidean convexity.

If this is right

  • Iso-Riemannian barycentre computation on learned manifolds becomes feasible with explicit convergence rates.
  • First-order optimization of Euclidean convex functions over pullback manifolds admits provable efficiency guarantees.
  • Clustering and inverse problems on high-dimensional data acquire geometric interpretations and improved numerical stability.
  • The framework supplies a canonical vector field for descent that standard Riemannian theory does not identify in this setting.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Connections other than the iso-connection could resolve similar convexity mismatches in other manifold-learning pipelines.
  • The same iso-notions may extend naturally to stochastic or constrained variants of the descent scheme.
  • Comparable pullback constructions might allow second-order or non-smooth methods on the same learned geometries.

Load-bearing premise

The iso-connection must induce convexity and monotonicity that remain well-defined on the learned pullback manifold and sufficient to guarantee convergence of the first-order scheme.

What would settle it

A concrete counter-example would be a learned manifold together with an explicitly Euclidean convex objective for which the iso-Riemannian descent iterates diverge or fail to approach the known minimizer.

Figures

Figures reproduced from arXiv: 2510.21033 by Melanie Weber, Willem Diepeveen.

Figure 1: If $p_{\theta^*} = p_{\mathrm{data}}$ holds, any geodesic $\gamma^{\varphi_{\theta^*}}_{x,y}$ (orange) between data points $x, y \in \mathbb{R}^2$ (blue) moves through regions with higher likelihood – visualized by the level set curves of $p_{\mathrm{data}}$ – than the end points. In reality, we cannot expect the equality $p_{\theta^*} = p_{\mathrm{data}}$ to hold exactly. Instead, the best we can hope for is that $p_{\theta^*} \approx p_{\mathrm{data}}$ – especially if the data is inherently multimodal. Nevertheless, minimizing…
Figure 2: Under the pullback structure $(\mathbb{R}, (\cdot,\cdot)^{\varphi})$ with $\varphi(x) = \sinh(x + 1)$, the convex function $f(x) = \frac{1}{2}x^2$ is not geodesically convex. 2.3 Iso-Riemannian geometry on $\mathbb{R}^d$. The difficulties arising from the non-constant $\ell^2$-speed of geodesics for optimization – visualized in …
Figure 3: Barycentre results on the synthetic river and spiral data sets under modeled pullback structures. Both the …
Figure 4: Barycentre results for MNIST under different geometries. Both the Riemannian and iso-barycentre (Algo …
Figure 5: Clustering results on the synthetic river and spiral data sets under modeled pullback structures. Both the Rie…
Figure 6: Optimization results of an inverse problem (left) – visualized by its level sets – under two different modeled …
Figure 7: Optimization results of denoising the number four (left) under a learned geodesic submanifold constraint.
Figure 8: Linear convergence to the iso-barycentre of the two synthetic data sets, which is in line with our theory.
Figure 9: Linear convergence to the iso-barycentre on the MNIST data set, which is in line with our theory.
Figure 10: Convergence results on the synthetic data manifold. We observe monotone decay of the loss (left) and local …
Figure 11: Convergence results on the learned MNIST data manifold. We observe monotone decay of the loss (left) and …
Original abstract

High-dimensional data with intrinsic low-dimensional structure is ubiquitous in machine learning and data science. While various approaches allow one to learn a data manifold with a Riemannian structure from finite samples, performing downstream tasks such as optimization directly on these learned manifolds remains challenging. In particular, Euclidean convex functions cannot be assumed to be geodesically convex, and the associated Riemannian gradient fields are generally not monotone in the classical Riemannian sense. As a result, existing Riemannian optimization theory neither identifies a canonical vector field to use in first-order schemes nor guarantees their convergence in this setting. To address this, we introduce notions of convexity, monotonicity, and Lipschitz continuity induced by a connection different from the Levi-Civita connection, namely the recently proposed iso-connection. Within this iso-Riemannian framework, we propose an iso-Riemannian descent algorithm and provide a detailed convergence analysis. We then show, for several downstream tasks - including iso-Riemannian barycentre computation and the optimization of Euclidean convex functions over learned data manifolds - that iso-convexity, iso-monotonicity, and iso-Lipschitz continuity form the right set of assumptions to reconcile learned geometry with Euclidean convexity. Experiments on synthetic and real datasets, including MNIST, endowed with a learned pullback structure, demonstrate that our approach yields interpretable barycentres, improved clustering, and provably efficient solutions to inverse problems, even in high-dimensional settings. Taken together, these results show that iso-Riemannian optimization provides a natural geometric framework for designing and analyzing algorithms on learned data manifolds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper introduces an iso-Riemannian optimization framework on learned data manifolds by using the iso-connection (distinct from Levi-Civita) to define iso-convexity, iso-monotonicity, and iso-Lipschitz continuity. It proposes an iso-Riemannian descent algorithm, provides a convergence analysis, and applies the framework to iso-Riemannian barycentre computation and optimization of Euclidean convex functions over learned pullback manifolds, with supporting experiments on synthetic data and real datasets including MNIST.

Significance. If the central claims hold, the work supplies a theoretically grounded method for first-order optimization on data-learned Riemannian structures that reconciles with Euclidean convexity where standard geodesic convexity and monotonicity fail. The explicit convergence guarantees and downstream applications to barycentres, clustering, and inverse problems represent a concrete advance for manifold optimization in machine learning.

major comments (1)
  1. [§4] §4, Algorithm 1 and Theorem 4.3: the convergence proof proceeds from the iso-monotonicity inequality in the standard descent manner, but the manuscript should include an explicit statement (or counter-example) showing that the same Euclidean convex function yields a non-monotone vector field under the Levi-Civita connection on the learned manifold; without this, the necessity of switching to the iso-connection remains implicit rather than demonstrated.
minor comments (3)
  1. [§2] Notation for the pullback metric and iso-connection should be introduced with a short table or diagram in §2 to avoid repeated cross-references when the iso-Riemannian gradient is first used in §3.
  2. [Experiments] Figure 4 (MNIST barycentre examples): the caption does not state the dimension of the learned latent space or the number of samples used to fit the manifold; this information is needed to interpret the visual results.
  3. [§5] The statement that iso-convexity 'reconciles learned geometry with Euclidean convexity' (abstract and §5) would be strengthened by a short remark on whether the iso-convexity constant reduces to the Euclidean one when the manifold is flat.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment and the constructive comment on our manuscript. We address the major comment below and will revise the paper to incorporate the suggested clarification.

Point-by-point responses
  1. Referee: [§4] §4, Algorithm 1 and Theorem 4.3: the convergence proof proceeds from the iso-monotonicity inequality in the standard descent manner, but the manuscript should include an explicit statement (or counter-example) showing that the same Euclidean convex function yields a non-monotone vector field under the Levi-Civita connection on the learned manifold; without this, the necessity of switching to the iso-connection remains implicit rather than demonstrated.

    Authors: We thank the referee for this observation. The manuscript already states that Euclidean convex functions cannot be assumed to be geodesically convex and that the associated Riemannian gradient fields are generally not monotone in the classical Riemannian sense. However, we agree that an explicit counter-example would make the necessity of the iso-connection more concrete rather than implicit. In the revised manuscript, we will add a concise counter-example in Section 4 (immediately preceding Algorithm 1) that constructs a simple learned manifold and a Euclidean convex function for which the Levi-Civita gradient field violates monotonicity, while the corresponding iso-gradient field satisfies iso-monotonicity. This addition will directly demonstrate why the switch to the iso-connection is required for the convergence analysis to hold in the learned-manifold setting.

    revision: yes
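
For orientation, a counter-example of the kind requested can be sketched on the one-dimensional pullback structure from the Figure 2 caption. This is an editorial illustration, not the counter-example the authors promise. It assumes that the pullback geodesic from $x$ to $y$ is $\gamma(t) = \varphi^{-1}\bigl((1-t)\varphi(x) + t\varphi(y)\bigr)$, and it reads monotonicity of a vector field $X$ as $(X_{\gamma(0)}, \dot\gamma(0)) \le (X_{\gamma(1)}, \dot\gamma(1))$ along every geodesic, which is one common convention and may differ from the definition used in the paper. The endpoints $x = 1.5$, $y = 3$ are arbitrary choices.

```latex
% Setting (Figure 2 caption): \varphi(x) = \sinh(x+1), \quad f(x) = \tfrac{1}{2}x^2 on \mathbb{R}.
% Assumed geodesic: \gamma(t) = \varphi^{-1}(u(t)), \quad u(t) = (1-t)\varphi(x) + t\varphi(y).
\begin{align*}
\bigl(\operatorname{grad} f(\gamma(t)),\, \dot\gamma(t)\bigr)^{\varphi}_{\gamma(t)}
  &= \frac{d}{dt} f(\gamma(t)) = \gamma(t)\,\dot\gamma(t), \\
\dot\gamma(t)
  &= \frac{\varphi(y) - \varphi(x)}{\sqrt{1 + u(t)^2}}
   = \frac{\varphi(y) - \varphi(x)}{\cosh(\gamma(t) + 1)}, \\
\frac{d}{dt} f(\gamma(t))
  &= \frac{\gamma(t)}{\cosh(\gamma(t) + 1)}\,\bigl(\varphi(y) - \varphi(x)\bigr). \\
\intertext{For $x = 1.5$ and $y = 3$ we have $\varphi(y) - \varphi(x) > 0$ and}
\frac{\gamma(0)}{\cosh(\gamma(0) + 1)} &= \frac{1.5}{\cosh(2.5)} \approx 0.2446,
\qquad
\frac{\gamma(1)}{\cosh(\gamma(1) + 1)} = \frac{3}{\cosh(4)} \approx 0.1099 .
\end{align*}
```

The pairing of the Riemannian gradient with the geodesic velocity therefore decreases from $t = 0$ to $t = 1$, so under the stated convention the gradient field of this Euclidean convex $f$ is not monotone along that geodesic.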

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The paper defines iso-convexity, iso-monotonicity and iso-Lipschitz continuity via the iso-connection on the pullback manifold, then derives convergence of the iso-Riemannian descent scheme directly from the iso-monotonicity inequality. Barycentre and inverse-problem results follow as corollaries. No equation or claim reduces by construction to a fitted parameter, self-referential definition, or unverified self-citation chain; the central argument remains independent of the present paper's own inputs and is externally grounded in the properties of the iso-connection.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the existence and properties of the iso-connection (recently proposed elsewhere), the assumption that the learned manifold admits a pullback Riemannian structure, and the claim that iso-convexity is the appropriate notion for Euclidean convex objectives. No explicit free parameters or new invented entities are described in the abstract.

axioms (2)
  • domain assumption: The iso-connection exists and induces well-defined notions of convexity, monotonicity and Lipschitz continuity on the learned manifold.
    Invoked to replace the Levi-Civita connection when standard Riemannian convexity fails.
  • domain assumption: Euclidean convex functions on the ambient space remain iso-convex when restricted to the learned manifold.
    This is the key reconciliation assumption stated in the abstract.

pith-pipeline@v0.9.0 · 5804 in / 1368 out tokens · 25287 ms · 2026-05-18T04:01:07.190973+00:00 · methodology


