pith. machine review for the scientific record. sign in

arxiv: 2605.05569 · v2 · submitted 2026-05-07 · 🧮 math.OC · cs.LG

Recognition: 2 theorem links

· Lean Theorem

Stability of the Monge Map in Semi-Dual Optimal Transport

Authors on Pith no claims yet

Pith reviewed 2026-05-12 01:44 UTC · model grok-4.3

classification 🧮 math.OC cs.LG
keywords optimal transportMonge mapsemi-dual formulationsaddle-point structureconvergence conditionsconstrained optimization
0
0 comments X

The pith

The semi-dual optimal transport formulation allows Monge maps to converge under conditions that do not require the dual potential to be optimal.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines the semi-dual version of the optimal transport problem and shows it possesses a degenerate saddle-point structure. That structure makes its numerical solution equivalent to solving a constrained optimization problem. The authors then derive necessary and sufficient conditions under which the Monge map converges even when the dual potential has not reached optimality. These conditions clarify why practical algorithms typically need more iterations to stabilize the transport map than to stabilize the potential.

Core claim

The semi-dual formulation of the optimal transport problem has a degenerate saddle-point structure, and its numerical solution is equivalent to solving a constrained optimization problem. Necessary and sufficient conditions are derived for the convergence of Monge maps without requiring optimality of the dual potential.

What carries the argument

The degenerate saddle-point structure of the semi-dual formulation, which reduces numerical solution to a constrained optimization problem and separates convergence of the Monge map from optimality of the dual potential.

If this is right

  • Numerical algorithms for semi-dual optimal transport are equivalent to constrained optimization problems.
  • Monge map convergence can occur independently of dual potential optimality under the derived conditions.
  • Algorithms require more iterations to update the transport map than the potential because of the degenerate saddle-point structure.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Algorithms could be redesigned to update the map and potential on separate schedules once the conditions are checked.
  • The same separation of convergence rates may appear in other saddle-point formulations of transport problems with similar degeneracy.
  • Numerical tests on standard benchmark costs could verify whether the necessary and sufficient conditions hold in practice.

Load-bearing premise

The cost function and marginal measures satisfy the standard regularity assumptions used in optimal transport.

What would settle it

A concrete numerical example in which the Monge map fails to converge when the stated necessary and sufficient conditions hold, or converges when those conditions are violated.

Figures

Figures reproduced from arXiv: 2605.05569 by Anton Selitskiy, David Millard.

Figure 1
Figure 1. Figure 1: Convergence of transport map with divergence of potential. view at source ↗
Figure 2
Figure 2. Figure 2: ∥t − T ⋆∥ 2 L2(µ) and ∥∇ψ − ∇ψ ⋆∥ 2 L2(ν) with respect to K and ηψ/ηt. This viewpoint also explains the empirical observation of Makkuva et al. [2020] that unconstrained neural parameterizations of the transport map (e.g., ∇v in view at source ↗
Figure 5
Figure 5. Figure 5: E.4 OTM OTM uses the same max-correlation objective, but adds the published gradient-optimality penalty from optimal transport modeling [Rout et al., 2021]. In our setup this penalty has weight 0.1. The results are shown in view at source ↗
Figure 3
Figure 3. Figure 3: Convergence behavior of the transport map and potential for the OTP method. view at source ↗
Figure 4
Figure 4. Figure 4: Convergence behavior of the transport map and potential for the Monge Map method. view at source ↗
Figure 5
Figure 5. Figure 5: Convergence behavior of the transport map and potential for the Max Correlation method. view at source ↗
Figure 6
Figure 6. Figure 6: Convergence behavior of the transport map and potential for the OTM method. view at source ↗
read the original abstract

This paper shows that the semi-dual formulation of the optimal transport problem has a degenerate saddle-point structure, and that its numerical solution is equivalent to solving a constrained optimization problem. We derive necessary and sufficient conditions for the convergence of Monge maps without requiring optimality of the dual potential. This analysis helps explain why, in practice, numerical algorithms often require more iterations to update the transport map than the potential.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. This paper shows that the semi-dual formulation of the optimal transport problem has a degenerate saddle-point structure, and that its numerical solution is equivalent to solving a constrained optimization problem. We derive necessary and sufficient conditions for the convergence of Monge maps without requiring optimality of the dual potential. This analysis helps explain why, in practice, numerical algorithms often require more iterations to update the transport map than the potential.

Significance. If the derived conditions hold under the stated regularity assumptions on the cost and marginals, the work could offer a useful theoretical lens on stability and convergence rates in semi-dual OT algorithms. The link between the degenerate saddle-point structure and the observed disparity in iteration counts for the map versus the potential is a concrete practical insight that may inform algorithm design. The manuscript does not mention machine-checked proofs or reproducible code, so those strengths are not credited here.

major comments (2)
  1. Abstract: The abstract asserts a derivation of necessary and sufficient conditions for Monge-map convergence, but supplies no proof outline, no statement of assumptions, and no verification steps; therefore the support for the central claim cannot be assessed.
  2. The equivalence between the numerical solution of the semi-dual problem and the constrained optimization problem is presented as following from the degenerate saddle-point structure, but the precise mechanism by which degeneracy is exploited (and any additional technical conditions required) is not visible in the abstract and must be load-bearing for the main result.
minor comments (1)
  1. The abstract could be expanded to list the key regularity assumptions on the cost function and marginal measures that underpin the derivation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback on the abstract and the presentation of our main results. We address each major comment below and will revise the manuscript accordingly to improve clarity while preserving the paper's focus.

read point-by-point responses
  1. Referee: Abstract: The abstract asserts a derivation of necessary and sufficient conditions for Monge-map convergence, but supplies no proof outline, no statement of assumptions, and no verification steps; therefore the support for the central claim cannot be assessed.

    Authors: We agree that the abstract, as a concise summary, omits a proof outline and explicit assumptions. The full manuscript states the necessary and sufficient conditions in Theorem 3.2 under the assumptions of a C^2 strictly convex cost satisfying the twist condition and positive continuous densities for the marginals. The derivation proceeds by linearizing the semi-dual optimality conditions around the degenerate saddle point and identifying the kernel of the Hessian with respect to the map variables. We will revise the abstract to include a brief outline: 'Under standard regularity assumptions on the cost and marginals, we derive necessary and sufficient conditions for Monge map convergence by analyzing the degenerate saddle-point structure, without requiring dual potential optimality.' This revision will allow readers to better assess the central claim. revision: yes

  2. Referee: The equivalence between the numerical solution of the semi-dual problem and the constrained optimization problem is presented as following from the degenerate saddle-point structure, but the precise mechanism by which degeneracy is exploited (and any additional technical conditions required) is not visible in the abstract and must be load-bearing for the main result.

    Authors: The equivalence is established in Theorem 2.1 by showing that the semi-dual objective's saddle-point Hessian is singular in the directions of the transport map (which is the gradient of the potential), allowing reduction to a constrained optimization problem over the potential alone. This exploits the fact that variations in the map are constrained by the Monge relation, and holds under the twist condition on the cost (already stated in Section 2). We agree the abstract does not make the mechanism visible and will add a clarifying phrase: 'by exploiting the degeneracy of the saddle-point Hessian to establish equivalence to a constrained optimization problem.' No further technical conditions are required beyond those in the main text. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained

full rationale

The paper derives necessary and sufficient conditions for Monge-map convergence from the degenerate saddle-point structure of the semi-dual formulation and its equivalence to a constrained optimization problem. These steps rely on standard regularity assumptions for costs and measures rather than any fitted parameters, self-definitions, or load-bearing self-citations. No equation or claim reduces to its own inputs by construction, and the central results are presented as consequences of the problem structure without circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5350 in / 944 out tokens · 33839 ms · 2026-05-12T01:44:32.852982+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

79 extracted references · 79 canonical work pages · 1 internal anchor

  1. [1]

    , title =

    Monge, G. , title =. Histoire de l’Acad\'emie Royale des Sciences avec les M\'emoires de Math\'ematique & de Physique, Paris , year =

  2. [2]

    , title =

    Kantorovich, Leonid V. , title =. Dokl. Akad. Nauk SSSR , year =

  3. [3]

    Kantorovich, L. V. and Rubinshtein G. Sh. , title =. Dokl. Akad. Nauk SSSR , year =

  4. [4]

    Brenier, Yann , title =. Comm. Pure Appl. Math. , year =

  5. [5]

    and Brenier, Y

    Benamou, J.-D. and Brenier, Y. , title =. Numer. Math. , year =

  6. [6]

    Duke Mathematical Journal , year =

    McCann, Robert , title =. Duke Mathematical Journal , year =

  7. [7]

    Pavliotis, G. A. , title =

  8. [8]

    Lipman, Yaron and Chen, Ricky T. Q. and Ben-Hamu, Heli and Nickel, Maximilian and Le Matt , title =. https://arxiv.org/abs/2210.02747 , year =

  9. [9]

    Tyrrell and Wets Roger J.-B

    Rockafellar, R. Tyrrell and Wets Roger J.-B. , title =

  10. [10]

    , title =

    Santambrogio, F. , title =

  11. [11]

    , title =

    Chen, Yongxin and Georgiou, Tryphon T. , title =. SIAM Review , year =

  12. [12]

    Vershik, A. M. , title =. Russian Mathematical Surveys , volume =

  13. [13]

    Kantorovich, L. V. , title =. Uspekhi Matematicheskikh Nauk , volume =

  14. [14]

    Bogachev, V. I. and R\". Kolmogorov Problems on Equations for Stationary and Transition Probabilities of Diffusion Processes , journal =. 2023 , doi =

  15. [15]

    Annales de l'Institut Henri Poincaré (C) Analyse Non Linéaire , volume =

    Pratelli, Aldo , title =. Annales de l'Institut Henri Poincaré (C) Analyse Non Linéaire , volume =

  16. [16]

    Bogachev, V. I. and Kalinin, A. N. and Popova, S. N. , title =. Journal of Mathematical Sciences , volume =. 2019 , note =

  17. [17]

    and Pratelli, A

    Ambrosio, L. and Pratelli, A. , title =. Optimal Transportation and Applications , series =. 2003 , pages =

  18. [18]

    , title =

    Vaserstein, Leonid N. , title =. Problemy Peredachi Informatsii , volume =

  19. [19]

    Dobrushin, R. L. , title =. Theory of Probability and Its Applications , volume =

  20. [20]

    Fr\'echet, M , title =. C. R. Acad. Sci. Paris , volume =

  21. [21]

    Wasserstein Generative Adversarial Networks , booktitle =

    Mart. Wasserstein Generative Adversarial Networks , booktitle =. 2017 , publisher =

  22. [22]

    Improved Training of Wasserstein GANs , booktitle =

    Ishaan Gulrajani and Faruk Ahmed and Mart. Improved Training of Wasserstein GANs , booktitle =. 2017 , url =

  23. [23]

    Brian D. O. Anderson , title =. Stochastic Processes and their Applications , volume =. 1982 , doi =

  24. [24]

    Tyrrell and Wets, Roger J.-B

    Rockafellar, R. Tyrrell and Wets, Roger J.-B. , title =. Pacific Journal of Mathematics , volume =

  25. [25]

    arXiv preprint arXiv:2110.02999 , year =

    Generative Modeling with Optimal Transport Maps , author =. arXiv preprint arXiv:2110.02999 , year =

  26. [26]

    Calculus of Variations and Partial Differential Equations , volume =

    Existence, Duality, and Cyclical Monotonicity for Weak Transport Costs , author =. Calculus of Variations and Partial Differential Equations , volume =. 2019 , doi =

  27. [27]

    Tyrrell , title =

    Rockafellar, R. Tyrrell , title =. Nonlinear Operators and the Calculus of Variations , editor =. 1976 , series =

  28. [28]

    2002 , series =

    Kallenberg, Olav , title =. 2002 , series =

  29. [29]

    Advances in Neural Information Processing Systems , volume=

    Sinkhorn Distances: Lightspeed Computation of Optimal Transport , author=. Advances in Neural Information Processing Systems , volume=

  30. [30]

    The American Mathematical Monthly , volume =

    Sinkhorn, Richard , title =. The American Mathematical Monthly , volume =

  31. [31]

    Interspeech , year=

    Training-Free Voice Conversion with Factorized Optimal Transport , author=. Interspeech , year=

  32. [32]

    IEEE TPAMI , year=

    Optimal Transport for Domain Adaptation , author=. IEEE TPAMI , year=

  33. [33]

    ICASSP , year=

    X-vectors: Robust DNN embeddings for speaker recognition , author=. ICASSP , year=

  34. [34]

    Advances in Neural Information Processing Systems , volume=

    HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis , author=. Advances in Neural Information Processing Systems , volume=

  35. [35]

    International Conference on Learning Representations (ICLR) , year=

    DiffWave: A Versatile Diffusion Model for Audio Synthesis , author=. International Conference on Learning Representations (ICLR) , year=

  36. [36]

    Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) , pages=

    WaveFM: A High-Fidelity and Efficient Vocoder Based on Flow Matching , author=. Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) , pages=

  37. [37]

    2020 , journal=

    Denoising Diffusion Probabilistic Models , author=. 2020 , journal=

  38. [38]

    ICLR , year=

    Score-Based Generative Modeling through Stochastic Differential Equations , author=. ICLR , year=

  39. [39]

    Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse , year =

    Schr. Sitzungsberichte der Preussischen Akademie der Wissenschaften, Physikalisch-mathematische Klasse , year =

  40. [40]

    Chetrite, Rapha. E. Schr. The European Physical Journal H , volume =

  41. [41]

    Discrete and Continuous Dynamical Systems , year=

    A survey of the Schrödinger problem and some of its connections with optimal transport , author=. Discrete and Continuous Dynamical Systems , year=

  42. [42]
  43. [43]

    Hitchcock, F. L. , title =. Journal of Mathematics and Physics , volume =. 1941 , doi =

  44. [44]

    , title =

    Kantorovich, Leonid V. , title =. 1939 , publisher =

  45. [45]

    , title =

    Kantorovich, Leonid V. , title =. Management Science , volume =

  46. [46]

    Rubinshtein, G. S. , title =. Vestnik Leningrad University. Mathematics, Mechanics, Astronomy , volume =. 1958 , note =

  47. [47]

    , title =

    Dantzig, George B. , title =

  48. [48]

    A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers , author=. Proc. ICLR , year=

  49. [49]

    Optimal Transport

    Villani, C\'. Optimal Transport. Old and New , publisher =. 2009 , OPTkey =

  50. [50]

    Neural Optimal Transport , author=. Proc. ICLR , year=

  51. [51]

    Proceedings of the 37th International Conference on Machine Learning , pages =

    Optimal Transport Mapping via Input Convex Neural Networks , author =. Proceedings of the 37th International Conference on Machine Learning , pages =. 2020 , editor =

  52. [52]

    NeurIPS , year=

    Do Neural Optimal Transport Solvers Work? A Continuous Wasserstein-2 Benchmark , author=. NeurIPS , year=

  53. [53]

    ICML , year=

    Optimal Transport Mapping via Input Convex Neural Networks , author=. ICML , year=

  54. [54]

    arXiv preprint arXiv:1909.13082 , year=

    Wasserstein-2 Generative Networks , author=. arXiv preprint arXiv:1909.13082 , year=

  55. [55]

    SIAM Review , year=

    Semi-dual Regularized Optimal Transport , author=. SIAM Review , year=

  56. [56]

    arXiv , year=

    A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers , author=. arXiv , year=

  57. [57]

    arXiv:2112.07275 , year=

    Parameter Tuning and Model Selection in Optimal Transport with Semi-dual Brenier Formulation , author=. arXiv:2112.07275 , year=

  58. [58]

    arXiv:2602.03566 , year=

    Riemannian Neural Optimal Transport , author=. arXiv:2602.03566 , year=

  59. [59]

    arXiv , year=

    Three-Player Wasserstein GAN via Amortised Duality , author=. arXiv , year=

  60. [60]

    arXiv , year=

    Wasserstein GAN with Quadratic Transport Cost , author=. arXiv , year=

  61. [61]

    arXiv preprint arXiv:2106.03812 , year =

    Neural Monge Map Estimation and Its Applications , author =. arXiv preprint arXiv:2106.03812 , year =

  62. [62]

    ICLR , year =

    Generative Modeling through the Semi-dual Formulation of Unbalanced Optimal Transport , author =. ICLR , year =

  63. [63]

    & Jalali, A

    2-wasserstein approximation via restricted convex potentials with application to improved training for gans , author =. arXiv preprint:1902.07197 , year =

  64. [64]

    , title =

    Ambrosio, L. , title =. Optimal Transportation and Applications , series =. 2003 , pages =

  65. [65]

    NerISP , year =

    Overcoming Spurious Solutions in Semi-Dual Neural Optimal Transport: A Smoothing Approach for Learning the Optimal Transport Plan , author =. NerISP , year =

  66. [66]

    Proceedings of the Edinburgh Mathematical Society , author=

    On. Proceedings of the Edinburgh Mathematical Society , author=. 2011 , pages=. doi:10.1017/S001309150800117X , number=

  67. [67]

    The. Bull. Amer. Math. Soc. , author=. 2014 , pages=. doi:https://doi.org/10.1090/S0273-0979-2014-01459-4 , number=

  68. [68]

    arXiv preprint arXiv:1902.03642 , year=

    (q,p)-Wasserstein GANs: Comparing Ground Metrics for Wasserstein GANs , author=. arXiv preprint arXiv:1902.03642 , year=

  69. [69]

    Acta Math

    The geometry of optimal transportation , volume=. Acta Math. , author=. 1996 , pages=

  70. [70]

    , title =

    Vershik, Anatoly M. , title =. The Mathematical Intelligencer , volume =. 2013 , doi =

  71. [71]

    Annals of Mathematics , volume =

    von Neumann, John , title =. Annals of Mathematics , volume =

  72. [72]

    GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium

    Martin Heusel and Hubert Ramsauer and Thomas Unterthiner and Bernhard Nessler and G. GANs Trained by a Two Time-Scale Update Rule Converge to a Nash Equilibrium , journal =. 2017 , url =. 1706.08500 , timestamp =

  73. [73]

    Minimax estimation of smooth optimal transport maps , journal =

    H\". Minimax estimation of smooth optimal transport maps , journal =

  74. [74]

    and Kolesnikov, Aleksandr V

    Bogachev, Vladimir I. and Kolesnikov, Aleksandr V. , title =. Russian Math. Surveys , year =

  75. [75]

    , title =

    Figalli, Alessio and Kim, Young-Heon and McCann, Robert J. , title =. Archive for Rational Mechanics and Analysis , year =

  76. [76]

    , title =

    Evans, Lawrence C. , title =. Current developments in mathematics. Papers from the conference held in Cambridge, MA, USA, 1997. , year =

  77. [77]

    arXiv preprint arXiv:2201.12220 , year=

    Neural optimal transport , author=. arXiv preprint arXiv:2201.12220 , year=

  78. [78]

    Transactions on Machine Learning Research , pages=

    Neural monge map estimation and its applications , author=. Transactions on Machine Learning Research , pages=

  79. [79]

    arXiv preprint arXiv:2502.01310 , year=

    A Statistical Learning Perspective on Semi-dual Adversarial Neural Optimal Transport Solvers , author=. arXiv preprint arXiv:2502.01310 , year=