pith. sign in

arxiv: 2605.27175 · v1 · pith:V7ZOC2T2new · submitted 2026-05-26 · 🧮 math.OC · math.AP· math.FA

Polyak-Lojasiewicz Inequality for Quadratically Regularized Optimal Transport

Pith reviewed 2026-06-29 15:35 UTC · model grok-4.3

classification 🧮 math.OC math.APmath.FA
keywords optimal transportquadratic regularizationPolyak-Lojasiewicz inequalitylinear convergencedual problemerror boundssemi-discrete transport
0
0 comments X

The pith

The dual of quadratically regularized optimal transport satisfies a local Polyak-Lojasiewicz inequality near the optimum with explicit constants.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Quadratically regularized optimal transport yields sparse couplings and avoids exponential instabilities, yet its dual objective includes a positive part that blocks standard strong-concavity arguments. The paper establishes that near the optimum this positive part behaves sufficiently well to produce a local error bound and Polyak-Lojasiewicz inequality. The results cover both continuous and semi-discrete settings and supply constants that depend only on the cost and marginals. These curvature properties immediately imply linear convergence for gradient ascent, coordinate ascent, and coordinate gradient ascent on the dual, each with an explicit contraction factor.

Core claim

Under mild assumptions covering both continuous and semi-discrete transport problems, the dual QOT objective satisfies a local error bound and a Polyak-Lojasiewicz inequality, with explicit constants depending only on the problem primitives. These results are obtained by functional-analytic techniques exploiting that near the optimum, the argument of the positive part function is positive on the interior of the support of the optimal coupling.

What carries the argument

Local positivity of the argument inside the positive part function on the interior of the optimal coupling support, which supplies the quantitative curvature needed for the error bound and PL inequality.

If this is right

  • Gradient ascent on the dual converges linearly with an explicit contraction rate that depends only on problem primitives.
  • Coordinate ascent on the dual converges linearly with an explicit rate depending only on problem primitives.
  • Coordinate gradient ascent on the dual converges linearly with an explicit rate depending only on problem primitives.
  • The same curvature constants apply uniformly to both continuous and semi-discrete formulations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Quadratic regularization may therefore be algorithmically comparable to entropic regularization in terms of guaranteed linear rates.
  • The explicit constants could be used to select step sizes or stopping tolerances in practical QOT solvers.
  • Analogous positivity arguments might extend the PL property to other non-smooth regularizers in optimal transport.

Load-bearing premise

Near the optimum the argument of the positive part function stays positive on the interior of the support of the optimal coupling.

What would settle it

A concrete QOT instance in which the dual objective violates the stated local error bound or in which gradient ascent on the dual exhibits only sublinear convergence.

read the original abstract

Quadratically regularized optimal transport (QOT) is an alternative to entropic regularization that yields sparse couplings and avoids numerical instabilities due to exponential scaling. From an optimization viewpoint, the dual QOT objective is concave but features a positive part function which prevents strong concavity and reduces smoothness of optimizers. Consequently, standard arguments for linear convergence of algorithms do not apply. In this paper, we nevertheless establish a quantitative curvature property for the QOT dual. Under mild assumptions covering both continuous and semi-discrete transport problems, we prove a local error bound and a Polyak-Lojasiewicz (PL) inequality, with explicit constants depending only on the problem primitives. These results are obtained by functional-analytic techniques exploiting that near the optimum, the argument of the positive part function is positive on the interior of the support of the optimal coupling. As applications, we derive linear convergence of the gradient ascent, coordinate ascent, and coordinate gradient ascent algorithms on the dual problem, with explicit contraction rates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper establishes a local error bound and Polyak-Łojasiewicz inequality for the dual objective of quadratically regularized optimal transport (QOT). Under mild assumptions applicable to both continuous and semi-discrete problems, it derives explicit constants depending only on problem primitives via functional-analytic techniques that exploit positivity of the positive-part argument on the interior support of the optimal coupling near the optimum. This yields linear convergence guarantees (with explicit rates) for gradient ascent, coordinate ascent, and coordinate gradient ascent on the dual.

Significance. If the local positivity property holds under the stated mild assumptions, the result supplies the first explicit curvature analysis for the non-strongly-concave QOT dual, enabling rigorous linear-rate guarantees for first-order methods in a setting where entropic regularization is avoided. The explicit dependence of constants solely on primitives is a clear strength.

major comments (1)
  1. [Abstract] Abstract (and the functional-analytic argument described therein): the derivation of the local error bound and PL inequality with explicit constants rests on the claim that, near the optimum, the argument of the positive-part function is positive on the interior of the support of the optimal coupling. The paper presents this as following from the mild assumptions covering semi-discrete cases, yet atoms or boundary mass in the optimal plan could violate interior positivity; without an explicit verification or additional hypothesis in the main argument, the claimed constants and local curvature do not necessarily follow.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and for identifying a point that requires clarification in the justification of the interior positivity property. We address the major comment below and will strengthen the manuscript accordingly.

read point-by-point responses
  1. Referee: [Abstract] Abstract (and the functional-analytic argument described therein): the derivation of the local error bound and PL inequality with explicit constants rests on the claim that, near the optimum, the argument of the positive-part function is positive on the interior of the support of the optimal coupling. The paper presents this as following from the mild assumptions covering semi-discrete cases, yet atoms or boundary mass in the optimal plan could violate interior positivity; without an explicit verification or additional hypothesis in the main argument, the claimed constants and local curvature do not necessarily follow.

    Authors: We agree that the current presentation does not contain an explicit verification that the mild assumptions suffice to guarantee the required interior positivity near the optimum, particularly when the optimal plan may involve atoms (as in semi-discrete settings) or boundary mass. While the assumptions are chosen to cover both continuous and semi-discrete problems, an additional lemma or remark making this step rigorous would eliminate any ambiguity in the derivation of the explicit constants. In the revised manuscript we will insert such a verification (based on the problem primitives and the definition of the support) immediately before the main functional-analytic argument, thereby confirming that the local error bound and PL inequality hold with the stated constants under the given hypotheses. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation uses independent functional analysis under stated assumptions

full rationale

The paper proves a local error bound and PL inequality for the dual QOT objective via functional-analytic techniques that exploit positivity of the positive-part argument near the optimum on the interior support of the optimal coupling. This positivity is presented as following from the mild assumptions covering continuous and semi-discrete problems, not as a self-referential definition or fitted input. No equations reduce by construction to prior results via self-citation chains, uniqueness theorems imported from the same authors, or ansatzes smuggled through citations. The explicit constants depend only on problem primitives, and the derivation chain is self-contained without renaming known empirical patterns or calling fitted quantities predictions. This is the normal case of a proof paper whose central claims rest on external mathematical techniques rather than internal reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities beyond the stated mild assumptions and the local positivity condition on the positive-part argument.

axioms (1)
  • domain assumption Mild assumptions covering continuous and semi-discrete transport problems
    Invoked to guarantee the local positivity property near the optimum.

pith-pipeline@v0.9.1-grok · 5712 in / 1156 out tokens · 27870 ms · 2026-06-29T15:35:13.984034+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Finite-sample bounds for regularized optimal transport

    math.ST 2026-06 unverdicted novelty 7.0

    Establishes explicit finite-sample bias and variance bounds for regularized OT costs that improve prior entropic results, deliver the first quantitative bounds for L^p regularization, and yield an n^{-2/(d+4)} rate fo...

  2. Stability of Quadratically Regularized Optimal Transport

    math.OC 2026-05 unverdicted novelty 7.0

    Establishes L^∞-stability of dual potentials in QOT, yielding local Lipschitz stability of the optimal coupling support in Hausdorff distance for quadratic cost under marginal perturbations.

Reference graph

Works this paper leans on

33 extracted references · 7 canonical work pages · cited by 2 Pith papers · 1 internal anchor

  1. [1]

    Bayraktar, S

    E. Bayraktar, S. Eckstein, and X. Zhang. Stability and sample complexity of divergence regu- larized optimal transport.Bernoulli, 31(1):213–239, 2025

  2. [2]

    Blondel, V

    M. Blondel, V. Seguy, and A. Rolet. Smooth and sparse optimal transport. volume 84 of Proceedings of Machine Learning Research, pages 880–889, 2018

  3. [3]

    J. F. Bonnans and A. Shapiro.Perturbation analysis of optimization problems. Springer Series in Operations Research. Springer-Verlag, New York, 2000

  4. [4]

    Bourgain, H

    J. Bourgain, H. Brezis, and P. Mironescu. Another look at Sobolev spaces. InOptimal Control and Partial Differential Equations, pages 439–455. IOS, Amsterdam, 2001

  5. [5]

    Brezis.Functional analysis, Sobolev spaces and partial differential equations

    H. Brezis.Functional analysis, Sobolev spaces and partial differential equations. Universitext. Springer, New York, 2011

  6. [6]

    G. Carlier. On the linear convergence of the multimarginal Sinkhorn algorithm.SIAM J. Optim., 32(2):786–794, 2022

  7. [7]

    M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors,Advances in Neural Infor- mation Processing Systems, volume 26. Curran Associates, Inc., 2013

  8. [8]

    Eckstein and M

    S. Eckstein and M. Kupper. Computation of optimal transport and related hedging problems via penalization and neural networks.Appl. Math. Optim., 83(2):639–667, 2021

  9. [9]

    Eckstein and M

    S. Eckstein and M. Nutz. Convergence rates for regularized optimal transport via quantization. Math. Oper. Res., 49(2):1223–1240, 2024

  10. [10]

    Essid and J

    M. Essid and J. Solomon. Quadratically regularized optimal transport on graphs.SIAM J. Sci. Comput., 40(4):A1961–A1986, 2018

  11. [11]

    G. B. Folland.Real analysis. Pure and Applied Mathematics. John Wiley & Sons, New York, second edition, 1999

  12. [12]

    Genevay, M

    A. Genevay, M. Cuturi, G. Peyré, and F. Bach. Stochastic optimization for large-scale optimal transport. InAdvances in Neural Information Processing Systems 29, pages 3440–3448, 2016

  13. [13]

    Sample complexity of quadratically regularized optimal transport

    A. González-Sanz, E. del Barrio, and M. Nutz. Sample complexity of quadratically regularized optimal transport.arXiv:2511.09807, 2025. 31

  14. [14]

    González-Sanz, S

    A. González-Sanz, S. Eckstein, and M. Nutz. Sparse regularized optimal transport without curse of dimensionality.arXiv:2505.04721, 2025

  15. [15]

    González-Sanz and M

    A. González-Sanz and M. Nutz. Stability of quadratically regularized optimal transport.In preparation, 2026

  16. [16]

    Linear Convergence of Gradient Descent for Quadratically Regularized Optimal Transport

    A. González-Sanz, M. Nutz, and A. Riveros Valdevenito. Linear convergence of gradient descent for quadratically regularized optimal transport.arXiv:2509.08547, 2025

  17. [17]

    and Nutz, M

    A. González-Sanz and M. Nutz. Sparsity of quadratically regularized optimal transport: Scalar case.SIAM J. Math. Anal., forthcoming, 2024. Preprint arXiv:2410.03353

  18. [18]

    Grisvard.Elliptic Problems in Nonsmooth Domains, volume 24 ofMonographs and Studies in Mathematics

    P. Grisvard.Elliptic Problems in Nonsmooth Domains, volume 24 ofMonographs and Studies in Mathematics. Pitman, Boston, 1985

  19. [19]

    Gulrajani, F

    I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. Courville. Improved training of Wasserstein GANs. InProceedings of the 31st International Conference on Neural Information Processing Systems, pages 5769–5779, 2017

  20. [20]

    L. Li, A. Genevay, M. Yurochkin, and J. Solomon. Continuous regularized Wasserstein barycen- ters. In H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, editors,Advances in Neural Information Processing Systems, volume 33, pages 17755–17765. Curran Associates, Inc., 2020

  21. [21]

    D. A. Lorenz and H. Mahler. Orlicz-space regularization for optimal transport and algorithms for quadratic regularization.arXiv:1909.06082, 2019

  22. [22]

    D. A. Lorenz and H. Mahler. Orlicz space regularization of continuous optimal transport prob- lems.Appl. Math. Optim., 85(2):Paper No. 14, 33, 2022

  23. [23]

    D. A. Lorenz, P. Manns, and C. Meyer. Quadratically regularized optimal transport.Appl. Math. Optim., 83(3):1919–1949, 2021

  24. [24]

    Muzellec, R

    B. Muzellec, R. Nock, G. Patrini, and F. Nielsen. Tsallis regularized optimal transport and ecological inference.Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), 2017

  25. [25]

    M. Nutz. Introduction to entropic optimal transport, 2021. Lecture notes, Columbia University, https://www.math.columbia.edu/~mnutz/docs/EOT_lecture_notes.pdf

  26. [26]

    M. Nutz. Quadratically regularized optimal transport: existence and multiplicity of potentials. SIAM J. Math. Anal., 57(3):2622–2649, 2025

  27. [27]

    Peyré and M

    G. Peyré and M. Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends®in Machine Learning, 11(5–6):355–607, 2019

  28. [28]

    Rüschendorf and W

    L. Rüschendorf and W. Thomsen. Note on the Schrödinger equation andI-projections.Statist. Probab. Lett., 17(5):369–375, 1993

  29. [29]

    Schmitzer

    B. Schmitzer. Stabilized sparse scaling algorithms for entropy regularized transport problems. SIAM J. Sci. Comput., 41(3):A1443–A1481, 2019

  30. [30]

    Seguy, B

    V. Seguy, B. B. Damodaran, R. Flamary, N. Courty, A. Rolet, and M. Blondel. Large scale optimal transport and mapping estimation. InInternational Conference on Learning Represen- tations, 2018

  31. [31]

    A. J. Stromme. Minimum intrinsic dimension scaling for entropic optimal transport. arXiv:2306.03398, 2023

  32. [32]

    Wiesel and X

    J. Wiesel and X. Xu. Sparsity of quadratically regularized optimal transport: Bounds on con- centration and bias.SIAM J. Math. Anal., 57(6):6498–6521, 2025

  33. [33]

    Zhang, G

    S. Zhang, G. Mordant, T. Matsumoto, and G. Schiebinger. Manifold learning with sparse regularised optimal transport.arXiv:2307.09816, 2023. 32