Almost-sharp $O(k^{-1} \log k)$ convergence rate for the Sinkhorn algorithm in the asymptotically scalable case

Guillaume Wang

arxiv: 2604.26265 · v2 · submitted 2026-04-29 · 🧮 math.OC

Almost-sharp O(k⁻¹ log k) convergence rate for the Sinkhorn algorithm in the asymptotically scalable case

Guillaume Wang This is my paper

Pith reviewed 2026-05-15 07:20 UTC · model grok-4.3

classification 🧮 math.OC

keywords Sinkhorn algorithmconvergence rateoptimal transportentropic regularizationasymptotically scalablemarginal errorell1 norm

0 comments

The pith

The Sinkhorn algorithm converges at an O(k^{-1} log k) rate in ell_1 marginal error under the asymptotically scalable condition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that the Sinkhorn algorithm reaches a convergence rate of O(k^{-1} log k) in the ell_1 norm of the marginal error when the problem satisfies the asymptotically scalable case. This improves the previous best upper bound of O(k^{-1/2}) and comes close to the known Omega(k^{-1}) lower bound. The result generalizes earlier work that required strictly positive entries in the transport kernel. A reader would care because this rate indicates how many iterations are needed to achieve a target accuracy in large-scale optimal transport computations.

Core claim

In the asymptotically scalable case, the Sinkhorn algorithm produces iterates whose ell_1-norm marginal error is bounded by O(k^{-1} log k) after k iterations. This bound nearly closes the gap to the Omega(k^{-1}) lower bound and extends the analysis previously available only for the positive case.

What carries the argument

The Sinkhorn iterative scaling procedure, with its ell_1 marginal error tracked under the asymptotically scalable condition on the problem data.

If this is right

The number of iterations needed to reach marginal accuracy epsilon scales like O(epsilon^{-1} log(1/epsilon)).
The rate applies to a larger family of problems than those with strictly positive entries.
Iteration counts in practice can be chosen based on this near-linear dependence on the precision target.
The log k factor is the only remaining gap to the known lower bound.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same rate might guide stopping rules in related scaling algorithms for unbalanced transport.
Simple input diagnostics could be developed to verify the asymptotically scalable condition before running the algorithm.
Speed gains in large-scale applications would follow if the rate is used to set iteration budgets.

Load-bearing premise

The problem must satisfy the asymptotically scalable condition.

What would settle it

A specific asymptotically scalable instance where the marginal error after k steps exceeds any constant times k^{-1} log k for large k.

read the original abstract

We prove that the Sinkhorn algorithm converges at a rate of $O(k^{-1} \log k)$ in $\ell_1$-norm marginal error, in the asymptotically scalable case. This almost closes the gap between the lower bound $\Omega(k^{-1})$ (Qu et al., 2025) and the previously best known upper bound $O(k^{-1/2})$ (L\'eger, 2021), and generalizes the analysis for the positive case by Dvurechensky et al. (2018).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Wang proves an O(k^{-1} log k) Sinkhorn rate in the asymptotically scalable case and extends the positive-entry analysis cleanly.

read the letter

The key point here is the O(k^{-1} log k) rate for Sinkhorn in the asymptotically scalable case. It improves the prior upper bound and nearly closes the gap to the lower bound from Qu et al. The extension avoids the uniform positivity requirement by using asymptotic control on the support via a limiting positive scaling vector with controlled growth. Then they adapt the potential function to absorb the log k term. The derivation for the l1 error bound looks direct and free of contradictions.

Referee Report

0 major / 3 minor

Summary. The paper proves that the Sinkhorn algorithm converges at rate O(k^{-1} log k) in ℓ1-norm marginal error for the asymptotically scalable case. It nearly closes the gap to the Ω(k^{-1}) lower bound of Qu et al. (2025), improves on the O(k^{-1/2}) bound of Léger (2021), and extends the positive-case analysis of Dvurechensky et al. (2018) by replacing uniform positivity with asymptotic control on the support via a refined potential.

Significance. If the derivation holds, the result is significant: it supplies the first almost-sharp rate for a broad class of transport problems that are not uniformly positive, using a precise definition of asymptotic scalability (existence of a limiting positive scaling vector with controlled growth) and a potential that absorbs the logarithmic factor. The manuscript includes the full proof extending the 2018 argument, which strengthens the contribution.

minor comments (3)

§2.2: the definition of the asymptotically scalable case is clear, but the growth control on the limiting vector could be stated with an explicit constant to facilitate verification in applications.
Theorem 3.1: the dependence of the hidden constant on the limiting vector is not made fully explicit; adding a remark on how it scales with the support size would improve readability.
Figure 1: the log-log plot of the error would benefit from an additional reference line at slope -1 to visually confirm the claimed rate.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript and the recommendation for minor revision. The summary accurately reflects our main result on the almost-sharp O(k^{-1} log k) convergence rate for the Sinkhorn algorithm under asymptotic scalability.

read point-by-point responses

Referee: The paper proves that the Sinkhorn algorithm converges at rate O(k^{-1} log k) in ℓ1-norm marginal error for the asymptotically scalable case. It nearly closes the gap to the Ω(k^{-1}) lower bound of Qu et al. (2025), improves on the O(k^{-1/2}) bound of Léger (2021), and extends the positive-case analysis of Dvurechensky et al. (2018) by replacing uniform positivity with asymptotic control on the support via a refined potential.

Authors: We appreciate this concise summary of our contributions. The analysis indeed generalizes the potential-function argument of Dvurechensky et al. (2018) by introducing a refined potential that controls the logarithmic factor under the asymptotic scalability assumption (existence of a limiting positive scaling vector with controlled growth). This yields the stated rate while nearly matching the lower bound of Qu et al. (2025). revision: no

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper supplies a self-contained mathematical proof that extends the Dvurechensky et al. (2018) argument by replacing uniform positivity with asymptotic control on the limiting scaling vector. The O(k^{-1} log k) bound on ℓ1 marginal error is obtained via a refined potential function and direct estimates on the iterates; none of the load-bearing steps reduce to fitted parameters, self-definitions, or unverified self-citations. The lower-bound comparison to Qu et al. (2025) is external, and the analysis remains independent of the target rate itself.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms, or invented entities used in the proof.

pith-pipeline@v0.9.0 · 5382 in / 1016 out tokens · 74313 ms · 2026-05-15T07:20:22.218857+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We prove that the Sinkhorn algorithm converges at a rate of O(k^{-1} log k) in ℓ1-norm marginal error, in the asymptotically scalable case.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Ψ(f,g) = τ log ∑ e^{[-Cij+fi+gj]/τ} μi νj − μ⊤f − ν⊤g

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages · 1 internal anchor

[1]

Implications of convergence rates in Sinkhorn balancing

[Ach93] Eva Achilles. “Implications of convergence rates in Sinkhorn balancing”. In:Linear algebra and its applications187 (1993), pp. 109–112. [ALOW17] Zeyuan Allen-Zhu, Yuanzhi Li, Rafael Oliveira, and Avi Wigderson. “Much faster algorithms for matrix scaling”. In:2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE. 2017, pp....

work page arXiv 1993
[2]

A review of matrix scaling and Sinkhorn's normal form for matrices and positive maps

Cambridge University Press. 1995, pp. 31–55. [FL89] Joel Franklin and Jens Lorenz. “On the scaling of multidimensional matrices”. In: Linear Algebra and its applications114 (1989), pp. 717–735. [GN25] Promit Ghosal and Marcel Nutz. “On the convergence rate of Sinkhorn’s algorithm”. In:Mathematics of Operations Research(2025). [HHS24] Koyo Hayashi, Hiroshi...

work page internal anchor Pith review Pith/arXiv arXiv 1995
[3]

On Sinkhorn’s algorithm and choice modeling

[QGGU25] Zhaonan Qu, Alfred Galichon, Wenzhi Gao, and Johan Ugander. “On Sinkhorn’s algorithm and choice modeling”. In:Operations Research(2025). [SK67] Richard Sinkhorn and Paul Knopp. “Concerning nonnegative matrices and doubly stochastic matrices”. In:Pacific Journal of Mathematics21.2 (1967), pp. 343–348. [Sou91] George W Soules. “The rate of converge...

work page 2025

[1] [1]

Implications of convergence rates in Sinkhorn balancing

[Ach93] Eva Achilles. “Implications of convergence rates in Sinkhorn balancing”. In:Linear algebra and its applications187 (1993), pp. 109–112. [ALOW17] Zeyuan Allen-Zhu, Yuanzhi Li, Rafael Oliveira, and Avi Wigderson. “Much faster algorithms for matrix scaling”. In:2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS). IEEE. 2017, pp....

work page arXiv 1993

[2] [2]

A review of matrix scaling and Sinkhorn's normal form for matrices and positive maps

Cambridge University Press. 1995, pp. 31–55. [FL89] Joel Franklin and Jens Lorenz. “On the scaling of multidimensional matrices”. In: Linear Algebra and its applications114 (1989), pp. 717–735. [GN25] Promit Ghosal and Marcel Nutz. “On the convergence rate of Sinkhorn’s algorithm”. In:Mathematics of Operations Research(2025). [HHS24] Koyo Hayashi, Hiroshi...

work page internal anchor Pith review Pith/arXiv arXiv 1995

[3] [3]

On Sinkhorn’s algorithm and choice modeling

[QGGU25] Zhaonan Qu, Alfred Galichon, Wenzhi Gao, and Johan Ugander. “On Sinkhorn’s algorithm and choice modeling”. In:Operations Research(2025). [SK67] Richard Sinkhorn and Paul Knopp. “Concerning nonnegative matrices and doubly stochastic matrices”. In:Pacific Journal of Mathematics21.2 (1967), pp. 343–348. [Sou91] George W Soules. “The rate of converge...

work page 2025