
arxiv: 2604.08283 · v1 · submitted 2026-04-09 · 🧮 math.AP · cs.NA · math.NA

A convergence rate for the entropic JKO scheme

Pith reviewed 2026-05-10 17:40 UTC · model grok-4.3

classification 🧮 math.AP · cs.NA · math.NA
keywords JKO scheme · entropic regularization · Wasserstein gradient flows · convergence rate · convexity · PDEs

The pith

The entropic JKO scheme converges to the original PDE solution at a specific rate when the regularization parameter and time step both approach zero under convexity assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that replacing the Wasserstein distance with its entropic regularization in the JKO scheme still yields convergence to the target PDE. The convergence holds with an explicit rate as both the ratio α (with regularization ε = ατ) and the time step τ go to zero. The result follows from a new inequality that bounds how much the entropic iterates differ from the classical ones. This matters because the entropic version is much cheaper to compute numerically, so the result justifies its use for simulating gradient flows in the space of probability measures.
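
One reason the entropic version is cheaper: for discrete measures, the entropic transport problem can be solved by Sinkhorn's matrix-scaling iterations. Below is a minimal pure-Python sketch of that inner solve only (not a full JKO step, and not the paper's code). The function name `sinkhorn_plan`, the cost normalization `|x - y|^2 / 2`, and the toy data are illustrative assumptions.

```python
import math

def sinkhorn_plan(mu, nu, cost, eps, n_iter=200):
    """Entropic transport plan between discrete measures mu and nu.

    Hypothetical helper: alternately rescales rows and columns of the
    Gibbs kernel K = exp(-cost / eps) until both marginals match.
    """
    n, m = len(mu), len(nu)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Row scaling: enforce the first marginal mu.
        u = [mu[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Column scaling: enforce the second marginal nu.
        v = [nu[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy data (assumed): three grid points on [0, 1], quadratic cost.
x = [0.0, 0.5, 1.0]
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]
cost = [[0.5 * (xi - yj) ** 2 for yj in x] for xi in x]

plan = sinkhorn_plan(mu, nu, cost, eps=0.5)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(len(mu))) for j in range(len(nu))]
```

Each iteration costs only two matrix-vector products, which is the practical appeal; smaller values of `eps` typically need more iterations and, in this naive form, can underflow.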

Core claim

Under convexity assumptions, the entropic JKO scheme with ε = α τ converges to the solution of the initial PDE with a certain rate as α and τ tend to zero. This is a consequence of a new bound between the classical and entropic JKO schemes.
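
For orientation, the two schemes being compared can be written as follows. This is a sketch in standard notation rather than the paper's own: the energy $F$, the iterate names, the reference measure $R$, and the normalization of the entropic cost $T_\varepsilon$ are assumptions, and the paper's conventions may differ by constants.

```latex
% Classical JKO step with time step \tau and energy F (notation assumed):
\rho_{k+1}^{\tau} \in \operatorname*{argmin}_{\rho}
  \left\{ \frac{W_2^2(\rho_k^{\tau}, \rho)}{2\tau} + F(\rho) \right\}.

% Entropic JKO step: W_2^2 / 2 is replaced by an entropic cost T_\varepsilon,
% with \varepsilon = \alpha \tau:
\rho_{k+1}^{\tau,\varepsilon} \in \operatorname*{argmin}_{\rho}
  \left\{ \frac{T_{\varepsilon}(\rho_k^{\tau,\varepsilon}, \rho)}{\tau} + F(\rho) \right\},
\qquad
T_{\varepsilon}(\mu,\nu) = \min_{\gamma \in \Pi(\mu,\nu)}
  \left\{ \int \frac{|x-y|^2}{2}\, d\gamma(x,y)
          + \varepsilon\, H(\gamma \mid R) \right\}.
% H(\gamma \mid R): relative entropy of the plan \gamma with respect to a
% fixed reference measure R on the product space (choice of R assumed).
```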

What carries the argument

A new bound between the classical JKO scheme and its entropic counterpart, which quantifies their difference in terms of α and τ.

Load-bearing premise

Convexity assumptions on the energy functional are needed for the new bound between classical and entropic JKO schemes to hold.

What would settle it

Numerical computation of the error between the entropic JKO iterates and the true PDE solution for a convex energy, measured as α and τ decrease, would confirm or refute the claimed rate.
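
Such a check usually ends with a log-log fit of error against step size. The sketch below estimates an empirical order by least squares. The helper name `empirical_order` is hypothetical, and the error values are synthetic placeholders that decay like √τ by construction; they are not measurements from the paper.

```python
import math

def empirical_order(steps, errors):
    """Least-squares slope of log(error) against log(step).

    If error ~ C * step**p, the returned slope approximates p.
    """
    xs = [math.log(h) for h in steps]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic placeholder data (assumed), built to decay like sqrt(tau).
taus = [0.1, 0.05, 0.025, 0.0125]
synthetic_errors = [3.0 * math.sqrt(t) for t in taus]

order = empirical_order(taus, synthetic_errors)
```

In a real experiment the errors would come from comparing entropic JKO iterates with a reference solution, and α would be tied to τ (or varied separately) so that the two contributions to the error can be told apart.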

Original abstract

The so-called JKO scheme, named after Jordan, Kinderlehrer and Otto, provides a variational way to construct discrete time approximations of certain partial differential equations (PDEs) appearing as gradient flows in the space of probability measures equipped with the Wasserstein metric. The method consists of an implicit Euler scheme, which can be implemented numerically. Yet, in practice, evaluating the Wasserstein distance can be numerically expensive. To address this problem, a common strategy introduced by Peyré in 2015 and which has been shown to produce faster computations, is to replace the Wasserstein distance with its entropic regularization, also known as the Schrödinger cost. In 2026, the first author, Hraivoronska and Santambrogio, proved that if the regularization parameter $\varepsilon$ is proportional to the time step $\tau$, that is, $\varepsilon = \alpha \tau$ for some $\alpha > 0$, then as $\tau \to 0$, this change results in adding to the limiting PDE the additional linear diffusion term $\frac{\alpha}{2} \Delta \rho$. Our goal in this article is to provide a convergence rate under convexity assumptions between the entropic JKO scheme and the solution of the initial PDE as both $\alpha$ and $\tau$ tend to zero. This will appear as a consequence of a new bound between the classical and entropic JKO schemes.
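
In formulas, the two limiting equations mentioned in the abstract can be sketched as below. The first is the usual form of a Wasserstein gradient flow of an energy $F$; writing the first variation as $\frac{\delta F}{\delta \rho}$ is a notational assumption, and the paper's precise class of energies is not reproduced here.

```latex
% Limit of the classical JKO scheme as \tau \to 0 (F is the driving energy):
\partial_t \rho = \operatorname{div}\!\left( \rho \, \nabla \frac{\delta F}{\delta \rho} \right).

% Limit of the entropic scheme with \varepsilon = \alpha\tau and fixed \alpha > 0,
% as described in the abstract:
\partial_t \rho = \operatorname{div}\!\left( \rho \, \nabla \frac{\delta F}{\delta \rho} \right)
  + \frac{\alpha}{2} \Delta \rho.
```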

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proves a quantitative convergence rate for the entropic JKO scheme to the solution of the underlying PDE, under convexity assumptions on the driving energy. The rate is obtained as a consequence of a new bound comparing the classical JKO iterates to their entropic counterparts; the known convergence rate of the classical scheme is then used to control the distance to the continuous limit as both the regularization parameter α and the time step τ tend to zero.
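
The two-step structure described in this summary amounts to a triangle inequality. In the sketch below, $d$ stands for whichever distance the paper measures the error in (for instance $W_2$, which is an assumption), and the iterate names are illustrative.

```latex
% \rho(t): PDE solution; \rho_k^{\tau}: classical JKO iterate;
% \rho_k^{\tau,\alpha}: entropic JKO iterate with \varepsilon = \alpha\tau.
d\big(\rho_k^{\tau,\alpha}, \rho(k\tau)\big)
  \le \underbrace{d\big(\rho_k^{\tau,\alpha}, \rho_k^{\tau}\big)}_{\text{new comparison bound}}
  + \underbrace{d\big(\rho_k^{\tau}, \rho(k\tau)\big)}_{\text{known classical rate}}.
```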

Significance. If the result holds, the work supplies an explicit error estimate that justifies entropic regularization for numerical approximation of Wasserstein gradient flows when α is taken small. The comparison bound between the two discrete schemes appears to be the main technical novelty and may be of independent interest in optimal transport. The argument structure is standard and leverages existing theory without introducing new ad-hoc parameters.
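
To get a feel for why α must be small, one can look at a toy case where everything is explicit. The sketch below is an illustration under stated assumptions, not the paper's setting: dimension one, potential energy $F(\rho) = \int \frac{\lambda x^2}{2}\, d\rho$ with $\lambda > 0$, and centered Gaussian initial data. It compares only the two limiting PDEs (with and without the extra term $\frac{\alpha}{2}\Delta\rho$), not the discrete schemes, and so says nothing about the paper's actual rate. The function names are hypothetical.

```python
import math

def variance_with_diffusion(t, v0, lam, alpha):
    """Variance at time t of the Gaussian solution of
    d/dt rho = d/dx(lam * x * rho) + (alpha / 2) * d2/dx2 rho,
    which satisfies v'(t) = -2 * lam * v(t) + alpha, v(0) = v0.
    """
    stationary = alpha / (2.0 * lam)
    return stationary + (v0 - stationary) * math.exp(-2.0 * lam * t)

def w2_gap(t, v0, lam, alpha):
    """W2 distance at time t between the solutions with and without the
    extra diffusion. Both are Gaussians with the same mean, so in one
    dimension W2 is the difference of standard deviations.
    """
    s_alpha = math.sqrt(variance_with_diffusion(t, v0, lam, alpha))
    s_zero = math.sqrt(variance_with_diffusion(t, v0, lam, 0.0))
    return abs(s_alpha - s_zero)

# Gap at time t = 1 with lam = 1, for a Dirac start (v0 = 0)
# and for a unit-variance start (v0 = 1).
for alpha in (1e-2, 1e-3, 1e-4):
    print(alpha, w2_gap(1.0, 0.0, 1.0, alpha), w2_gap(1.0, 1.0, 1.0, alpha))
```

In this toy case the gap scales like √α when the initial datum is a Dirac mass and like α when it has positive variance, which suggests that how an error bound depends on α can be sensitive to the initial data.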

major comments (1)
  1. [§3] §3 (main theorem): the precise dependence of the convergence rate on α and τ (including any factors depending on the convexity modulus or initial data) must be stated explicitly in the theorem; the current phrasing 'a certain rate' is too vague for a quantitative result.
minor comments (2)
  1. [Abstract] Abstract: replace 'a certain rate' with a brief indication of the order (e.g., O(√τ + α)) to give readers an immediate sense of the result.
  2. [Introduction] Notation: ensure the entropic cost and the relation ε = ατ are introduced with the same symbols used in the main statements.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading of our manuscript and the positive overall assessment. We address the major comment below and will implement the requested clarification.

Point-by-point responses
  1. Referee: [§3] §3 (main theorem): the precise dependence of the convergence rate on α and τ (including any factors depending on the convexity modulus or initial data) must be stated explicitly in the theorem; the current phrasing 'a certain rate' is too vague for a quantitative result.

    Authors: We agree that the main theorem statement should explicitly display the dependence of the error bound on α, τ, the convexity modulus of the driving energy, and suitable norms of the initial datum. The proof already yields an explicit rate (obtained by combining the new comparison estimate between classical and entropic JKO schemes with the known rate for the classical scheme), but the theorem was phrased concisely. In the revised manuscript we will restate the theorem with the precise quantitative bound, including the explicit dependence on all relevant quantities. This is a minor textual clarification that leaves the arguments unchanged.

    revision: yes

Circularity Check

0 steps flagged

No significant circularity; minor self-citation not load-bearing

Full rationale

The paper establishes a new quantitative bound between classical and entropic JKO schemes under convexity assumptions on the energy, then combines this bound with the known convergence rate of the classical JKO scheme to the target PDE. This yields the claimed rate for the entropic scheme to the original PDE as both α and τ tend to zero. The self-citation in the abstract to prior work by the first author et al. (on the fixed-α limit adding a diffusion term) provides context and motivation but is not invoked as a load-bearing step in the derivation of the new bound or the rate; the central argument relies on an independent comparison and standard external results on classical JKO convergence. No self-definitional reductions, fitted predictions, or ansatz smuggling appear in the provided structure.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, invented entities, or non-standard axioms are mentioned; the work relies on standard convexity assumptions from the field of optimal transport and gradient flows.




Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages

  1. [1] S. Adams, N. Dirr, M. A. Peletier, and J. Zimmer. From a large-deviations principle to the Wasserstein gradient flow: a new micro-macro passage. Communications in Mathematical Physics, 307:791–815, 2011.
  2. [2] L. Ambrosio and N. Gigli. A User's Guide to Optimal Transport. In Modelling and Optimisation of Flows on Networks, pages 1–155. 2013.
  3. [3] L. Ambrosio, N. Gigli, and G. Savaré. Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2005.
  4. [4] A. Baradat, A. Hraivoronska, and F. Santambrogio. Using Sinkhorn in the JKO scheme adds linear diffusion, 2025.
  5. [5] H. H. Bauschke and P. L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, New York, 2nd edition, 2017.
  6. [6] J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.
  7. [7] J.-D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2):A1111–A1138, 2015.
  8. [8] J.-D. Benamou, G. Carlier, and L. Nenna. Generalized incompressible flows, multi-marginal transport and Sinkhorn algorithm. Numerische Mathematik, 142(1):33–54, 2019.
  9. [9] Y. Brenier. Polar factorization and monotone rearrangement of vector-valued functions. Communications on Pure and Applied Mathematics, 44(4):375–417, 1991.
  10. [10] H. Brezis. Functional analysis, Sobolev spaces and partial differential equations. New York, NY: Springer, 2011.
  11. [11] G. Carlier, V. Duval, G. Peyré, and B. Schmitzer. Convergence of Entropic Schemes for Optimal Transport and Gradient Flows. SIAM Journal on Mathematical Analysis, 49(2):1385–1418, 2017.
  12. [12] G. Carlier, K. Eichinger, and A. Kroshnin. Entropic-Wasserstein barycenters: PDE characterization, regularity, and CLT. SIAM J. Math. Anal., 53(5):5880–5914, 2021.
  13. [13] G. Conforti and L. Tamanini. A formula for the time derivative of the entropic cost and applications. J. Funct. Anal., 280(11), 2021.
  14. [14] M. Cuturi. Sinkhorn Distances: Lightspeed Computation of Optimal Transport. In Advances in Neural Information Processing Systems, volume 26, 2013.
  15. [15] M. H. Duong, V. Laschos, and M. Renger. Wasserstein gradient flows from large deviations of many-particle limits. ESAIM Control Optim. Calc. Var., 19(4):1166–1188, 2013.
  16. [16] M. Erbar, J. Maas, and D. R. M. Renger. From large deviations to Wasserstein gradient flows in multiple dimensions. Electron. Commun. Probab., 20, 2015.
  17. [17] I. Gentil, C. Léonard, and L. Ripani. About the analogy between optimal transport and minimal entropy. Annales de la Faculté des sciences de Toulouse : Mathématiques, Ser. 6, 26(3):569–600, 2017.
  18. [18] R. Jordan, D. Kinderlehrer, and F. Otto. The variational formulation of the Fokker–Planck equation. SIAM Journal on Mathematical Analysis, 29(1):1–17, 1998.
  19. [19] O. Kallenberg. Foundations of Modern Probability. Springer, New York, 2nd edition, 2002.
  20. [20] C. Léonard. From the Schrödinger problem to the Monge–Kantorovich problem. Journal of Functional Analysis, 262(4):1879–1920, 2012.
  21. [21] C. Léonard. A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete and Continuous Dynamical Systems, 34(4):1533–1574, 2014.
  22. [22] H. Malamut and M. Sylvestre. Convergence rates of the regularized optimal transport: disentangling suboptimality and entropy. SIAM J. Math. Anal., 57(3):2533–2558, 2025.
  23. [23] R. J. McCann. A convexity principle for interacting gases. Adv. Math., 128(1):153–179, 1997.
  24. [24] F. Otto. Evolution of microstructure in unstable porous media flow: A relaxational approach. Communications on Pure and Applied Mathematics, 52(7):873–915, 1999.
  25. [25] G. Peyré. Entropic Approximation of Wasserstein Gradient Flows. SIAM Journal on Imaging Sciences, 8(4):2323–2351, 2015.
  26. [26] F. Santambrogio. Optimal transport for applied mathematicians. Birkhäuser, NY, 55(58-63):94, 2015.
  27. [27] R. Sinkhorn. Diagonal Equivalence to Matrices with Prescribed Row and Column Sums. The American Mathematical Monthly, 74(4):402–405, 1967.

Aymeric Baradat and Sofiane Cherf, Université Claude Bernard Lyon 1, CNRS, Centrale Lyon, INSA Lyon, Université Jean Monnet, ICJ UMR5208, 43 bd du 11 Novembre 1918, 69622 Villeurbanne, France. Email address: {baradat,cherf}@math.univ-lyon1.fr