Sample complexity of unbalanced entropic OT

Clarice Poon; Francisco Andrade; Gabriel Peyré

arxiv: 2606.24987 · v1 · pith:ITYX2UILnew · submitted 2026-06-23 · 🧮 math.ST · cs.LG · stat.TH

Pith reviewed 2026-06-25 22:07 UTC · model grok-4.3

classification 🧮 math.ST · cs.LG · stat.TH

keywords unbalanced optimal transport · entropic regularization · sample complexity · optimal coupling · dual formulation · finite-sample bounds · strong convexity

The pith

Unbalanced entropic optimal transport admits high-probability finite-sample bounds on the optimal coupling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to control the sample complexity of unbalanced entropic optimal transport specifically for the optimal coupling, not merely the scalar cost value. It introduces a translation-invariant dual formulation whose intrinsic variables satisfy compactness and strong convexity. These geometric facts are then turned into explicit high-probability bounds that quantify how closely an empirical coupling approximates its population counterpart. A sympathetic reader cares because the bounds demonstrate that regularization and unbalanced penalties together reduce the number of samples needed for reliable transport estimation in high dimensions and keep the estimators compatible with fast algorithms.

Core claim

The paper claims that a translation-invariant dual formulation for unbalanced entropic OT yields compactness and strong convexity of the intrinsic dual variables, which in turn deliver high-probability finite-sample bounds on empirical couplings.

What carries the argument

The translation-invariant dual formulation that isolates intrinsic dual variables possessing compactness and strong convexity.
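
To make the role of translation concrete, here is a minimal sketch in Python. It assumes the standard discrete dual of entropic unbalanced OT with KL marginal penalties of equal strength `rho`; this page does not reproduce the paper's own formulation, so the objective, the function names `dual_objective` and `optimal_shift`, and the random test data are all illustrative assumptions. The sketch shows that shifting the potentials to `(f + lam, g - lam)` leaves the coupling term untouched and that the best shift has a closed form.

```python
import numpy as np

def dual_objective(f, g, a, b, C, eps, rho):
    """Discrete KL-penalised entropic UOT dual (assumed standard form)."""
    marg_f = -rho * np.sum(a * (np.exp(-f / rho) - 1.0))
    marg_g = -rho * np.sum(b * (np.exp(-g / rho) - 1.0))
    K = np.exp((f[:, None] + g[None, :] - C) / eps)
    coupling = -eps * np.sum(a[:, None] * b[None, :] * (K - 1.0))
    return marg_f + marg_g + coupling

def optimal_shift(f, g, a, b, rho):
    """Maximiser over lam of dual_objective(f + lam, g - lam, ...).

    Only the two marginal terms depend on lam, and they are concave in lam,
    so setting the derivative to zero gives the closed form below.
    """
    A = np.sum(a * np.exp(-f / rho))
    B = np.sum(b * np.exp(-g / rho))
    return 0.5 * rho * np.log(A / B)

# Hypothetical test data: random weights, costs, and potentials.
rng = np.random.default_rng(0)
n, m = 5, 7
a = rng.uniform(0.5, 1.5, n)
b = rng.uniform(0.5, 1.5, m)
C = rng.uniform(0.0, 1.0, (n, m))
f = rng.normal(size=n)
g = rng.normal(size=m)
eps, rho = 0.5, 1.0

lam = optimal_shift(f, g, a, b, rho)
before = dual_objective(f, g, a, b, C, eps, rho)
after = dual_objective(f + lam, g - lam, a, b, C, eps, rho)
print(before, after)  # the shifted potentials never score lower
```

Maximising over the shift removes one degree of freedom from the dual variables, which is one plausible reading of what "intrinsic" means in the summary above.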

If this is right

Regularization reduces the samples required for stable coupling estimation.
The resulting bounds remain compatible with Sinkhorn-type solvers.
The estimates apply when mass is created or destroyed in the data.
The curse of dimensionality is softened for transport estimation in high dimensions.
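
For readers unfamiliar with Sinkhorn-type solvers in the unbalanced setting, the following is a minimal sketch, assuming KL marginal penalties with a common strength `rho` and an entropic term `eps * KL(pi | a ⊗ b)`. It is the standard log-domain iteration from the unbalanced OT literature, not code from the paper; the function name `unbalanced_sinkhorn` and the toy point clouds are hypothetical.

```python
import numpy as np

def logsumexp(M, axis):
    """Numerically stable log-sum-exp along one axis."""
    mx = np.max(M, axis=axis, keepdims=True)
    return np.squeeze(mx, axis=axis) + np.log(np.sum(np.exp(M - mx), axis=axis))

def unbalanced_sinkhorn(a, b, C, eps, rho, n_iter=500):
    """Alternating dual updates for KL-penalised entropic UOT (assumed form)."""
    f = np.zeros_like(a)
    g = np.zeros_like(b)
    # The factor rho / (rho + eps) damps each update; it tends to 1 as
    # rho grows, which recovers the balanced Sinkhorn update.
    scale = eps * rho / (eps + rho)
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(n_iter):
        f = -scale * logsumexp(log_b[None, :] + (g[None, :] - C) / eps, axis=1)
        g = -scale * logsumexp(log_a[:, None] + (f[:, None] - C) / eps, axis=0)
    # Coupling recovered from the potentials.
    P = a[:, None] * b[None, :] * np.exp((f[:, None] + g[None, :] - C) / eps)
    return f, g, P

# Toy data with unequal total mass (1.0 versus 1.5).
rng = np.random.default_rng(0)
x = rng.normal(size=(30, 2))
y = rng.normal(loc=1.0, size=(40, 2))
a = np.full(30, 1.0 / 30)
b = np.full(40, 1.5 / 40)
C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
f, g, P = unbalanced_sinkhorn(a, b, C, eps=0.5, rho=1.0)
print(P.sum())  # total transported mass; need not equal either input mass
```

At convergence the marginals of `P` equal `a * exp(-f / rho)` and `b * exp(-g / rho)` rather than `a` and `b`, which is how the relaxation lets mass be created or destroyed.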

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same dual approach might yield rates for unbalanced OT without entropy.
The bounds could guide practical selection of regularization strength from sample size.
Synthetic experiments in moderate dimensions could check whether the predicted rates appear.

Load-bearing premise

The translation-invariant dual formulation produces compact and strongly convex intrinsic dual variables under the problem's marginal penalties and regularization parameters.

What would settle it

A high-dimensional numerical experiment in which the deviation between empirical and population couplings exceeds the derived high-probability bound for any choice of regularization parameter.
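
A rough sketch of how such an experiment might be set up is given below. Everything in it is an assumption made for illustration: Gaussian samples, squared Euclidean cost, KL penalties, the bounded test function `cos(x_1) * cos(y_1)`, and a single large-sample run standing in for the unknown population coupling. The paper's bound is not reproduced on this page, so the script only measures deviations; comparing them against the bound would require the paper's explicit constants.

```python
import numpy as np

def logsumexp(M, axis):
    """Numerically stable log-sum-exp along one axis."""
    mx = np.max(M, axis=axis, keepdims=True)
    return np.squeeze(mx, axis=axis) + np.log(np.sum(np.exp(M - mx), axis=axis))

def coupling_statistic(x, y, eps, rho, n_iter=200):
    """Integrate a bounded test function against the empirical UOT coupling."""
    n, m = len(x), len(y)
    log_a = np.full(n, -np.log(n))  # uniform weights 1/n
    log_b = np.full(m, -np.log(m))  # uniform weights 1/m
    C = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    f, g = np.zeros(n), np.zeros(m)
    scale = eps * rho / (eps + rho)
    for _ in range(n_iter):
        f = -scale * logsumexp(log_b[None, :] + (g[None, :] - C) / eps, axis=1)
        g = -scale * logsumexp(log_a[:, None] + (f[:, None] - C) / eps, axis=0)
    log_P = log_a[:, None] + log_b[None, :] + (f[:, None] + g[None, :] - C) / eps
    phi = np.cos(x[:, :1]) * np.cos(y[:, 0])[None, :]  # hypothetical test function
    return np.sum(np.exp(log_P) * phi)

def sample(rng, n, d):
    """Two shifted Gaussian clouds in dimension d (illustrative choice)."""
    return rng.normal(size=(n, d)), rng.normal(loc=0.5, size=(n, d))

rng = np.random.default_rng(0)
d, eps, rho = 5, 1.0, 1.0
# Large-sample proxy for the population value; it is itself random.
reference = coupling_statistic(*sample(rng, 400, d), eps, rho)

deviations = {}
for n in (25, 50, 100):
    errs = [abs(coupling_statistic(*sample(rng, n, d), eps, rho) - reference)
            for _ in range(5)]
    deviations[n] = float(np.mean(errs))
    print(n, deviations[n])
```

Repeating the loop over several dimensions and values of `eps` would show how the deviation scales, which is the quantity a finite-sample bound is meant to control.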

Original abstract

Optimal transport (OT) has become a central language for comparing probability measures, but exact balanced OT is often both too rigid for data with missing, created, or destroyed mass and subject to unfavorable high-dimensional sample complexity. Entropic regularization and unbalanced relaxations address these limitations in complementary ways. Entropy smooths the geometry, improves statistical behavior, and enables fast Sinkhorn-type algorithms, while unbalanced marginal penalties replace hard conservation constraints by divergence terms adapted to noisy empirical data. This paper studies the sample complexity of entropic unbalanced OT at the level of the optimal coupling, rather than only the scalar transport value. We develop a translation-invariant dual formulation, prove compactness and strong convexity properties for the intrinsic dual variables, and convert these geometric estimates into high-probability finite-sample bounds for empirical couplings. The results clarify why regularization is a practical necessity in machine learning applications: it softens the curse of dimensionality, reduces the number of samples needed for stable transport estimation, and keeps the resulting estimators compatible with scalable Sinkhorn-type solvers.
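
For orientation, the objects named in the abstract can be written out in a standard form. The block below is a sketch that assumes KL marginal penalties with a common strength $\rho$ and entropic strength $\varepsilon$, as in the unbalanced OT literature; the abstract does not state which divergences or normalizations the paper adopts, so the symbols $\rho$, $\varepsilon$, $c$, $f$, $g$ are assumptions rather than the paper's notation.

```latex
% Primal problem (assumed standard form): soft marginal constraints plus entropy.
\mathrm{UOT}_{\varepsilon,\rho}(\alpha,\beta)
  = \min_{\pi \ge 0} \int c \,\mathrm{d}\pi
  + \varepsilon\,\mathrm{KL}(\pi \mid \alpha\otimes\beta)
  + \rho\,\mathrm{KL}(\pi_1 \mid \alpha)
  + \rho\,\mathrm{KL}(\pi_2 \mid \beta),
\qquad
\mathrm{KL}(\mu \mid \nu)
  = \int \log\frac{\mathrm{d}\mu}{\mathrm{d}\nu}\,\mathrm{d}\mu
  - \int \mathrm{d}\mu + \int \mathrm{d}\nu .

% Dual problem over potentials f and g.
\sup_{f,\,g}\;
  -\rho \int \bigl(e^{-f/\rho} - 1\bigr)\,\mathrm{d}\alpha
  -\rho \int \bigl(e^{-g/\rho} - 1\bigr)\,\mathrm{d}\beta
  -\varepsilon \iint \bigl(e^{(f(x) + g(y) - c(x,y))/\varepsilon} - 1\bigr)
     \,\mathrm{d}\alpha(x)\,\mathrm{d}\beta(y).

% Primal-dual relation: the optimal coupling is an explicit function of the potentials.
\frac{\mathrm{d}\pi^\star}{\mathrm{d}(\alpha\otimes\beta)}(x,y)
  = \exp\!\Bigl(\frac{f^\star(x) + g^\star(y) - c(x,y)}{\varepsilon}\Bigr).
```

Under this form the coupling is determined by the dual potentials, which suggests why estimates on the dual variables can be transferred to the coupling itself.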

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives finite-sample bounds on the empirical coupling for unbalanced entropic OT by introducing a translation-invariant dual and deriving compactness plus strong convexity for the intrinsic variables.

This paper extends sample-complexity analysis to the coupling level in the unbalanced entropic OT setting. The central move is a translation-invariant dual formulation whose compactness and strong-convexity properties are turned into high-probability bounds on the empirical plan. That focus on the coupling itself, rather than only the scalar cost, is the clearest addition relative to earlier balanced or value-only results.

The approach lines up with standard concentration techniques in entropic OT and directly addresses a practical need: unbalanced penalties are common when data have missing or extra mass, and the bounds show how regularization softens the dimension dependence. The argument structure (dual reformulation to geometric estimates to concentration) is internally consistent and does not appear circular.

The main soft spot is that the bounds will depend on the precise form of the marginal penalties and the regularization parameter; explicit dependence or recovery of known balanced rates would make the contribution sharper, but this is a refinement rather than a flaw in the core claim. No load-bearing gaps are visible from the stated steps.

The work is aimed at researchers in statistical optimal transport and entropic methods for machine learning. Anyone needing finite-sample guarantees for Sinkhorn-type solvers on noisy or unbalanced data will find it useful. It deserves a serious referee because the technical steps are grounded and the target question is well-posed within the subfield.

Referee Report

0 major / 2 minor

Summary. The manuscript develops a translation-invariant dual formulation for unbalanced entropic optimal transport, establishes compactness and strong convexity properties of the intrinsic dual variables, and converts these geometric estimates into high-probability finite-sample bounds on the empirical couplings (rather than solely on the scalar transport value).

Significance. If the derivations hold, the work supplies a theoretical account of why entropic regularization is practically necessary for unbalanced OT in machine-learning settings: it softens the curse of dimensionality, lowers the sample size needed for stable coupling estimation, and preserves compatibility with Sinkhorn-type solvers. The focus on the coupling itself and the translation-invariant dual are positive features that align with standard concentration techniques in entropic OT.

minor comments (2)

[Abstract] The statement of results would benefit from a single sentence indicating the form of the obtained bounds (e.g., rate in n, dependence on regularization and marginal penalties) and the precise assumptions on the marginals or cost function.
[Introduction] The manuscript would be strengthened by an explicit statement, early in the introduction or §2, of the precise high-probability bound that is ultimately proved (including the dependence on dimension, regularization parameter, and sample size).

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity

The paper's derivation proceeds from a translation-invariant dual formulation to compactness/strong-convexity properties of intrinsic dual variables and then to high-probability sample-complexity bounds on the empirical coupling. None of these steps reduces by construction to fitted parameters, self-citations, or renamed inputs; the geometric estimates are derived from the unbalanced entropic objective and marginal penalties, and the concentration bounds follow from standard empirical-process arguments applied to the dual variables. The provided abstract and reader summary contain no equations or claims that equate a prediction to its own fitting procedure or that import uniqueness via overlapping-author citations. The central claim therefore remains independent of the target result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based solely on the abstract: the translation-invariant dual and the compactness and strong-convexity properties are introduced without an explicit list of free parameters or background axioms. The argument likely depends on standard assumptions from the entropic OT literature (e.g., existence of dual potentials), but the details are unavailable.

pith-pipeline@v0.9.1-grok · 5704 in / 1163 out tokens · 18624 ms · 2026-06-25T22:07:34.593997+00:00 · methodology
