On McDiarmid's Inequality under Dependence via Approximate Tensorization of Entropy

Valentin Roth

arXiv: 2606.12720 · v1 · pith:TIP5323Inew · submitted 2026-06-10 · 🧮 math.PR · math.ST · stat.ML · stat.TH

Pith reviewed 2026-06-27 08:03 UTC · model grok-4.3

classification 🧮 math.PR · math.ST · stat.ML · stat.TH

keywords McDiarmid inequality · approximate tensorization of entropy · entropy method · concentration inequalities · dependent random variables · Gaussian measures · Dvoretzky-Kiefer-Wolfowitz inequality · stochastic localization

The pith

Approximate tensorization of entropy implies McDiarmid's inequality for dependent variables via the entropy method, with the constant scaling as the condition number of the covariance for non-isotropic Gaussians.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the approximate tensorization of entropy property suffices to derive McDiarmid-type concentration bounds even when the underlying variables are dependent. The derivation proceeds through the entropy method and produces explicit constants controlled by the tensorization factor. For non-isotropic Gaussian vectors the bound scales with the condition number of the covariance matrix. The same framework is applied to obtain concentration results for the sign of a Gaussian vector, for dependent Erdős-Rényi graphs, and for a Dvoretzky-Kiefer-Wolfowitz inequality on the empirical distribution function that achieves the 1/sqrt(n) rate under weak dependence.

Core claim

Approximate tensorization of entropy implies McDiarmid's inequality via the Entropy Method. For X ~ N(μ, Σ) this yields a McDiarmid constant of order the condition number of Σ. The ATE property is obtained independently via stochastic localization and also follows from a more general result on the Gibbs sampler for strongly log-concave and log-smooth measures, which extends the concentration statement to that broader class.
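For reference, the classical independent-case inequality that this claim generalizes can be written out as below. The second display is only a sketch of the expected shape of the dependent bound: the symbols $C$ (the ATE constant), $K$ (an unspecified absolute constant), and $\kappa(\Sigma)$ are notation assumed here, not the paper's stated constants.

```latex
% Classical McDiarmid inequality for independent coordinates X_1, ..., X_n.
% Bounded differences: |f(x) - f(x')| \le c_i whenever x and x' differ
% only in coordinate i.
\[
  \mathbb{P}\bigl(f(X) - \mathbb{E} f(X) \ge t\bigr)
  \le \exp\!\left(-\frac{2t^2}{\sum_{i=1}^n c_i^2}\right),
  \qquad t \ge 0.
\]
% Assumed shape under ATE with constant C (K > 0 is an unspecified
% absolute constant; the paper's exact normalization may differ).
\[
  \mathbb{P}\bigl(f(X) - \mathbb{E} f(X) \ge t\bigr)
  \le \exp\!\left(-\frac{t^2}{K\, C \sum_{i=1}^n c_i^2}\right),
  \qquad
  C \lesssim \kappa(\Sigma)
    = \frac{\lambda_{\max}(\Sigma)}{\lambda_{\min}(\Sigma)}
  \ \text{ for } X \sim \mathcal{N}(\mu, \Sigma).
\]
```

In the isotropic case $\kappa(\Sigma) = 1$, so the assumed dependent form reduces to the classical one up to the absolute constant.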

What carries the argument

Approximate tensorization of entropy (ATE), the multiplicative control of joint entropy by a sum of conditional entropies that lets the entropy method produce concentration inequalities under dependence.
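Written out, the property has the following standard form. The normalization used here (constant $C \ge 1$ multiplying the sum) is an assumption about convention; the paper may state it with the constant on the other side.

```latex
% Entropy of a nonnegative function f under a probability measure \mu.
\[
  \operatorname{Ent}_\mu(f)
  = \mathbb{E}_\mu[f \log f] - \mathbb{E}_\mu[f]\,\log \mathbb{E}_\mu[f].
\]
% ATE with constant C: \mu_i(\cdot \mid x_{-i}) denotes the conditional
% law of coordinate i given all other coordinates x_{-i}.
\[
  \operatorname{Ent}_\mu(f)
  \le C \sum_{i=1}^n
  \mathbb{E}_\mu\!\left[\operatorname{Ent}_{\mu_i(\cdot \mid x_{-i})}(f)\right].
\]
% Product measures satisfy this with C = 1 (classical tensorization,
% i.e. subadditivity of entropy).
```

The entropy method then bounds each conditional entropy term for $f = e^{\lambda g}$ using the bounded-differences property of $g$, which is where the factor $C$ enters the final tail bound.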

If this is right

McDiarmid-type bounds hold for every measure obeying approximate tensorization of entropy, with the multiplicative constant fixed by the tensorization factor.
For non-isotropic Gaussians the McDiarmid constant is of order the condition number of Σ.
Concentration inequalities for sign(X) follow directly for Gaussian vectors X.
A Dvoretzky-Kiefer-Wolfowitz inequality holds at the expected 1/sqrt(n) rate for observations drawn from any measure with ATE and continuous marginal CDFs.
The same concentration applies to Erdős-Rényi graphs whose edges satisfy the ATE property.
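The DKW-type consequence can be probed numerically. The sketch below is an illustration under assumptions that are not taken from the paper: it uses a stationary Gaussian AR(1) sequence with standard normal marginals as the weakly dependent model, and the function names (`ar1_gaussian`, `sup_cdf_deviation`, `scaled_deviation`) are hypothetical. For i.i.d. data the classical bound is P(sup_x |F_n(x) - F(x)| > ε) ≤ 2 exp(-2nε²); the sketch checks whether sqrt(n) times the sup-deviation stays of order one under dependence.

```python
import math
import numpy as np

# Vectorized error function built from the standard library (no SciPy needed).
_erf = np.vectorize(math.erf)


def std_normal_cdf(x):
    """Standard normal CDF evaluated elementwise."""
    return 0.5 * (1.0 + _erf(np.asarray(x, dtype=float) / math.sqrt(2.0)))


def ar1_gaussian(n, phi, rng):
    """Stationary Gaussian AR(1) path with N(0, 1) marginals (assumed model)."""
    z = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = z[0]
    scale = math.sqrt(1.0 - phi**2)  # keeps the marginal variance equal to 1
    for k in range(1, n):
        x[k] = phi * x[k - 1] + scale * z[k]
    return x


def sup_cdf_deviation(x):
    """sup_t |F_n(t) - Phi(t)| for the empirical CDF F_n of the sample x."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    cdf = std_normal_cdf(x)
    upper = np.arange(1, n + 1) / n - cdf   # F_n just after each jump
    lower = cdf - np.arange(0, n) / n       # F_n just before each jump
    return float(max(upper.max(), lower.max()))


def scaled_deviation(n, phi, reps=200, seed=0):
    """Monte Carlo average of sqrt(n) * sup-deviation over `reps` paths."""
    rng = np.random.default_rng(seed)
    devs = [sup_cdf_deviation(ar1_gaussian(n, phi, rng)) for _ in range(reps)]
    return math.sqrt(n) * float(np.mean(devs))


if __name__ == "__main__":
    for n in (100, 400, 1600):
        print(n, scaled_deviation(n, phi=0.3))
```

If the scaled values stay roughly flat as n grows, that is consistent with a 1/sqrt(n) rate; it is a sanity check on one model, not evidence for the general theorem.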

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The stochastic-localization derivation of ATE may extend to other families of log-concave measures beyond Gaussians.
The resulting bounds could be used to analyze concentration in additional dependent graph models or in statistical procedures with weakly dependent samples.
Connections between ATE and other functional inequalities such as Poincaré or log-Sobolev may yield further concentration statements.
Numerical verification on high-condition-number covariance matrices would test whether the predicted scaling is sharp.
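A minimal way to set up such a numerical check is sketched below. Everything specific in it is an assumption made for illustration: the equicorrelated covariance family, the test statistic f(X) = mean of sign(X_i) (bounded differences c_i = 2/n), and in particular the heuristic bound 2·exp(-2t²/(κ·Σc_i²)), which simply inserts the condition number κ into the classical McDiarmid exponent. The paper's actual constant may differ.

```python
import numpy as np


def condition_number(cov):
    """lambda_max / lambda_min of a symmetric positive-definite matrix."""
    eigvals = np.linalg.eigvalsh(cov)  # returned in ascending order
    return float(eigvals[-1] / eigvals[0])


def equicorrelated_cov(n, rho):
    """Unit variances, constant off-diagonal correlation rho (assumed family)."""
    return (1.0 - rho) * np.eye(n) + rho * np.ones((n, n))


def empirical_tail(cov, t, num_samples=20000, seed=0):
    """Monte Carlo estimate of P(|f(X)| >= t) for f(x) = mean(sign(x_i)).

    X ~ N(0, cov) is symmetric about zero, so E f(X) = 0 and no
    centering term is needed.
    """
    rng = np.random.default_rng(seed)
    n = cov.shape[0]
    x = rng.multivariate_normal(np.zeros(n), cov, size=num_samples)
    f = np.sign(x).mean(axis=1)
    return float(np.mean(np.abs(f) >= t))


def heuristic_bound(cov, t):
    """Classical two-sided McDiarmid bound with the exponent divided by kappa.

    This is an assumed stand-in for the paper's bound, not its stated form.
    """
    n = cov.shape[0]
    kappa = condition_number(cov)
    sum_c_sq = n * (2.0 / n) ** 2  # c_i = 2/n for the mean of signs
    return float(min(1.0, 2.0 * np.exp(-2.0 * t**2 / (kappa * sum_c_sq))))


if __name__ == "__main__":
    for rho in (0.0, 0.1, 0.5):
        cov = equicorrelated_cov(50, rho)
        print(rho, condition_number(cov),
              empirical_tail(cov, 0.5), heuristic_bound(cov, 0.5))
```

Comparing the empirical tail against the heuristic bound across increasing κ gives a rough picture of how much slack the condition-number scaling leaves for this particular statistic.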

Load-bearing premise

The probability measures under study satisfy the approximate tensorization of entropy property.

What would settle it

Exhibit a measure satisfying approximate tensorization of entropy yet violating the corresponding McDiarmid bound, or compute the exact tail constant for a Gaussian vector whose covariance has large condition number and check whether the observed constant exceeds the predicted order.

Original abstract

We argue that dependent versions of McDiarmid's inequality are a useful but underutilized tool in mathematical statistics, learning theory and theoretical computer science. To make this point, we first highlight that approximate tensorization of entropy (ATE) implies McDiarmid's via the Entropy Method. Second, we derive McDiarmid's inequality for non-isotropic Gaussian random vectors $X \sim \mathcal N(\mu, \Sigma)$ through ATE with a constant of the order of the condition number of $\Sigma$. We both independently obtain this ATE through a simple application of stochastic localization and also discuss how a more general ATE for the Gibbs sampler due to Ascolani et al., 2026 generalizes McDiarmid's-like concentration to strongly log-concave and log-smooth probability measures. We then apply the resulting concentration inequalities to resolve a question on the concentration of $\operatorname{sign}(X)$ posed by Simone Bombari, investigate Erd\H{o}s-R\'enyi graphs under dependence and prove a Dvoretzky-Kiefer-Wolfowitz-type inequality for observations from a joint measure fulfilling ATE and continuous marginal CDFs. For the class of strongly log-concave and log-smooth measures, this result improves upon a prior Dvoretzky-Kiefer-Wolfowitz-type inequality for non-i.i.d. observations due to Bobkov and G\"otze, 2010, by establishing the expected $1/\sqrt{n}$-rate of convergence under weak dependence instead of $n^{-1/3}$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows ATE implies McDiarmid bounds, gives an explicit cond(Σ) version for non-isotropic Gaussians, and improves the DKW rate to 1/sqrt(n) under that assumption.

The central point is that approximate tensorization of entropy yields McDiarmid-type inequalities for dependent variables through the entropy method, and this produces a concrete 1/sqrt(n) rate for a DKW inequality when the joint measure satisfies ATE and has continuous marginals.

What is new is the explicit McDiarmid bound for X ~ N(μ, Σ) with constant scaling like the condition number of Σ, derived independently via stochastic localization, plus the rate improvement over Bobkov-Götze 2010 for the class of strongly log-concave log-smooth measures. The applications to concentration of sign(X) and to Erdős-Rényi graphs under dependence follow directly once the bound is available.

The paper does well in spelling out the ATE-to-McDiarmid step and in obtaining the Gaussian case without relying on prior ATE results for that setting. The DKW improvement is a clear technical step when the assumption holds.

The main soft spot is that all the concentration statements rest on the ATE property, which the paper verifies for Gaussians and for the Ascolani et al. class but which still needs to be checked case by case. This limits how far the bounds travel beyond those families. The derivations themselves look standard and do not appear to introduce extra error terms or circular steps.

This work is for people working on concentration inequalities, empirical processes, or dependent data in probability and statistics. A reader already familiar with entropy methods will see the value in the explicit constants and the rate gain.

It deserves peer review because the claims are grounded, the improvement is stated cleanly, and the assumptions are laid out explicitly.

Referee Report

0 major / 3 minor

Summary. The paper claims that approximate tensorization of entropy (ATE) implies McDiarmid-type concentration via the entropy method. It derives an ATE-based McDiarmid inequality for X ~ N(μ, Σ) with constant of order cond(Σ), obtained independently via stochastic localization (and notes a generalization from Ascolani et al. 2026 for strongly log-concave measures). Applications include concentration of sign(X), dependent Erdős-Rényi graphs, and a DKW-type inequality for ATE-satisfying measures with continuous marginals that achieves the 1/√n rate under weak dependence (improving Bobkov-Götze 2010).

Significance. If the derivations hold, the work usefully connects ATE to McDiarmid inequalities under dependence, with the Gaussian result and the improved DKW rate providing concrete tools for statistics and TCS. The independent stochastic-localization derivation of the ATE factor and the explicit applications (sign(X), ER graphs, DKW) are strengths that make the contribution self-contained and falsifiable.

minor comments (3)

[§3] §3 (Gaussian case): the statement that the ATE constant is 'of order the condition number' should include an explicit upper bound (e.g., in terms of λ_max/λ_min) rather than asymptotic order, to make the comparison with isotropic McDiarmid immediate.
[Abstract, §1] Abstract and §1: the citation 'Ascolani et al., 2026' appears to be a forward reference; confirm the year and add a note on whether the present derivation is independent or recovers the same constant.
[§5] §5 (DKW application): the proof sketch that ATE + continuous marginal CDFs yields the 1/√n rate should explicitly cite the entropy-method step that replaces the n^{-1/3} barrier of Bobkov-Götze.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment and recommendation of minor revision. The report correctly identifies the core contributions: the link from approximate tensorization of entropy (ATE) to McDiarmid-type bounds via the entropy method, the Gaussian result with condition-number dependence obtained independently via stochastic localization, the generalization via Ascolani et al. (2026), and the concrete applications yielding improved rates.

Circularity Check

0 steps flagged

No significant circularity identified

The derivation proceeds from the standard implication that approximate tensorization of entropy yields McDiarmid-type bounds via the entropy method, followed by an independent derivation of the ATE factor for non-isotropic Gaussians obtained directly via stochastic localization; this step is self-contained and does not reduce to any fitted input, self-definition, or load-bearing self-citation. The additional reference to Ascolani et al. 2026 supplies a generalization but is not required for the core Gaussian or application results, which rest on the paper's own localization argument and the entropy-method implication. All subsequent applications (sign(X), ER graphs, DKW) follow directly once ATE is granted, with no equations or claims that collapse by construction to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the ATE property holding for the target measures; this is treated as a domain assumption derived via stochastic localization or prior results.

axioms (1)

domain assumption: Approximate tensorization of entropy holds for the probability measures considered (non-isotropic Gaussians and strongly log-concave log-smooth measures).
This property is invoked to obtain McDiarmid via the entropy method and is the load-bearing premise for all subsequent inequalities.

pith-pipeline@v0.9.1-grok · 5815 in / 1113 out tokens · 34172 ms · 2026-06-27T08:03:48.782428+00:00 · methodology

Reference graph

Works this paper leans on

63 extracted references · 11 canonical work pages · 3 internal anchors

[1]

Stability Results in Learning Theory

Rakhlin, Alexander, Sayan Mukherjee, and Tomaso Poggio (2005). “Stability Results in Learning Theory”. In: Analysis and Applications

2005
[2]

Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses

Anari, Nima, Vishesh Jain, Frederic Koehler, Huy Tuan Pham, and Thuy-Duong Vuong (2024). “Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses”. In: Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

2024
[3]

Trickle-Down in Localization Schemes and Applications

Anari, Nima, Frederic Koehler, and Thuy-Duong Vuong (2024). “Trickle-Down in Localization Schemes and Applications”. In: Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC). Association for Computing Machinery

2024
[4]

Entropy contraction of the Gibbs sampler under log-concavity

Ascolani, Filippo, Hugo Lavenant, and Giacomo Zanella (2026). “Entropy contraction of the Gibbs sampler under log-concavity”. In: arXiv preprint, arXiv:2410.00858

2026
[5]

Weighted sums of certain dependent random variables

Azuma, Kazuoki (1967). “Weighted sums of certain dependent random variables”. In: Tohoku Mathematical Journal

1967
[6]

Learning theory from first principles

Bach, Francis (2024). Learning theory from first principles. MIT Press

2024
[7]

Analysis and Geometry of Markov Diffusion Operators

Bakry, Dominique, Ivan Gentil, and Michel Ledoux (2014). Analysis and Geometry of Markov Diffusion Operators. Springer Cham

2014
[8]

On mixing of Markov chains: coupling, spectral independence, and entropy factorization

Blanca, Antonio, Pietro Caputo, Zongchen Chen, Daniel Parisi, Daniel Štefankovič, and Eric Vigoda (2022). “On mixing of Markov chains: coupling, spectral independence, and entropy factorization”. In: Electronic Journal of Probability

2022
[9]

Concentration of empirical distribution functions with applications to non-i.i.d. models

Bobkov, Sergey and Friedrich Götze (2010). “Concentration of empirical distribution functions with applications to non-i.i.d. models”. In: Bernoulli 16.4, pp. 1385–1414

2010
[10]

Memorization and optimization in deep neural networks with minimum over-parameterization

Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli (2022). “Memorization and optimization in deep neural networks with minimum over-parameterization”. In: Advances in Neural Information Processing Systems (NeurIPS)

2022
[11]

Concentration Inequalities: A Nonasymptotic Theory of Independence

Boucheron, Stéphane, Gábor Lugosi, and Pascal Massart (Feb. 2013). Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press

2013
[12]

Stability and generalization

Bousquet, Olivier and André Elisseeff (2002). “Stability and generalization”. In: Journal of Machine Learning Research

2002
[13]

Lecture Notes on Entropy and Markov Chains

Caputo, Pietro (2022). Lecture Notes on Entropy and Markov Chains. Lecture Notes. Università Roma Tre

2022
[14]

Approximate tensorization of entropy at high temperature

Caputo, Pietro, Georg Menz, and Prasad Tetali (2015). “Approximate tensorization of entropy at high temperature”. In: arXiv preprint, arXiv:1405.0608

2015
[15]

Entropy factorization via curvature

Caputo, Pietro and Justin Salez (2026). “Entropy factorization via curvature”. In: Journal of Functional Analysis

2026
[16]

Theoretical Analysis of Cross-Validation for Estimating the Risk of the k-Nearest Neighbor Classifier

Celisse, Alain and Tristan Mary-Huard (2018). “Theoretical Analysis of Cross-Validation for Estimating the Risk of the k-Nearest Neighbor Classifier”. In: Journal of Machine Learning Research

2018
[17]

Concentration inequalities for random fields via coupling

Chazottes, J-R, Pierre Collet, Christof Külske, and Frank Redig (2007). “Concentration inequalities for random fields via coupling”. In: Probability Theory and Related Fields

2007
[18]

An almost constant lower bound of the isoperimetric coefficient in the KLS conjecture

Chen, Yuansi (2021). “An almost constant lower bound of the isoperimetric coefficient in the KLS conjecture”. In: Geometric and Functional Analysis

2021
[19]

Localization Schemes: A Framework for Proving Mixing Bounds for Markov Chains

Chen, Yuansi and Ronen Eldan (2022). “Localization Schemes: A Framework for Proving Mixing Bounds for Markov Chains”. In: 2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS)

2022
[20]

Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion

Chen, Zongchen, Kuikui Liu, and Eric Vigoda (2021). “Optimal Mixing of Glauber Dynamics: Entropy Factorization via High-Dimensional Expansion”. In: SIAM Journal on Computing

2021
[21]

An extension of McDiarmid’s inequality

Combes, Richard (2024). “An extension of McDiarmid’s inequality”. In: arXiv preprint, arXiv:1511.05240

2024
[22]

Transportation cost-information inequalities and applications to random dynamical systems and diffusions

Djellout, H., A. Guillin, and L. Wu (2004). “Transportation cost-information inequalities and applications to random dynamical systems and diffusions”. In: The Annals of Probability

2004
[23]

Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator

Dvoretzky, Aryeh, Jack Kiefer, and Jacob Wolfowitz (1956). “Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator”. In: The Annals of Mathematical Statistics. El Alaoui, Ahmed and Andrea Montanari (2022). “An Information-Theoretic View of Stochastic Localization”. In: IEEE Transactions on Information The...

1956
[24]

Thin shell implies spectral gap up to polylog via a stochastic localization scheme

Eldan, Ronen (2013). “Thin shell implies spectral gap up to polylog via a stochastic localization scheme”. In: Geometric and Functional Analysis

2013
[25]

Log concavity and concentration of Lipschitz functions on the Boolean hypercube

Eldan, Ronen and Omer Shamir (2022). “Log concavity and concentration of Lipschitz functions on the Boolean hypercube”. In: Journal of Functional Analysis

2022
[26]

A spectral condition for spectral gap: fast mixing in high-temperature Ising models

Eldan, Ronen, Ofer Zeitouni, and Frederic Koehler (2022). “A spectral condition for spectral gap: fast mixing in high-temperature Ising models”. In: Probability Theory and Related Fields

2022
[27]

Concentration without Independence via Information Measures

Esposito, Amedeo Roberto and Marco Mondelli (2023). “Concentration without Independence via Information Measures”. In: 2023 IEEE International Symposium on Information Theory (ISIT). Götze, Friedrich, Holger Sambale, and Arthur Sinulis (2019). “Higher order concentration for functions of weakly dependent random variables”. In: Electronic Journal of Probability

2023
[28]

Logarithmic Sobolev Inequalities

Gross, Leonard (1975). “Logarithmic Sobolev Inequalities”. In: American Journal of Mathematics

1975
[29]

Probability Inequalities for Sums of Bounded Random Variables

Hoeffding, Wassily (1963). “Probability Inequalities for Sums of Bounded Random Variables”. In: Journal of the American Statistical Association

1963
[30]

Sampling from spherical spin glasses in total variation via algorithmic stochastic localization

Huang, Brice, Andrea Montanari, and Huy Tuan Pham (2024). “Sampling from spherical spin glasses in total variation via algorithmic stochastic localization”. In: arXiv preprint arXiv:2404.15651

2024
[31]

A slightly improved bound for the KLS constant

Jambulapati, Arun, Yin Tat Lee, and Santosh S Vempala (2022). “A slightly improved bound for the KLS constant”. In: arXiv preprint arXiv:2208.11644

2022
[32]

Large deviations for sums of partly dependent random variables

Janson, Svante (2004). “Large deviations for sums of partly dependent random variables”. In: Random Structures & Algorithms

2004
[33]

Isoperimetric problems for convex bodies and a localization lemma

Kannan, Ravi, László Lovász, and Miklós Simonovits (1995). “Isoperimetric problems for convex bodies and a localization lemma”. In: Discrete & Computational Geometry

1995
[34]

Logarithmic bounds for isoperimetry and slices of convex sets

Klartag, Bo’az (2023). “Logarithmic bounds for isoperimetry and slices of convex sets”. In: arXiv preprint arXiv:2303.14938

2023
[35]

Bourgain’s slicing problem and KLS isoperimetry up to polylog

Klartag, Bo’az and Joseph Lehec (2022). “Bourgain’s slicing problem and KLS isoperimetry up to polylog”. In: Geometric and Functional Analysis

2022
[36]

Concentration of Measure Without Independence: A Unified Approach Via the Martingale Method

Kontorovich, Aryeh and Maxim Raginsky (2017). “Concentration of Measure Without Independence: A Unified Approach Via the Martingale Method”. In: Convexity and Concentration. Springer New York

2017
[37]

Concentration inequalities for dependent random variables via the martingale method

Kontorovich, Leonid (Aryeh) and Kavita Ramanan (2008). “Concentration inequalities for dependent random variables via the martingale method”. In: The Annals of Probability

2008
[38]

Kutin, Samuel (2002). Extensions to McDiarmid’s inequality when differences are bounded with high probability

2002
[39]

The Concentration of Measure Phenomenon

Ledoux, Michel (2001). The Concentration of Measure Phenomenon. American Mathematical Society

2001
[40]

Eldan’s Stochastic Localization and the KLS Hyperplane Conjecture: An Improved Lower Bound for Expansion

Lee, Yin Tat and Santosh Srinivas Vempala (2017). “Eldan’s Stochastic Localization and the KLS Hyperplane Conjecture: An Improved Lower Bound for Expansion”. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS)

2017
[41]

A Modern Theory of Cross-Validation through the Lens of Stability

Lei, Jing (2025). “A Modern Theory of Cross-Validation through the Lens of Stability”. In: arXiv preprint, arXiv:2505.23592

2025
[42]

An inequality for relative entropy and logarithmic Sobolev inequalities in Euclidean spaces

Marton, Katalin (2013). “An inequality for relative entropy and logarithmic Sobolev inequalities in Euclidean spaces”. In: Journal of Functional Analysis

2013
[43]

Logarithmic Sobolev inequalities in discrete product spaces: a proof by a transportation cost distance

Marton, Katalin (2015). “Logarithmic Sobolev inequalities in discrete product spaces: a proof by a transportation cost distance”. In: arXiv preprint, arXiv:1507.02803

2015
[44]

The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality

Massart, Pascal (1990). “The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality”. In: The Annals of Probability

1990
[45]

Concentration Inequalities and Model Selection

Massart, Pascal (2003). Concentration Inequalities and Model Selection. Springer Berlin, Heidelberg

2003
[46]

On the method of bounded differences

McDiarmid, Colin (1989). “On the method of bounded differences”. In: Surveys in Combinatorics, 1989: Invited Papers at the Twelfth British Combinatorial Conference. Cambridge University Press

1989
[47]

Concentration

McDiarmid, Colin (1998). “Concentration”. In: Probabilistic Methods for Algorithmic Discrete Mathematics. Springer Berlin Heidelberg, pp. 195–248

1998
[48]

Foundations of machine learning

Mohri, Mehryar, Afshin Rostamizadeh, and Ameet Talwalkar (2018). Foundations of machine learning. MIT Press

2018
[49]

Sampling, diffusions, and stochastic localization

Montanari, Andrea (2023). “Sampling, diffusions, and stochastic localization”. In: arXiv preprint arXiv:2305.10690

2023
[50]

Randomized Algorithms

Motwani, Rajeev and Prabhakar Raghavan (1995). Randomized Algorithms. Cambridge University Press

1995
[51]

The convex distance inequality for dependent random variables, with applications to the stochastic travelling salesman and other problems

Paulin, Daniel (2014). “The convex distance inequality for dependent random variables, with applications to the stochastic travelling salesman and other problems”. In: Electronic Journal of Probability

2014
[52]

“Concentration of Measure Inequalities in Information

Raginsky, Maxim and Igal Sason (2013). “Concentration of Measure Inequalities in Information

2013
[53]

Perspectives on Stochastic Localization

Shi, Bobby, Kevin Tian, and Matthew S Zhang (2025). “Perspectives on Stochastic Localization”. In: arXiv preprint arXiv:2510.04460

2025
[54]

A new look at independence

Talagrand, Michel (1996). “A new look at independence”. In: The Annals of Probability

1996
[55]

Generalization error bounds for classifiers trained with interdependent data

Usunier, Nicolas, Massih R. Amini, and Patrick Gallinari (2005). “Generalization error bounds for classifiers trained with interdependent data”. In: Advances in Neural Information Processing Systems (NeurIPS)

2005
[56]

On Hoeffding’s Inequality for Dependent Random Variables

Vaart, Aad W van der and Jon A Wellner (1996). Weak convergence. Springer. van de Geer, Sara (2002). “On Hoeffding’s Inequality for Dependent Random Variables”. In: Empirical Process Techniques for Dependent Data. Birkhäuser. van de Geer, Sara (2020). Empirical Process Theory. Lecture Notes. ETH Zurich

1996
[57]

High-Dimensional Probability: An Introduction with Applications in Data Science

Vershynin, Roman (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press

2018
[58]

Optimal Transport - Old and New

Villani, Cédric (2009). Optimal Transport - Old and New. Springer Berlin, Heidelberg

2009
[59]

High-Dimensional Statistics: A Non-Asymptotic Viewpoint

Wainwright, Martin J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press

2019
[60]

Convergence rate and concentration inequalities for Gibbs sampling in high dimension

Wang, Neng-Yi and Liming Wu (2014). “Convergence rate and concentration inequalities for Gibbs sampling in high dimension”. In: Bernoulli

2014
[61]

Poincaré and transportation inequalities for Gibbs measures under the Dobrushin uniqueness condition

Wu, Liming (2006). “Poincaré and transportation inequalities for Gibbs measures under the Dobrushin uniqueness condition”. In: The Annals of Probability

2006
[62]

McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds

Zhang, Rui (Ray), Xingwu Liu, Yuyi Wang, and Liwei Wang (2019). “McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds”. In: Advances in Neural Information Processing Systems (NeurIPS)

2019
[63]

On the Subgaussianity of Quantized Linear Maps: An AI-Assisted Note

Zou, Guangyi and Roman Vershynin (2026). “On the Subgaussianity of Quantized Linear Maps: An AI-Assisted Note”. In: arXiv preprint arXiv:2605.27563

2026

[1] [1]

Stability Results in Learning Theory

Alexander Rakhlin Sayan Mukherjee, Tomaso Poggio (2005). “Stability Results in Learning Theory”. In:Analysis and Applications

2005

[2] [2]

Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses

Anari, Nima, Vishesh Jain, Frederic Koehler, Huy Tuan Pham, and Thuy-Duong Vuong (2024). “Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses”. In: Proceedings of the 2024 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA)

2024

[3] [3]

Trickle-Down in Localization Schemes and Applications

Anari, Nima, Frederic Koehler, and Thuy-Duong Vuong (2024). “Trickle-Down in Localization Schemes and Applications”. In:Proceedings of the 56th Annual ACM Symposium on Theory of Computing (STOC). Association for Computing Machinery

2024

[4] [4]

Entropy contraction of the Gibbs sampler under log-concavity

Ascolani, Filippo, Hugo Lavenant, and Giacomo Zanella (2026). “Entropy contraction of the Gibbs sampler under log-concavity”. In:arXiv preprint, arXiv:2410.00858

work page arXiv 2026

[5] [5]

Weighted sums of certain dependent random variables

Azuma, Kazuoki (1967). “Weighted sums of certain dependent random variables”. In:Tohoku Math- ematical Journal

1967

[6] [6]

MIT press

Bach, Francis (2024).Learning theory from first principles. MIT press

2024

[7] [7]

Springer Cham

Bakry, Dominique, Ivan Gentil, and Michel Ledoux (2014).Analysis and Geometry of Markov Dif- fusion Operators. Springer Cham

2014

[8] [8]

On mixing of Markov chains: coupling, spectral independence, and entropy factorization

Blanca, Antonio, Pietro Caputo, Zongchen Chen, Daniel Parisi, Daniel ˇStefankoviˇ c, and Eric Vigoda (2022). “On mixing of Markov chains: coupling, spectral independence, and entropy factorization”. In:Electronic Journal of Probability

2022

[9] [9]

Concentration of empirical distribution functions with applications to non-i.i.d. models

Bobkov, Sergey and Friedrich G¨ otze (2010). “Concentration of empirical distribution functions with applications to non-i.i.d. models”. In:Bernoulli16.4, pp. 1385–1414

2010

[10] [10]

Memorization and optimization in deep neural networks with minimum over-parameterization

Bombari, Simone, Mohammad Hossein Amani, and Marco Mondelli (2022). “Memorization and optimization in deep neural networks with minimum over-parameterization”. In:Advances in Neural Information Processing Systems (NeurIPS)

2022

[11] [11]

2013).Concentration Inequalities: A Nonasymptotic Theory of Independence

Boucheron, St´ ephane, G´ abor Lugosi, and Pascal Massart (Feb. 2013).Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press

2013

[12] [12]

Stability and generalization

Bousquet, Olivier and Andr´ e Elisseeff (2002). “Stability and generalization”. In:Journal of Machine Learning Research

2002

[13] [13]

Lecture Notes

Caputo, Pietro (2022).Lecture Notes on Entropy and Markov Chains. Lecture Notes. Universit` a Roma Tre

2022

[14] [14]

Approximate tensorization of entropy at high temperature

Caputo, Pietro, Georg Menz, and Prasad Tetali (2015). “Approximate tensorization of entropy at high temperature”. In:arXiv preprint, arXiv:1405.0608

work page internal anchor Pith review Pith/arXiv arXiv 2015

[15] [15]

Entropy factorization via curvature

Caputo, Pietro and Justin Salez (2026). “Entropy factorization via curvature”. In:Journal of Func- tional Analysis

2026

[16] [16]

Theoretical Analysis of Cross-Validation for Esti- mating the Risk of thek-Nearest Neighbor Classifier

Celisse, Alain and Tristan Mary-Huard (2018). “Theoretical Analysis of Cross-Validation for Esti- mating the Risk of thek-Nearest Neighbor Classifier”. In:Journal of Machine Learning Research

2018

[17] [17]

Concentration inequalities for random fields via coupling

Chazottes, J-R, Pierre Collet, Christof K¨ ulske, and Frank Redig (2007). “Concentration inequalities for random fields via coupling”. In:Probability Theory and Related Fields. 24

2007

[18] [18]

An almost constant lower bound of the isoperimetric coefficient in the KLS conjecture

Chen, Yuansi (2021). “An almost constant lower bound of the isoperimetric coefficient in the KLS conjecture”. In:Geometric and Functional Analysis

2021

[19] [19]

Localization Schemes: A Framework for Proving Mixing Bounds for Markov Chains

Chen, Yuansi and Ronen Eldan (2022). “Localization Schemes: A Framework for Proving Mixing Bounds for Markov Chains”. In:2022 IEEE 63rd Annual Symposium on Foundations of Computer Science (FOCS)

2022

[20] [20]

Optimal Mixing of Glauber Dynamics: En- tropy Factorization via High-Dimensional Expansion

Chen, Zongchen, Kuikui Liu, and Eric Vigoda (2021). “Optimal Mixing of Glauber Dynamics: En- tropy Factorization via High-Dimensional Expansion”. In:SIAM Journal on Computing

2021

[21] [21]

An extension of McDiarmid’s inequality

Combes, Richard (2024). “An extension of McDiarmid’s inequality”. In:arXiv preprint, arXiv:1511.05240

work page arXiv 2024

[22] Djellout, H., A. Guillin, and L. Wu (2004). “Transportation cost-information inequalities and applications to random dynamical systems and diffusions”. In: The Annals of Probability.

[23] Dvoretzky, Aryeh, Jack Kiefer, and Jacob Wolfowitz (1956). “Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator”. In: The Annals of Mathematical Statistics.

El Alaoui, Ahmed and Andrea Montanari (2022). “An Information-Theoretic View of Stochastic Localization”. In: IEEE Transactions on Information The...

[24] Eldan, Ronen (2013). “Thin shell implies spectral gap up to polylog via a stochastic localization scheme”. In: Geometric and Functional Analysis.

[25] Eldan, Ronen and Omer Shamir (2022). “Log concavity and concentration of Lipschitz functions on the Boolean hypercube”. In: Journal of Functional Analysis.

[26] Eldan, Ronen, Ofer Zeitouni, and Frederic Koehler (2022). “A spectral condition for spectral gap: fast mixing in high-temperature Ising models”. In: Probability Theory and Related Fields.

[27] Esposito, Amedeo Roberto and Marco Mondelli (2023). “Concentration without Independence via Information Measures”. In: 2023 IEEE International Symposium on Information Theory (ISIT).

Götze, Friedrich, Holger Sambale, and Arthur Sinulis (2019). “Higher order concentration for functions of weakly dependent random variables”. In: Electronic Journal of Probability.

[28] Gross, Leonard (1975). “Logarithmic Sobolev Inequalities”. In: American Journal of Mathematics.

[29] Hoeffding, Wassily (1963). “Probability Inequalities for Sums of Bounded Random Variables”. In: Journal of the American Statistical Association.

[30] Huang, Brice, Andrea Montanari, and Huy Tuan Pham (2024). “Sampling from spherical spin glasses in total variation via algorithmic stochastic localization”. In: arXiv preprint arXiv:2404.15651.

[31] Jambulapati, Arun, Yin Tat Lee, and Santosh S Vempala (2022). “A slightly improved bound for the KLS constant”. In: arXiv preprint arXiv:2208.11644.

[32] Janson, Svante (2004). “Large deviations for sums of partly dependent random variables”. In: Random Structures & Algorithms.

[33] Kannan, Ravi, László Lovász, and Miklós Simonovits (1995). “Isoperimetric problems for convex bodies and a localization lemma”. In: Discrete & Computational Geometry.

[34] Klartag, Bo’az (2023). “Logarithmic bounds for isoperimetry and slices of convex sets”. In: arXiv preprint arXiv:2303.14938.

[35] Klartag, Bo’az and Joseph Lehec (2022). “Bourgain’s slicing problem and KLS isoperimetry up to polylog”. In: Geometric and Functional Analysis.

[36] Kontorovich, Aryeh and Maxim Raginsky (2017). “Concentration of Measure Without Independence: A Unified Approach Via the Martingale Method”. In: Convexity and Concentration. Springer New York.

[37] Kontorovich, Leonid (Aryeh) and Kavita Ramanan (2008). “Concentration inequalities for dependent random variables via the martingale method”. In: The Annals of Probability.

[38] Kutin, Samuel (2002). Extensions to McDiarmid’s inequality when differences are bounded with high probability.

[39] Ledoux, Michel (2001). The Concentration of Measure Phenomenon. American Mathematical Society.

[40] Lee, Yin Tat and Santosh Srinivas Vempala (2017). “Eldan’s Stochastic Localization and the KLS Hyperplane Conjecture: An Improved Lower Bound for Expansion”. In: 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS).

[41] Lei, Jing (2025). “A Modern Theory of Cross-Validation through the Lens of Stability”. In: arXiv preprint, arXiv:2505.23592.

[42] Marton, Katalin (2013). “An inequality for relative entropy and logarithmic Sobolev inequalities in Euclidean spaces”. In: Journal of Functional Analysis.

[43] Marton, Katalin (2015). “Logarithmic Sobolev inequalities in discrete product spaces: a proof by a transportation cost distance”. In: arXiv preprint, arXiv:1507.02803.

[44] Massart, Pascal (1990). “The Tight Constant in the Dvoretzky-Kiefer-Wolfowitz Inequality”. In: The Annals of Probability.

[45] Massart, Pascal (2003). Concentration Inequalities and Model Selection. Springer Berlin, Heidelberg.

[46] McDiarmid, Colin (1989). “On the method of bounded differences”. In: Surveys in Combinatorics, 1989: Invited Papers at the Twelfth British Combinatorial Conference. Cambridge University Press.

[47] McDiarmid, Colin (1998). “Concentration”. In: Probabilistic Methods for Algorithmic Discrete Mathematics. Springer Berlin Heidelberg, pp. 195–248.

[48] Mohri, Mehryar, Afshin Rostamizadeh, and Ameet Talwalkar (2018). Foundations of machine learning. MIT Press.

[49] Montanari, Andrea (2023). “Sampling, diffusions, and stochastic localization”. In: arXiv preprint arXiv:2305.10690.

[50] Motwani, Rajeev and Prabhakar Raghavan (1995). Randomized Algorithms. Cambridge University Press.

[51] Paulin, Daniel (2014). “The convex distance inequality for dependent random variables, with applications to the stochastic travelling salesman and other problems”. In: Electronic Journal of Probability.

[52] Raginsky, Maxim and Igal Sason (2013). “Concentration of Measure Inequalities in Information

[53] Shi, Bobby, Kevin Tian, and Matthew S Zhang (2025). “Perspectives on Stochastic Localization”. In: arXiv preprint arXiv:2510.04460.

[54] Talagrand, Michel (1996). “A new look at independence”. In: The Annals of Probability.

[55] Usunier, Nicolas, Massih R. Amini, and Patrick Gallinari (2005). “Generalization error bounds for classifiers trained with interdependent data”. In: Advances in Neural Information Processing Systems (NeurIPS).

[56] Vaart, Aad W van der and Jon A Wellner (1996). Weak convergence. Springer.

van de Geer, Sara (2002). “On Hoeffding’s Inequality for Dependent Random Variables”. In: Empirical Process Techniques for Dependent Data. Birkhäuser.

van de Geer, Sara (2020). Empirical Process Theory. Lecture Notes. ETH Zurich.

[57] Vershynin, Roman (2018). High-Dimensional Probability: An Introduction with Applications in Data Science. Cambridge University Press.

[58] Villani, Cédric (2009). Optimal Transport - Old and New. Springer Berlin, Heidelberg.

[59] Wainwright, Martin J. (2019). High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press.

[60] Wang, Neng-Yi and Liming Wu (2014). “Convergence rate and concentration inequalities for Gibbs sampling in high dimension”. In: Bernoulli.

[61] Wu, Liming (2006). “Poincaré and transportation inequalities for Gibbs measures under the Dobrushin uniqueness condition”. In: The Annals of Probability.

[62] Zhang, Rui (Ray), Xingwu Liu, Yuyi Wang, and Liwei Wang (2019). “McDiarmid-Type Inequalities for Graph-Dependent Variables and Stability Bounds”. In: Advances in Neural Information Processing Systems (NeurIPS).

[63] Zou, Guangyi and Roman Vershynin (2026). “On the Subgaussianity of Quantized Linear Maps: An AI-Assisted Note”. In: arXiv preprint arXiv:2605.27563.