pith. sign in

arxiv: 2605.03100 · v2 · submitted 2026-05-04 · 🧮 math.PR · math.ST· stat.TH

Berry-Esseen bounds for multivariate martingale difference sequences in the Kolmogorov distance

Pith reviewed 2026-05-08 17:31 UTC · model grok-4.3

classification 🧮 math.PR math.STstat.TH
keywords Berry-Esseen boundsmartingale difference sequencesKolmogorov distanceGaussian approximationhigh-dimensional probabilityMarkov chainscentral limit theorem
0
0 comments X

The pith

Martingale difference sequences in high dimensions admit Gaussian approximations in Kolmogorov distance at rate n to the power -1/4 with only polylog dependence on dimension.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives Gaussian approximation bounds for the normalized sums of finite-length martingale difference sequences taking values in d-dimensional space. These bounds are measured in the Kolmogorov distance, which is the supremum of the absolute difference between the cumulative distribution functions of the sum and a matching Gaussian. The error scales as the fourth root of one over the sequence length and grows only polylogarithmically with the dimension under suitable conditions on the martingales. The results are then applied to obtain high-dimensional Berry-Esseen bounds over hyper-rectangles when the martingales arise from Markov chains.

Core claim

We derive new Gaussian approximation for finite martingale difference sequences in R^d with respect to the Kolmogorov distance. Under appropriate conditions, our bounds exhibit a dependence of order n^{-1/4} on the length of the sequence and of order polylog(d) on the dimension. As an application, we derive a high-dimensional Berry-Esseen bound over hyper-rectangles for martingale sequences generated from Markov chains.

What carries the argument

Kolmogorov distance between the law of the normalized martingale sum and a centered Gaussian with the same covariance, serving as the error metric that yields the stated rates.

If this is right

  • High-dimensional Berry-Esseen bounds hold over hyper-rectangles for martingales arising from Markov chains.
  • The approximation rate depends on sequence length only through the fourth root and on dimension only through polylog factors.
  • The bounds apply to finite-length sequences without requiring asymptotic regimes.
  • The results provide a quantitative central limit theorem for dependent vector-valued processes under the stated conditions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The polylog dimension dependence may permit uniform inference over many coordinates in time-series settings where classical bounds would explode with d.
  • Similar techniques could be tested on other forms of weak dependence beyond Markov chains to check whether the n^{-1/4} rate persists.
  • The Kolmogorov metric focus suggests direct applicability to probability statements involving rectangles or orthants in high dimensions.

Load-bearing premise

The martingale difference sequences obey moment or boundedness conditions sufficient to control their dependence and tail behavior.

What would settle it

A concrete martingale difference sequence satisfying the moment conditions for which the Kolmogorov distance to its Gaussian limit fails to improve at rate n^{-1/4} or worse than polylog(d).

read the original abstract

We derive new Gaussian approximation for finite martingale difference sequences in $\mathbb{R}^d$ with respect to the Kolmogorov distance. Under appropriate conditions, our bounds exhibit a dependence of order $n^{-1/4}$ on the length of the sequence and of order $\mathrm{polylog}(d)$ on the dimension. As an application, we derive a high-dimensional Berry-Esseen bound over hyper-rectangles for martingale sequences generated from Markov chains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper derives new Berry-Esseen bounds for multivariate martingale difference sequences in R^d with respect to the Kolmogorov distance. Under conditions including conditional third-moment bounds, uniform integrability of the quadratic variation, and mild non-degeneracy on the limiting covariance, the bounds achieve order n^{-1/4} dependence on sequence length n and polylog(d) on dimension d. The n^{-1/4} rate follows from a truncation argument combined with a multivariate Stein-method bound, while the dimension factor uses dyadic chaining over a net of hyper-rectangles. An application yields high-dimensional bounds over hyper-rectangles for martingales generated by Markov chains, with moment conditions verified from ergodicity assumptions.

Significance. If the stated conditions hold, the results advance Gaussian approximation theory for dependent sequences in high dimensions by providing explicit dimension dependence and a concrete Markov-chain application. The transparent trade-off for the n^{-1/4} rate and the use of Stein's method with chaining are strengths; the work supplies falsifiable rates that can be checked in applications.

major comments (1)
  1. [Theorem 2.3] Proof of Theorem 2.3: the n^{-1/4} rate is obtained by passing from the smoothed to the unsmoothed Kolmogorov distance after truncation; while the square-root factor is documented, the manuscript should explicitly state whether this rate is optimal or can be improved to n^{-1/3} under stronger moment assumptions (e.g., bounded fourth moments).
minor comments (2)
  1. [Abstract] The abstract refers to 'appropriate conditions' without listing them; adding a one-sentence summary of the key assumptions (conditional third moments, uniform integrability, non-degeneracy) would improve accessibility.
  2. [Section 4] Section 4: the verification that ergodicity implies the required uniform integrability of the quadratic variation is direct, but a short remark on the role of the moment assumptions on the chain would clarify the argument for readers.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading and the recommendation of minor revision. We address the single major comment below.

read point-by-point responses
  1. Referee: [Theorem 2.3] Proof of Theorem 2.3: the n^{-1/4} rate is obtained by passing from the smoothed to the unsmoothed Kolmogorov distance after truncation; while the square-root factor is documented, the manuscript should explicitly state whether this rate is optimal or can be improved to n^{-1/3} under stronger moment assumptions (e.g., bounded fourth moments).

    Authors: We thank the referee for this observation. The n^{-1/4} rate arises from balancing the truncation error (controlled via the conditional third-moment assumption) against the smoothing parameter used in the multivariate Stein-method bound, followed by the square-root factor when removing the smoothing to recover the Kolmogorov distance. Under the paper's stated assumptions, this rate is the one delivered by the truncation-plus-smoothing argument. Whether the rate can be improved to n^{-1/3} (or better) under stronger conditions such as bounded fourth moments would require a different technique, for instance one that avoids truncation or employs higher-order smoothing; we have not developed such an argument here. We will add a brief remark after the proof of Theorem 2.3 noting that the obtained rate is tied to the third-moment and truncation framework, and that faster rates under fourth-moment assumptions remain open for future work. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper derives its Berry-Esseen bounds for multivariate martingale difference sequences via Stein's method, truncation arguments, and dyadic chaining over hyper-rectangles, all under explicitly stated conditions such as conditional third-moment bounds and uniform integrability of quadratic variation. The n^{-1/4} rate emerges directly from the smoothing step in the Kolmogorov distance (incurring a square-root factor) and the net cardinality control in the chaining argument, without any fitted parameters renamed as predictions or self-definitional reductions. The Markov-chain application verifies the moment conditions from ergodicity assumptions with no hidden uniformity or self-citation load-bearing. No step in the claimed derivation chain reduces by construction to its inputs; the results are obtained from independent probabilistic inequalities.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard definitions and moment conditions from probability theory for martingales; no free parameters or invented entities are introduced in the abstract.

axioms (2)
  • standard math Martingale difference sequences satisfy E[X_{i+1} | past] = 0
    Core definition invoked for the central limit theorem approximation.
  • domain assumption Appropriate moment or boundedness conditions hold
    Required for the stated bounds to be valid, as noted in the abstract.

pith-pipeline@v0.9.0 · 5380 in / 1197 out tokens · 46266 ms · 2026-05-08T17:31:20.415088+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

83 extracted references · 83 canonical work pages

  1. [1]

    2026 , eprint=

    Gaussian Approximation for Asynchronous Q-learning , author=. 2026 , eprint=

  2. [2]

    Probability Theory and Related Fields , year=

    Stein's method for diffusion approximations , author=. Probability Theory and Related Fields , year=

  3. [3]

    arXiv preprint arXiv:0902.0333 , year=

    On Stein's method for multivariate normal approximation , author=. arXiv preprint arXiv:0902.0333 , year=

  4. [4]

    The Annals of Probability , volume=

    On the rate of convergence in the multivariate CLT , author=. The Annals of Probability , volume=. 1991 , publisher=

  5. [5]

    An Exposition of G

    Rabi Bhattacharya and Susan Holmes , journal=. An Exposition of G

  6. [6]

    , booktitle =

    Dvoretzky, A. , booktitle =. Asymptotic normality for sums of dependent random variables , volume =

  7. [7]

    Cattaneo and Ricardo P

    Matias D. Cattaneo and Ricardo P. Masini and William G. Underwood , title =. The Annals of Statistics , number =. 2025 , doi =

  8. [8]

    The Annals of Probability , number =

    Erich Haeusler , title =. The Annals of Probability , number =. 1988 , doi =

  9. [9]

    1972 , school =

    Rates of Convergence in the Central Limit Theorem for Dependent Variables , author =. 1972 , school =

  10. [10]

    Bulletin of Mathematical Statistics , volume=

    Rates of convergence in central limit theorem for martingale differences , author=. Bulletin of Mathematical Statistics , volume=. 1979 , publisher=

  11. [11]

    A Berry-Esseen bound of order 1/

    Wu, Songqi and Sang, Hailin and others , journal=. A Berry-Esseen bound of order 1/. 2020 , publisher=

  12. [12]

    The Annals of Probability , volume=

    A note on exact convergence rates in some martingale central limit theorems , author=. The Annals of Probability , volume=. 1996 , publisher=

  13. [13]

    Bernoulli , volume =

    On the rate of convergence in the martingale central limit theorem , author =. Bernoulli , volume =. 2013 , publisher =. doi:10.3150/12-BEJ417 , url =

  14. [14]

    Journal of Mathematical Analysis and Applications , volume =

    Fan, Xiequan , title =. Journal of Mathematical Analysis and Applications , volume =. 2019 , doi =

  15. [15]

    Some bounds on the rate of convergence in the

    Rinott, Yosef and Rotar, Vladimir I , journal=. Some bounds on the rate of convergence in the. 2000 , publisher=

  16. [16]

    Bernoulli , volume=

    Exact convergence rates in the central limit theorem for a class of martingales , author=. Bernoulli , volume=. 2007 , publisher=. doi:10.3150/07-BEJ6116 , url=

  17. [17]

    2014 , publisher=

    Martingale Limit Theory and Its Application , author=. 2014 , publisher=

  18. [18]

    From Stein's Method to Universality , year =

    Normal Approximations with Malliavin Calculus. From Stein's Method to Universality , year =

  19. [19]

    2025 , eprint=

    Gaussian Approximation and Multiplier Bootstrap for Stochastic Gradient Descent , author=. 2025 , eprint=

  20. [20]

    Berry , journal =

    Andrew C. Berry , journal =. The Accuracy of the Gaussian Approximation to the Sum of Independent Variates , urldate =

  21. [21]

    A high dimensional Central Limit Theorem for martingales, with applications to context tree models , author=

  22. [22]

    Yurinskii's Coupling for Martingales , author=

  23. [23]

    Lopes , year=

    Miles E. Lopes , year=. Central Limit Theorem and Bootstrap Approximation in High Dimensions: Near 1/. 2009.06004 , archivePrefix=

  24. [24]

    On the Maximal Perimeter of a Convex Set in R\^

    Nazarov, Fedor , booktitle=. On the Maximal Perimeter of a Convex Set in R\^. 2003 , organization=

  25. [25]

    Information and Inference: A Journal of the IMA , volume =

    Neeman, Joe and Shi, Bobby and Ward, Rachel , title =. Information and Inference: A Journal of the IMA , volume =. 2024 , month =. doi:10.1093/imaiai/iaae032 , url =

  26. [26]

    2018 , eprint=

    Regularity of solutions of the Stein equation and rates in the multivariate central limit theorem , author=. 2018 , eprint=

  27. [27]

    Electronic Communications in Probability , number =

    Roberto Oliveira , title =. Electronic Communications in Probability , number =. 2010 , doi =

  28. [28]

    Machine learning , volume=

    Learning to predict by the methods of temporal differences , author=. Machine learning , volume=. 1988 , publisher=

  29. [29]

    2009 , url=

    TRACE INEQUALITIES AND QUANTUM ENTROPY: An introductory course , author=. 2009 , url=

  30. [30]

    Journal of Mathematical Physics , volume =

    Mean Entropy of States in Quantum-Statistical Mechanics , author =. Journal of Mathematical Physics , volume =. 1968 , doi =

  31. [31]

    and Barto, Andrew G

    Sutton, Richard S. and Barto, Andrew G. , biburl =. Reinforcement Learning: An Introduction , url =

  32. [32]

    Operations Research , volume=

    Is Q-learning minimax optimal? a tight sample complexity analysis , author=. Operations Research , volume=. 2024 , publisher=

  33. [33]

    2024 , eprint=

    Statistical Inference for Temporal Difference Learning with Linear Function Approximation , author=. 2024 , eprint=

  34. [34]

    High-probability sample complexities for policy evaluation with linear function approximation , year=

    Li, Gen and Wu, Weichen and Chi, Yuejie and Ma, Cong and Rinaldo, Alessandro and Wei, Yuting , journal=. High-probability sample complexities for policy evaluation with linear function approximation , year=

  35. [35]

    2018 , eprint=

    Multivariate approximations in Wasserstein distance by Stein's method and Bismut's formula , author=. 2018 , eprint=

  36. [36]

    2018 , eprint=

    A Matrix Expander Chernoff Bound , author=. 2018 , eprint=

  37. [37]

    2019 , eprint=

    Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT , author=. 2019 , eprint=

  38. [38]

    Journal of Machine Learning Research , volume=

    Hoeffding’s Inequality for General Markov Chains and Its Applications to Statistical Learning , author=. Journal of Machine Learning Research , volume=. 2021 , url=

  39. [39]

    Conference on Learning Theory , pages=

    On linear stochastic approximation: Fine-grained Polyak-Ruppert and non-asymptotic concentration , author=. Conference on Learning Theory , pages=. 2020 , organization=

  40. [40]

    Statistical inference for model parameters in stochastic gradient descent , author=

  41. [41]

    On quantitative bounds in the mean martingale central limit theorem , volume=

    Röllin, Adrian , year=. On quantitative bounds in the mean martingale central limit theorem , volume=. doi:10.1016/j.spl.2018.03.004 , journal=

  42. [42]

    2023 , eprint=

    Finite-Sample Analysis of the Temporal Difference Learning , author=. 2023 , eprint=

  43. [43]

    High-dimensional CLT for Sums of Non-degenerate Random Vectors: n^

    Arun Kumar Kuchibhotla and Alessandro Rinaldo , year=. High-dimensional CLT for Sums of Non-degenerate Random Vectors: n^. 2009.13673 , archivePrefix=

  44. [44]

    2024 , eprint=

    Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning , author=. 2024 , eprint=

  45. [45]

    arXiv preprint arXiv:2207.04475 , year=

    Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation , author=. arXiv preprint arXiv:2207.04475 , year=

  46. [46]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    Finite sample analyses for TD (0) with function approximation , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  47. [47]

    Probability Theory and Related Fields , volume=

    Some dimension-free features of vector-valued martingales , author=. Probability Theory and Related Fields , volume=. 1991 , publisher=

  48. [48]

    2021 , eprint=

    Multivariate normal approximation on the Wiener space: new bounds in the convex distance , author=. 2021 , eprint=

  49. [49]

    Tropp, Joel A. , year=. User-Friendly Tail Bounds for Sums of Random Matrices , volume=. Foundations of Computational Mathematics , publisher=. doi:10.1007/s10208-011-9099-z , number=

  50. [50]

    The total variation distance between high-dimensional gaussians with the same mean,

    The total variation distance between high-dimensional Gaussians , author=. arXiv preprint arXiv:1810.08693 , year=

  51. [51]

    Conference on Learning Theory , pages=

    Finite-time error bounds for linear stochastic approximation andtd learning , author=. Conference on Learning Theory , pages=. 2019 , organization=

  52. [52]

    2015 , eprint=

    Multivariate Normal Approximation by Stein's Method: The Concentration Inequality Approach , author=. 2015 , eprint=

  53. [53]

    2012 , eprint=

    Chernoff-Hoeffding Bounds for Markov Chains: Generalized and Simplified , author=. 2012 , eprint=

  54. [54]

    A Berry–Esseen bound for vector-valued martingales , volume=

    Kojevnikov, Denis and Song, Kyungchul , year=. A Berry–Esseen bound for vector-valued martingales , volume=. doi:10.1016/j.spl.2022.109448 , journal=

  55. [55]

    2024 , eprint=

    Revisiting Step-Size Assumptions in Stochastic Approximation , author=. 2024 , eprint=

  56. [56]

    2024 , eprint=

    Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning , author=. 2024 , eprint=

  57. [57]

    Asymptotic and finite-sample properties of estimators based on stochastic gradients , author=

  58. [58]

    A multivariate Berry–Esseen theorem with explicit constants , volume=

    Raič, Martin , year=. A multivariate Berry–Esseen theorem with explicit constants , volume=. Bernoulli , publisher=. doi:10.3150/18-bej1072 , number=

  59. [59]

    2013 , publisher=

    Markov chains: Gibbs fields, Monte Carlo simulation, and queues , author=. 2013 , publisher=

  60. [60]

    2020 , eprint=

    A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices , author=. 2020 , eprint=

  61. [61]

    SIAM journal on control and optimization , volume=

    Acceleration of stochastic approximation by averaging , author=. SIAM journal on control and optimization , volume=. 1992 , publisher=

  62. [62]

    The Annals of Mathematical Statistics , pages=

    On a stochastic approximation method , author=. The Annals of Mathematical Statistics , pages=. 1954 , publisher=

  63. [63]

    The annals of mathematical statistics , pages=

    A stochastic approximation method , author=. The annals of mathematical statistics , pages=. 1951 , publisher=

  64. [64]

    ESAIM: Probability and Statistics , volume=

    Central limit theorems for stochastic approximation with controlled Markov chain dynamics , author=. ESAIM: Probability and Statistics , volume=. 2015 , publisher=

  65. [65]

    Efficient estimators from a slowly converging robbins-monro process , author =

  66. [66]

    The Annals of Mathematical Statistics , pages=

    On asymptotic normality in stochastic approximation , author=. The Annals of Mathematical Statistics , pages=. 1968 , publisher=

  67. [67]

    IEEE Transactions on Automatic Control , volume=

    An Analysis of Temporal-Difference Learning with Function Approximation , author=. IEEE Transactions on Automatic Control , volume=

  68. [68]

    Operations Research , volume=

    A finite time analysis of temporal difference learning with linear function approximation , author=. Operations Research , volume=. 2021 , publisher=

  69. [69]

    2017 , publisher=

    Dynamic programming and optimal control (4th edition) , author=. 2017 , publisher=

  70. [70]

    Advances in Neural Information Processing Systems , volume=

    Interval estimation for reinforcement-learning algorithms in continuous-state domains , author=. Advances in Neural Information Processing Systems , volume=

  71. [71]

    Proceedings of the AAAI Conference on Artificial Intelligence , volume=

    Bootstrapping with models: Confidence intervals for off-policy evaluation , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=

  72. [72]

    International Conference on Machine Learning , pages=

    Bootstrapping fitted q-evaluation for off-policy inference , author=. International Conference on Machine Learning , pages=. 2021 , organization=

  73. [73]

    Journal of the American Statistical Association , pages=

    Online bootstrap inference for policy evaluation in reinforcement learning , author=. Journal of the American Statistical Association , pages=. 2022 , publisher=

  74. [74]

    Online statistical inference for nonlinear stochastic approximation with Markovian data.arXiv preprint arXiv:2302.07690, 2023a

    Online Statistical Inference for Nonlinear Stochastic Approximation with Markovian Data , author=. arXiv preprint arXiv:2302.07690 , year=

  75. [75]

    Advances in Neural Information Processing Systems , volume=

    A Statistical Online Inference Approach in Averaged Stochastic Approximation , author=. Advances in Neural Information Processing Systems , volume=

  76. [76]

    arXiv preprint arXiv:2102.04923 , year=

    Berry--Esseen Bounds for Multivariate Nonlinear Statistics with Applications to M-estimators and Stochastic Gradient Descent Algorithms , author=. arXiv preprint arXiv:2102.04923 , year=

  77. [77]

    Annals of Applied Probability , volume =

    High-dimensional central limit theorems by Stein's method , author =. Annals of Applied Probability , volume =

  78. [78]

    2004 , eprint=

    Exact convergence rates in the central limit theorem for a class of martingales , author=. 2004 , eprint=

  79. [79]

    2025 , eprint=

    Uncertainty quantification for Markov chains with application to temporal difference learning , author=. 2025 , eprint=

  80. [80]

    Advances in neural information processing systems , volume=

    Non-asymptotic analysis of stochastic approximation algorithms for machine learning , author=. Advances in neural information processing systems , volume=

Showing first 80 references.