pith. sign in

arxiv: 2307.11632 · v3 · submitted 2023-07-21 · 🧮 math.PR · math.OA

Sharp concentration for sums of matrices with Markovian dependence through universality

Pith reviewed 2026-05-24 07:24 UTC · model grok-4.3

classification 🧮 math.PR math.OA
keywords random matricesMarkov chainsmatrix concentrationuniversalityspectral propertiesψ-mixingBoolean cumulants
0
0 comments X

The pith

Sums of matrices from ψ-mixing Markov chains concentrate like Gaussians.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a universality principle showing that sums of random matrices generated by a ψ-mixing Markov chain have spectral properties similar to those of a Gaussian matrix with the same mean and covariance. This allows the use of recent Gaussian concentration results to obtain sharp bounds in the Markovian setting. The approach relies on Boolean cumulants and a change-of-measure argument to overcome limitations of classical cumulants for dependent variables.

Core claim

A sum of random matrices generated by a ψ-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure, enabling sharp concentration inequalities.

What carries the argument

Nonasymptotic universality principle for Markov-dependent matrix sums, established via Boolean cumulants and change-of-measure argument.

Load-bearing premise

The random matrices are generated by a ψ-mixing Markov chain.

What would settle it

A concrete ψ-mixing Markov chain for which the matrix sum's largest eigenvalue or spectral norm deviates from the Gaussian prediction by more than the claimed error term at finite dimensions.

Figures

Figures reproduced from arXiv: 2307.11632 by Alexander Van Werde, Jaron Sanders.

Figure 1
Figure 1. Figure 1: The bars display the empirical singular value distribution of the recentered and nor [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: This figure displays a permutation ρ ∈ S15. Every part in the induced partition of runs πρ here corresponds to a connected component of the solid blue lines by way of the y-values in this connected component. The displayed permutation is given by (ρ(1), ρ(2), . . ., ρ(15)) = (1, 14, 11, 2, 5, 6, 12, 4, 9, 10, 8, 13, 15, 3, 7). The induced partition of runs is given by πρ = {{1, 14}, {11}, {2, 5, 6, 12}, {4… view at source ↗
Figure 3
Figure 3. Figure 3: These figures visualize the transformation [PITH_FULL_IMAGE:figures/full_fig_p030_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: For the sake of simplicity, we assume in all results of this section that [PITH_FULL_IMAGE:figures/full_fig_p036_4.png] view at source ↗
read the original abstract

We prove that a sum of random matrices generated by a $\psi$-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure. This nonasymptotic universality principle enables sharp concentration inequalities when combined with recent advances in the Gaussian literature. We illustrate the theory with examples, showing how it enables polynomial dimensional improvements relative to previous Markovian matrix concentration results when applied to Wigner-type matrices, and how one can recover sharp limiting values for a model used to study spectral clustering techniques. A key challenge in the proof is that techniques based only on classical cumulants, which can be used when summands are independent, are not sufficient on their own for efficient estimates in a Markovian setting. Our approach exploits Boolean cumulants and a change--of--measure argument.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that a sum of random matrices generated by a ψ-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure. This nonasymptotic universality principle is established using Boolean cumulants and a change-of-measure argument, enabling sharp concentration inequalities. The theory is illustrated with examples on Wigner-type matrices, showing polynomial dimensional improvements, and on a model for spectral clustering techniques.

Significance. If the central result holds, the paper offers a valuable contribution to random matrix theory by extending universality and concentration results from the independent or Gaussian case to Markov-dependent settings. The approach circumvents issues with classical cumulants in dependent environments. The applications demonstrate practical improvements over prior Markovian matrix concentration bounds.

minor comments (2)
  1. [Abstract] The phrase 'recent advances in the Gaussian literature' lacks specific references; these should be cited explicitly in the introduction.
  2. The definition and properties of ψ-mixing should be recalled or referenced in the main text for readers unfamiliar with the concept.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their careful reading and positive evaluation of the manuscript. The recommendation for minor revision is noted, and we will incorporate any editorial suggestions in the revised version.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper establishes a non-asymptotic universality result mapping spectral properties of ψ-mixing Markovian matrix sums to a Gaussian counterpart via Boolean cumulants combined with a change-of-measure argument. This is presented as an independent proof technique that extends beyond classical cumulants, without any reduction of the central claim to fitted parameters, self-definitional relations, or load-bearing self-citations. The derivation chain is self-contained against external benchmarks in the Gaussian literature and does not invoke uniqueness theorems or ansatzes from prior author work as the sole justification. No steps meet the criteria for circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The result rests on the ψ-mixing assumption for the Markov chain and the applicability of Boolean cumulants to the matrix setting; no free parameters or invented entities are mentioned.

axioms (1)
  • domain assumption The Markov chain satisfies the ψ-mixing condition
    Required for the dependence structure in the universality statement.

pith-pipeline@v0.9.0 · 5659 in / 1055 out tokens · 21945 ms · 2026-05-24T07:24:27.352438+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages · 1 internal anchor

  1. [1]

    AMOSOVA , N. N. (2002). Necessity of the Cramer, Linnik, and Statuleviˇcius Conditions for Large-Deviation Probabilities. Journal of Mathematical Sciences

  2. [2]

    ANDERSON , G. W. , GUIONNET , A. and ZEITOUNI , O. (2010). An introduction to random matrices. Cam- bridge University Press

  3. [3]

    , HASEBE , T

    ARIZMENDI , O. , HASEBE , T. , LEHNER , F. and VARGAS , C. (2015). Relations between cumulants in noncommutative probability. Advances in Mathematics

  4. [4]

    , GAÏFFAS , S

    BACRY, E. , GAÏFFAS , S. and MUZY, J. F. (2018). Concentration inequalities for matrix martingales in continuous time. Probability Theory and Related Fields

  5. [5]

    and SILVERSTEIN , J

    BAI, Z. and SILVERSTEIN , J. W. (2010). Spectral analysis of large dimensional random matrices. Springer

  6. [6]

    BAI, Z. D. and YIN, Y. Q. (1988). Necessary and sufficient conditions for almost sure convergence of the largest eigenvalue of a Wigner matrix. The Annals of Probability

  7. [7]

    BANDEIRA , A. S. , BOEDIHARDJO , M. T. and VAN HANDEL , R. (2023). Matrix concentration inequalities and free probability. Inventiones Mathematicae

  8. [8]

    , MERLEVÈDE , F

    BANNA , M. , MERLEVÈDE , F. and YOUSSEF , P. (2016). Bernstein-type inequality for a class of dependent random matrices. Random Matrices: Theory and Applications

  9. [9]

    B ILLINGSLEY , P. (2008). Probability and measure, Third ed. John Wiley & Sons

  10. [10]

    BLUM , J. R. , HANSON , D. L. and KOOPMANS , L. H. (1963). On the Strong Law of Large numbers for a Class of Stochastic Processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete

  11. [11]

    and VAN HANDEL , R

    BRAILOVSKAYA , T. and VAN HANDEL , R. (2022). Universality and sharp matrix concentration inequalities. arXiv:2201.05142

  12. [12]

    and MANDJES , M

    DIONIGI , P., GARLASCHELLI , D., DEN HOLLANDER , F. and MANDJES , M. (2021). A spectral signature of breaking of ensemble equivalence for constrained random graphs. Electronic Communications in Probability

  13. [13]

    and S CHUBERT , K

    D ÖRING , H., J ANSEN , S. and S CHUBERT , K. (2022). The method of cumulants for the normal approxima- tion. Probability Surveys

  14. [14]

    D UDLEY , R. M. (2002). Real analysis and probability. Cambridge University Press

  15. [15]

    EATON, M. L. (1983). Multivariate statistics: a vector space approach.Institute of Mathematical Statistics

  16. [16]

    F ÉRAY, V. (2018). Weighted dependency graphs. Electronic Journal of Probability

  17. [17]

    G ANTMACHER , F. R. (2000). The theory of matrices 2. American Mathematical Society

  18. [18]

    , LEE, Y

    GARG , A. , LEE, Y. T., SONG , Z. and SRIVASTAVA, N. (2018). A Matrix Expander Chernoff Bound. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC’18)

  19. [19]

    Bernoulli Random Matrices

    GUIONNET , A. Bernoulli Random Matrices. Proceedings of the 8th European Congress in Mathematics. To appear, arXiv:2112.05506

  20. [20]

    and LI, Y

    HAN, F. and LI, Y. (2020). Moment bounds for large autocovariance matrices under dependence. Journal of Theoretical Probability

  21. [21]

    HELTON , J. W. , FAR, R. R. and SPEICHER , R. (2007). Operator-valued semicircular elements: solving a quadratic matrix equation with positivity constraints. International Mathematics Research Notices

  22. [22]

    H ORN , R. A. and J OHNSON , C. R. (2012). Matrix analysis, second ed. Cambridge University Press

  23. [23]

    and LINNIK , Y

    IBRAGIMOV , I. and LINNIK , Y. V. (1975). Independent and stationary sequences of random variables . Wolters, Noordhoff Publishing Groningen. 40

  24. [24]

    JANSON , S. (2020). Asymptotic normality in random graphs with given vertex degrees. Random Structures & Algorithms

  25. [25]

    , PROUTIÈRE , A

    JEDRA , Y., LEE, J. , PROUTIÈRE , A. and YUN, S. Y. (2023). Nearly Optimal Latent State Decoding in Block MDPs. In International Conference on Artificial Intelligence and Statistics

  26. [26]

    LEHNER , F. (1999). Computing norms of free operators with matrix coefficients. American Journal of Mathematics

  27. [27]

    LEVIN , D. A. , PERES , Y. and WILMER , E. L. (2017). Markov Chains and Mixing Times: Second Edition. American Mathematical Society

  28. [28]

    LUST-PIQUARD , F. (1986). Inégalités de Khintchine dans Cp (1 < p < ∞). Comptes Rendus de l’Académie des Sciences Paris

  29. [29]

    , JORDAN , M

    MACKEY, L. , JORDAN , M. I. , CHEN , R. Y., FARRELL , B. and TROPP, J. A. (2014). Matrix concentration inequalities via the method of exchangeable pairs. The Annals of Probability

  30. [30]

    , SHI, B

    NEEMAN , J. , SHI, B. and WARD , R. (2023). Concentration inequalities for sums of Markov dependent random matrices. arXiv:2303.02150

  31. [31]

    N ELSON , E. (1974). Notes on non-commutative integration. Journal of Functional Analysis

  32. [32]

    and SPEICHER , R

    NICA , A. and SPEICHER , R. (2006). Lectures on the Combinatorics of Free Probability. Cambridge Univer- sity Press

  33. [33]

    OLIVEIRA , R. I. (2009). Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv preprint arXiv:0911.0600

  34. [34]

    , MACKEY, L

    PAULIN , D. , MACKEY, L. and TROPP, J. A. (2016). Efron–Stein inequalities for random matrices. The Annals of Probability

  35. [35]

    P ISIER , G. (2003). Introduction to operator space theory. Cambridge University Press

  36. [36]

    and TANG , J

    QIU, J., WANG , C., LIAO, B., PENG , R. and TANG , J. (2020). A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices. Advances in Neural Information Processing Systems

  37. [37]

    R OBBINS , H. (1955). A remark on Stirling’s formula. The American mathematical monthly

  38. [38]

    , PROUTIÈRE , A

    SANDERS , J. , PROUTIÈRE , A. and YUN, S. Y. (2020). Clustering in block Markov chains. The Annals of Statistics

  39. [39]

    and SENEN -C ERDA , A

    SANDERS , J. and SENEN -C ERDA , A. (2023). Spectral norm bounds for block Markov chain random matri- ces. Stochastic Processes and their Applications

  40. [40]

    and VAN WERDE , A

    SANDERS , J. and VAN WERDE , A. (2023). Singular value distribution of dense random matrices with block Markovian dependence. Stochastic Processes and their Applications

  41. [41]

    and STATULEVI ˇCIUS , V

    SAULIS , L. and STATULEVI ˇCIUS , V. A. (1991). Limit theorems for large deviations. Springer Science & Business Media

  42. [42]

    S PEED , T. P. (1983). Cumulants and partition lattices 1. Australian Journal of Statistics

  43. [43]

    and W OROUDI , R

    S PEICHER , R. and W OROUDI , R. (1997). Boolean convolutions. Fields Institute Communications

  44. [44]

    (1969, 1970)

    STATULEVI ˇCIUS , V. (1969, 1970). Limit theorems for the sums of random variables related to a Markov chain. I, II, III. Lithuanian Mathematical Journal

  45. [45]

    T AO, T. (2012). Topics in random matrix theory. American Mathematical Soc

  46. [46]

    TROPP, J. A. (2011). Freedman’s inequality for matrix martingales.Electronic Communications in Proba- bility

  47. [47]

    TROPP, J. A. (2018). Second-order matrix concentration inequalities. Applied and Computational Harmonic Analysis

  48. [48]

    and SANDERS , J

    VAN WERDE , A., SENEN CERDA , A., KOSMELLA , G. and SANDERS , J. (2022). Detection and Evaluation of Clusters within Sequential Data. arXiv preprint arXiv:2210.01679

  49. [49]

    , GIRARD , S

    VLADIMIROVA , M. , GIRARD , S. , NGUYEN , H. and ARBEL , J. (2020). Sub-Weibull distributions: General- izing sub-Gaussian and sub-Exponential properties to heavier tailed distributions. Stat

  50. [50]

    and WANG , M

    ZHANG , A. and WANG , M. (2019). Spectral state compression of Markov processes. IEEE Transactions on Information Theory

  51. [51]

    and W EI, H

    Z HANG , H. and W EI, H. (2022). Sharper sub-Weibull concentrations. Mathematics. 41 APPENDIX A: PROOFS FOR THE MATRIX SERIES MODEL We here provide the details for the proofs of the universality of tracial moments and the variance thereof in the matrix series model. Recall that a sketch of the proof approach was provided in Section 5.5. A.1. Universality ...

  52. [52]

    Consequently, using a similar estimate for the case t1 = t2 + 1, d n X |t1−t2|=1 Et1,t2(i, j, k, m) ≤ ( 2 d3 c2 2, if j ̸= k and i ̸= m, 2 d2 (c3 + 1 d c2

    if j = k. Consequently, using a similar estimate for the case t1 = t2 + 1, d n X |t1−t2|=1 Et1,t2(i, j, k, m) ≤ ( 2 d3 c2 2, if j ̸= k and i ̸= m, 2 d2 (c3 + 1 d c2

  53. [53]

    (D.13) Finally, consider the case where |t1 − t2| > 1

    if j = k or i = m. (D.13) Finally, consider the case where |t1 − t2| > 1. Then, if t2 > t1 + 1 d n n−3X t1=1 X t2>t1+1 Et1,t2(i, j, k, m) = d n n−2X s=2 (n − 1 − s)E1,1+s(i, j, k, m) (D.14) 52 = dP(E1 = (i, j)) n−2X s=2 n − 1 − s n P(E1+s = (k, m) | Z2 = j) − P(E1+s = (k, m)) = dP(E1 = (i, j))P(Z2+s = m | Z1+s = k) × n−2X s=2 n − 1 − s n P(Z1+s = k | Z2 =...