Sharp concentration for sums of matrices with Markovian dependence through universality

Alexander Van Werde; Jaron Sanders

arxiv: 2307.11632 · v3 · submitted 2023-07-21 · 🧮 math.PR · math.OA

Sharp concentration for sums of matrices with Markovian dependence through universality

Alexander Van Werde , Jaron Sanders This is my paper

Pith reviewed 2026-05-24 07:24 UTC · model grok-4.3

classification 🧮 math.PR math.OA

keywords random matricesMarkov chainsmatrix concentrationuniversalityspectral propertiesψ-mixingBoolean cumulants

0 comments

The pith

Sums of matrices from ψ-mixing Markov chains concentrate like Gaussians.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a universality principle showing that sums of random matrices generated by a ψ-mixing Markov chain have spectral properties similar to those of a Gaussian matrix with the same mean and covariance. This allows the use of recent Gaussian concentration results to obtain sharp bounds in the Markovian setting. The approach relies on Boolean cumulants and a change-of-measure argument to overcome limitations of classical cumulants for dependent variables.

Core claim

A sum of random matrices generated by a ψ-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure, enabling sharp concentration inequalities.

What carries the argument

Nonasymptotic universality principle for Markov-dependent matrix sums, established via Boolean cumulants and change-of-measure argument.

Load-bearing premise

The random matrices are generated by a ψ-mixing Markov chain.

What would settle it

A concrete ψ-mixing Markov chain for which the matrix sum's largest eigenvalue or spectral norm deviates from the Gaussian prediction by more than the claimed error term at finite dimensions.

Figures

Figures reproduced from arXiv: 2307.11632 by Alexander Van Werde, Jaron Sanders.

**Figure 2.** Figure 2: This figure displays a permutation ρ ∈ S15. Every part in the induced partition of runs πρ here corresponds to a connected component of the solid blue lines by way of the y-values in this connected component. The displayed permutation is given by (ρ(1), ρ(2), . . ., ρ(15)) = (1, 14, 11, 2, 5, 6, 12, 4, 9, 10, 8, 13, 15, 3, 7). The induced partition of runs is given by πρ = {{1, 14}, {11}, {2, 5, 6, 12}, {4… view at source ↗

**Figure 3.** Figure 3: These figures visualize the transformation [PITH_FULL_IMAGE:figures/full_fig_p030_3.png] view at source ↗

**Figure 4.** Figure 4: For the sake of simplicity, we assume in all results of this section that [PITH_FULL_IMAGE:figures/full_fig_p036_4.png] view at source ↗

read the original abstract

We prove that a sum of random matrices generated by a $\psi$-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure. This nonasymptotic universality principle enables sharp concentration inequalities when combined with recent advances in the Gaussian literature. We illustrate the theory with examples, showing how it enables polynomial dimensional improvements relative to previous Markovian matrix concentration results when applied to Wigner-type matrices, and how one can recover sharp limiting values for a model used to study spectral clustering techniques. A key challenge in the proof is that techniques based only on classical cumulants, which can be used when summands are independent, are not sufficient on their own for efficient estimates in a Markovian setting. Our approach exploits Boolean cumulants and a change--of--measure argument.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a universality result for ψ-mixing Markov matrix sums via Boolean cumulants and change of measure, which could transfer Gaussian bounds and improve dimension dependence, but the key estimates remain unshown.

read the letter

The central claim is that sums of random matrices from a ψ-mixing Markov chain have spectral properties close to a matching Gaussian, at non-asymptotic scales. This is obtained by switching to Boolean cumulants, which the authors say handle the dependence better than classical ones, plus a change-of-measure step. The examples then show polynomial dimension gains over prior Markov bounds for Wigner-type matrices and recover sharp limits in a spectral clustering model. That combination is the actual novelty here, and it targets a documented gap where independent-sum techniques stop working efficiently. No circularity or invented entities appear in the framing. The stress-test note also found no internal inconsistency in the stated reduction. The soft spot is that the abstract supplies no derivation or error control for the Boolean cumulant estimates themselves, so it is impossible to check whether the claimed sharpness survives the dependence without extra logarithmic or dimensional costs. If those bounds close cleanly in the full text, the result stands; otherwise the improvement may be more modest. This is written for people already working on non-asymptotic matrix concentration with dependence. A reader who needs better Markovian bounds or who follows universality methods would get direct value from the examples and the technique. It is worth sending to a serious referee because the target application is concrete and the method is not a routine extension.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that a sum of random matrices generated by a ψ-mixing Markov chain has similar spectral properties to a Gaussian matrix with the same mean and covariance structure. This nonasymptotic universality principle is established using Boolean cumulants and a change-of-measure argument, enabling sharp concentration inequalities. The theory is illustrated with examples on Wigner-type matrices, showing polynomial dimensional improvements, and on a model for spectral clustering techniques.

Significance. If the central result holds, the paper offers a valuable contribution to random matrix theory by extending universality and concentration results from the independent or Gaussian case to Markov-dependent settings. The approach circumvents issues with classical cumulants in dependent environments. The applications demonstrate practical improvements over prior Markovian matrix concentration bounds.

minor comments (2)

[Abstract] The phrase 'recent advances in the Gaussian literature' lacks specific references; these should be cited explicitly in the introduction.
The definition and properties of ψ-mixing should be recalled or referenced in the main text for readers unfamiliar with the concept.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their careful reading and positive evaluation of the manuscript. The recommendation for minor revision is noted, and we will incorporate any editorial suggestions in the revised version.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper establishes a non-asymptotic universality result mapping spectral properties of ψ-mixing Markovian matrix sums to a Gaussian counterpart via Boolean cumulants combined with a change-of-measure argument. This is presented as an independent proof technique that extends beyond classical cumulants, without any reduction of the central claim to fitted parameters, self-definitional relations, or load-bearing self-citations. The derivation chain is self-contained against external benchmarks in the Gaussian literature and does not invoke uniqueness theorems or ansatzes from prior author work as the sole justification. No steps meet the criteria for circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The result rests on the ψ-mixing assumption for the Markov chain and the applicability of Boolean cumulants to the matrix setting; no free parameters or invented entities are mentioned.

axioms (1)

domain assumption The Markov chain satisfies the ψ-mixing condition
Required for the dependence structure in the universality statement.

pith-pipeline@v0.9.0 · 5659 in / 1055 out tokens · 21945 ms · 2026-05-24T07:24:27.352438+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages · 1 internal anchor

[1]

AMOSOVA , N. N. (2002). Necessity of the Cramer, Linnik, and Statuleviˇcius Conditions for Large-Deviation Probabilities. Journal of Mathematical Sciences

work page 2002
[2]

ANDERSON , G. W. , GUIONNET , A. and ZEITOUNI , O. (2010). An introduction to random matrices. Cam- bridge University Press

work page 2010
[3]

, HASEBE , T

ARIZMENDI , O. , HASEBE , T. , LEHNER , F. and VARGAS , C. (2015). Relations between cumulants in noncommutative probability. Advances in Mathematics

work page 2015
[4]

, GAÏFFAS , S

BACRY, E. , GAÏFFAS , S. and MUZY, J. F. (2018). Concentration inequalities for matrix martingales in continuous time. Probability Theory and Related Fields

work page 2018
[5]

and SILVERSTEIN , J

BAI, Z. and SILVERSTEIN , J. W. (2010). Spectral analysis of large dimensional random matrices. Springer

work page 2010
[6]

BAI, Z. D. and YIN, Y. Q. (1988). Necessary and sufficient conditions for almost sure convergence of the largest eigenvalue of a Wigner matrix. The Annals of Probability

work page 1988
[7]

BANDEIRA , A. S. , BOEDIHARDJO , M. T. and VAN HANDEL , R. (2023). Matrix concentration inequalities and free probability. Inventiones Mathematicae

work page 2023
[8]

, MERLEVÈDE , F

BANNA , M. , MERLEVÈDE , F. and YOUSSEF , P. (2016). Bernstein-type inequality for a class of dependent random matrices. Random Matrices: Theory and Applications

work page 2016
[9]

B ILLINGSLEY , P. (2008). Probability and measure, Third ed. John Wiley & Sons

work page 2008
[10]

BLUM , J. R. , HANSON , D. L. and KOOPMANS , L. H. (1963). On the Strong Law of Large numbers for a Class of Stochastic Processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete

work page 1963
[11]

and VAN HANDEL , R

BRAILOVSKAYA , T. and VAN HANDEL , R. (2022). Universality and sharp matrix concentration inequalities. arXiv:2201.05142

work page arXiv 2022
[12]

and MANDJES , M

DIONIGI , P., GARLASCHELLI , D., DEN HOLLANDER , F. and MANDJES , M. (2021). A spectral signature of breaking of ensemble equivalence for constrained random graphs. Electronic Communications in Probability

work page 2021
[13]

and S CHUBERT , K

D ÖRING , H., J ANSEN , S. and S CHUBERT , K. (2022). The method of cumulants for the normal approxima- tion. Probability Surveys

work page 2022
[14]

D UDLEY , R. M. (2002). Real analysis and probability. Cambridge University Press

work page 2002
[15]

EATON, M. L. (1983). Multivariate statistics: a vector space approach.Institute of Mathematical Statistics

work page 1983
[16]

F ÉRAY, V. (2018). Weighted dependency graphs. Electronic Journal of Probability

work page 2018
[17]

G ANTMACHER , F. R. (2000). The theory of matrices 2. American Mathematical Society

work page 2000
[18]

, LEE, Y

GARG , A. , LEE, Y. T., SONG , Z. and SRIVASTAVA, N. (2018). A Matrix Expander Chernoff Bound. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC’18)

work page 2018
[19]

Bernoulli Random Matrices

GUIONNET , A. Bernoulli Random Matrices. Proceedings of the 8th European Congress in Mathematics. To appear, arXiv:2112.05506

work page arXiv
[20]

and LI, Y

HAN, F. and LI, Y. (2020). Moment bounds for large autocovariance matrices under dependence. Journal of Theoretical Probability

work page 2020
[21]

HELTON , J. W. , FAR, R. R. and SPEICHER , R. (2007). Operator-valued semicircular elements: solving a quadratic matrix equation with positivity constraints. International Mathematics Research Notices

work page 2007
[22]

H ORN , R. A. and J OHNSON , C. R. (2012). Matrix analysis, second ed. Cambridge University Press

work page 2012
[23]

and LINNIK , Y

IBRAGIMOV , I. and LINNIK , Y. V. (1975). Independent and stationary sequences of random variables . Wolters, Noordhoff Publishing Groningen. 40

work page 1975
[24]

JANSON , S. (2020). Asymptotic normality in random graphs with given vertex degrees. Random Structures & Algorithms

work page 2020
[25]

, PROUTIÈRE , A

JEDRA , Y., LEE, J. , PROUTIÈRE , A. and YUN, S. Y. (2023). Nearly Optimal Latent State Decoding in Block MDPs. In International Conference on Artificial Intelligence and Statistics

work page 2023
[26]

LEHNER , F. (1999). Computing norms of free operators with matrix coefficients. American Journal of Mathematics

work page 1999
[27]

LEVIN , D. A. , PERES , Y. and WILMER , E. L. (2017). Markov Chains and Mixing Times: Second Edition. American Mathematical Society

work page 2017
[28]

LUST-PIQUARD , F. (1986). Inégalités de Khintchine dans Cp (1 < p < ∞). Comptes Rendus de l’Académie des Sciences Paris

work page 1986
[29]

, JORDAN , M

MACKEY, L. , JORDAN , M. I. , CHEN , R. Y., FARRELL , B. and TROPP, J. A. (2014). Matrix concentration inequalities via the method of exchangeable pairs. The Annals of Probability

work page 2014
[30]

, SHI, B

NEEMAN , J. , SHI, B. and WARD , R. (2023). Concentration inequalities for sums of Markov dependent random matrices. arXiv:2303.02150

work page arXiv 2023
[31]

N ELSON , E. (1974). Notes on non-commutative integration. Journal of Functional Analysis

work page 1974
[32]

and SPEICHER , R

NICA , A. and SPEICHER , R. (2006). Lectures on the Combinatorics of Free Probability. Cambridge Univer- sity Press

work page 2006
[33]

OLIVEIRA , R. I. (2009). Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv preprint arXiv:0911.0600

work page internal anchor Pith review Pith/arXiv arXiv 2009
[34]

, MACKEY, L

PAULIN , D. , MACKEY, L. and TROPP, J. A. (2016). Efron–Stein inequalities for random matrices. The Annals of Probability

work page 2016
[35]

P ISIER , G. (2003). Introduction to operator space theory. Cambridge University Press

work page 2003
[36]

and TANG , J

QIU, J., WANG , C., LIAO, B., PENG , R. and TANG , J. (2020). A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices. Advances in Neural Information Processing Systems

work page 2020
[37]

R OBBINS , H. (1955). A remark on Stirling’s formula. The American mathematical monthly

work page 1955
[38]

, PROUTIÈRE , A

SANDERS , J. , PROUTIÈRE , A. and YUN, S. Y. (2020). Clustering in block Markov chains. The Annals of Statistics

work page 2020
[39]

and SENEN -C ERDA , A

SANDERS , J. and SENEN -C ERDA , A. (2023). Spectral norm bounds for block Markov chain random matri- ces. Stochastic Processes and their Applications

work page 2023
[40]

and VAN WERDE , A

SANDERS , J. and VAN WERDE , A. (2023). Singular value distribution of dense random matrices with block Markovian dependence. Stochastic Processes and their Applications

work page 2023
[41]

and STATULEVI ˇCIUS , V

SAULIS , L. and STATULEVI ˇCIUS , V. A. (1991). Limit theorems for large deviations. Springer Science & Business Media

work page 1991
[42]

S PEED , T. P. (1983). Cumulants and partition lattices 1. Australian Journal of Statistics

work page 1983
[43]

and W OROUDI , R

S PEICHER , R. and W OROUDI , R. (1997). Boolean convolutions. Fields Institute Communications

work page 1997
[44]

(1969, 1970)

STATULEVI ˇCIUS , V. (1969, 1970). Limit theorems for the sums of random variables related to a Markov chain. I, II, III. Lithuanian Mathematical Journal

work page 1969
[45]

T AO, T. (2012). Topics in random matrix theory. American Mathematical Soc

work page 2012
[46]

TROPP, J. A. (2011). Freedman’s inequality for matrix martingales.Electronic Communications in Proba- bility

work page 2011
[47]

TROPP, J. A. (2018). Second-order matrix concentration inequalities. Applied and Computational Harmonic Analysis

work page 2018
[48]

and SANDERS , J

VAN WERDE , A., SENEN CERDA , A., KOSMELLA , G. and SANDERS , J. (2022). Detection and Evaluation of Clusters within Sequential Data. arXiv preprint arXiv:2210.01679

work page arXiv 2022
[49]

, GIRARD , S

VLADIMIROVA , M. , GIRARD , S. , NGUYEN , H. and ARBEL , J. (2020). Sub-Weibull distributions: General- izing sub-Gaussian and sub-Exponential properties to heavier tailed distributions. Stat

work page 2020
[50]

and WANG , M

ZHANG , A. and WANG , M. (2019). Spectral state compression of Markov processes. IEEE Transactions on Information Theory

work page 2019
[51]

and W EI, H

Z HANG , H. and W EI, H. (2022). Sharper sub-Weibull concentrations. Mathematics. 41 APPENDIX A: PROOFS FOR THE MATRIX SERIES MODEL We here provide the details for the proofs of the universality of tracial moments and the variance thereof in the matrix series model. Recall that a sketch of the proof approach was provided in Section 5.5. A.1. Universality ...

work page 2022
[52]

Consequently, using a similar estimate for the case t1 = t2 + 1, d n X |t1−t2|=1 Et1,t2(i, j, k, m) ≤ ( 2 d3 c2 2, if j ̸= k and i ̸= m, 2 d2 (c3 + 1 d c2

if j = k. Consequently, using a similar estimate for the case t1 = t2 + 1, d n X |t1−t2|=1 Et1,t2(i, j, k, m) ≤ ( 2 d3 c2 2, if j ̸= k and i ̸= m, 2 d2 (c3 + 1 d c2

work page
[53]

(D.13) Finally, consider the case where |t1 − t2| > 1

if j = k or i = m. (D.13) Finally, consider the case where |t1 − t2| > 1. Then, if t2 > t1 + 1 d n n−3X t1=1 X t2>t1+1 Et1,t2(i, j, k, m) = d n n−2X s=2 (n − 1 − s)E1,1+s(i, j, k, m) (D.14) 52 = dP(E1 = (i, j)) n−2X s=2 n − 1 − s n P(E1+s = (k, m) | Z2 = j) − P(E1+s = (k, m)) = dP(E1 = (i, j))P(Z2+s = m | Z1+s = k) × n−2X s=2 n − 1 − s n P(Z1+s = k | Z2 =...

work page 2033

[1] [1]

AMOSOVA , N. N. (2002). Necessity of the Cramer, Linnik, and Statuleviˇcius Conditions for Large-Deviation Probabilities. Journal of Mathematical Sciences

work page 2002

[2] [2]

ANDERSON , G. W. , GUIONNET , A. and ZEITOUNI , O. (2010). An introduction to random matrices. Cam- bridge University Press

work page 2010

[3] [3]

, HASEBE , T

ARIZMENDI , O. , HASEBE , T. , LEHNER , F. and VARGAS , C. (2015). Relations between cumulants in noncommutative probability. Advances in Mathematics

work page 2015

[4] [4]

, GAÏFFAS , S

BACRY, E. , GAÏFFAS , S. and MUZY, J. F. (2018). Concentration inequalities for matrix martingales in continuous time. Probability Theory and Related Fields

work page 2018

[5] [5]

and SILVERSTEIN , J

BAI, Z. and SILVERSTEIN , J. W. (2010). Spectral analysis of large dimensional random matrices. Springer

work page 2010

[6] [6]

BAI, Z. D. and YIN, Y. Q. (1988). Necessary and sufficient conditions for almost sure convergence of the largest eigenvalue of a Wigner matrix. The Annals of Probability

work page 1988

[7] [7]

BANDEIRA , A. S. , BOEDIHARDJO , M. T. and VAN HANDEL , R. (2023). Matrix concentration inequalities and free probability. Inventiones Mathematicae

work page 2023

[8] [8]

, MERLEVÈDE , F

BANNA , M. , MERLEVÈDE , F. and YOUSSEF , P. (2016). Bernstein-type inequality for a class of dependent random matrices. Random Matrices: Theory and Applications

work page 2016

[9] [9]

B ILLINGSLEY , P. (2008). Probability and measure, Third ed. John Wiley & Sons

work page 2008

[10] [10]

BLUM , J. R. , HANSON , D. L. and KOOPMANS , L. H. (1963). On the Strong Law of Large numbers for a Class of Stochastic Processes. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete

work page 1963

[11] [11]

and VAN HANDEL , R

BRAILOVSKAYA , T. and VAN HANDEL , R. (2022). Universality and sharp matrix concentration inequalities. arXiv:2201.05142

work page arXiv 2022

[12] [12]

and MANDJES , M

DIONIGI , P., GARLASCHELLI , D., DEN HOLLANDER , F. and MANDJES , M. (2021). A spectral signature of breaking of ensemble equivalence for constrained random graphs. Electronic Communications in Probability

work page 2021

[13] [13]

and S CHUBERT , K

D ÖRING , H., J ANSEN , S. and S CHUBERT , K. (2022). The method of cumulants for the normal approxima- tion. Probability Surveys

work page 2022

[14] [14]

D UDLEY , R. M. (2002). Real analysis and probability. Cambridge University Press

work page 2002

[15] [15]

EATON, M. L. (1983). Multivariate statistics: a vector space approach.Institute of Mathematical Statistics

work page 1983

[16] [16]

F ÉRAY, V. (2018). Weighted dependency graphs. Electronic Journal of Probability

work page 2018

[17] [17]

G ANTMACHER , F. R. (2000). The theory of matrices 2. American Mathematical Society

work page 2000

[18] [18]

, LEE, Y

GARG , A. , LEE, Y. T., SONG , Z. and SRIVASTAVA, N. (2018). A Matrix Expander Chernoff Bound. In Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC’18)

work page 2018

[19] [19]

Bernoulli Random Matrices

GUIONNET , A. Bernoulli Random Matrices. Proceedings of the 8th European Congress in Mathematics. To appear, arXiv:2112.05506

work page arXiv

[20] [20]

and LI, Y

HAN, F. and LI, Y. (2020). Moment bounds for large autocovariance matrices under dependence. Journal of Theoretical Probability

work page 2020

[21] [21]

HELTON , J. W. , FAR, R. R. and SPEICHER , R. (2007). Operator-valued semicircular elements: solving a quadratic matrix equation with positivity constraints. International Mathematics Research Notices

work page 2007

[22] [22]

H ORN , R. A. and J OHNSON , C. R. (2012). Matrix analysis, second ed. Cambridge University Press

work page 2012

[23] [23]

and LINNIK , Y

IBRAGIMOV , I. and LINNIK , Y. V. (1975). Independent and stationary sequences of random variables . Wolters, Noordhoff Publishing Groningen. 40

work page 1975

[24] [24]

JANSON , S. (2020). Asymptotic normality in random graphs with given vertex degrees. Random Structures & Algorithms

work page 2020

[25] [25]

, PROUTIÈRE , A

JEDRA , Y., LEE, J. , PROUTIÈRE , A. and YUN, S. Y. (2023). Nearly Optimal Latent State Decoding in Block MDPs. In International Conference on Artificial Intelligence and Statistics

work page 2023

[26] [26]

LEHNER , F. (1999). Computing norms of free operators with matrix coefficients. American Journal of Mathematics

work page 1999

[27] [27]

LEVIN , D. A. , PERES , Y. and WILMER , E. L. (2017). Markov Chains and Mixing Times: Second Edition. American Mathematical Society

work page 2017

[28] [28]

LUST-PIQUARD , F. (1986). Inégalités de Khintchine dans Cp (1 < p < ∞). Comptes Rendus de l’Académie des Sciences Paris

work page 1986

[29] [29]

, JORDAN , M

MACKEY, L. , JORDAN , M. I. , CHEN , R. Y., FARRELL , B. and TROPP, J. A. (2014). Matrix concentration inequalities via the method of exchangeable pairs. The Annals of Probability

work page 2014

[30] [30]

, SHI, B

NEEMAN , J. , SHI, B. and WARD , R. (2023). Concentration inequalities for sums of Markov dependent random matrices. arXiv:2303.02150

work page arXiv 2023

[31] [31]

N ELSON , E. (1974). Notes on non-commutative integration. Journal of Functional Analysis

work page 1974

[32] [32]

and SPEICHER , R

NICA , A. and SPEICHER , R. (2006). Lectures on the Combinatorics of Free Probability. Cambridge Univer- sity Press

work page 2006

[33] [33]

OLIVEIRA , R. I. (2009). Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv preprint arXiv:0911.0600

work page internal anchor Pith review Pith/arXiv arXiv 2009

[34] [34]

, MACKEY, L

PAULIN , D. , MACKEY, L. and TROPP, J. A. (2016). Efron–Stein inequalities for random matrices. The Annals of Probability

work page 2016

[35] [35]

P ISIER , G. (2003). Introduction to operator space theory. Cambridge University Press

work page 2003

[36] [36]

and TANG , J

QIU, J., WANG , C., LIAO, B., PENG , R. and TANG , J. (2020). A Matrix Chernoff Bound for Markov Chains and Its Application to Co-occurrence Matrices. Advances in Neural Information Processing Systems

work page 2020

[37] [37]

R OBBINS , H. (1955). A remark on Stirling’s formula. The American mathematical monthly

work page 1955

[38] [38]

, PROUTIÈRE , A

SANDERS , J. , PROUTIÈRE , A. and YUN, S. Y. (2020). Clustering in block Markov chains. The Annals of Statistics

work page 2020

[39] [39]

and SENEN -C ERDA , A

SANDERS , J. and SENEN -C ERDA , A. (2023). Spectral norm bounds for block Markov chain random matri- ces. Stochastic Processes and their Applications

work page 2023

[40] [40]

and VAN WERDE , A

SANDERS , J. and VAN WERDE , A. (2023). Singular value distribution of dense random matrices with block Markovian dependence. Stochastic Processes and their Applications

work page 2023

[41] [41]

and STATULEVI ˇCIUS , V

SAULIS , L. and STATULEVI ˇCIUS , V. A. (1991). Limit theorems for large deviations. Springer Science & Business Media

work page 1991

[42] [42]

S PEED , T. P. (1983). Cumulants and partition lattices 1. Australian Journal of Statistics

work page 1983

[43] [43]

and W OROUDI , R

S PEICHER , R. and W OROUDI , R. (1997). Boolean convolutions. Fields Institute Communications

work page 1997

[44] [44]

(1969, 1970)

STATULEVI ˇCIUS , V. (1969, 1970). Limit theorems for the sums of random variables related to a Markov chain. I, II, III. Lithuanian Mathematical Journal

work page 1969

[45] [45]

T AO, T. (2012). Topics in random matrix theory. American Mathematical Soc

work page 2012

[46] [46]

TROPP, J. A. (2011). Freedman’s inequality for matrix martingales.Electronic Communications in Proba- bility

work page 2011

[47] [47]

TROPP, J. A. (2018). Second-order matrix concentration inequalities. Applied and Computational Harmonic Analysis

work page 2018

[48] [48]

and SANDERS , J

VAN WERDE , A., SENEN CERDA , A., KOSMELLA , G. and SANDERS , J. (2022). Detection and Evaluation of Clusters within Sequential Data. arXiv preprint arXiv:2210.01679

work page arXiv 2022

[49] [49]

, GIRARD , S

VLADIMIROVA , M. , GIRARD , S. , NGUYEN , H. and ARBEL , J. (2020). Sub-Weibull distributions: General- izing sub-Gaussian and sub-Exponential properties to heavier tailed distributions. Stat

work page 2020

[50] [50]

and WANG , M

ZHANG , A. and WANG , M. (2019). Spectral state compression of Markov processes. IEEE Transactions on Information Theory

work page 2019

[51] [51]

and W EI, H

Z HANG , H. and W EI, H. (2022). Sharper sub-Weibull concentrations. Mathematics. 41 APPENDIX A: PROOFS FOR THE MATRIX SERIES MODEL We here provide the details for the proofs of the universality of tracial moments and the variance thereof in the matrix series model. Recall that a sketch of the proof approach was provided in Section 5.5. A.1. Universality ...

work page 2022

[52] [52]

Consequently, using a similar estimate for the case t1 = t2 + 1, d n X |t1−t2|=1 Et1,t2(i, j, k, m) ≤ ( 2 d3 c2 2, if j ̸= k and i ̸= m, 2 d2 (c3 + 1 d c2

if j = k. Consequently, using a similar estimate for the case t1 = t2 + 1, d n X |t1−t2|=1 Et1,t2(i, j, k, m) ≤ ( 2 d3 c2 2, if j ̸= k and i ̸= m, 2 d2 (c3 + 1 d c2

work page

[53] [53]

(D.13) Finally, consider the case where |t1 − t2| > 1

if j = k or i = m. (D.13) Finally, consider the case where |t1 − t2| > 1. Then, if t2 > t1 + 1 d n n−3X t1=1 X t2>t1+1 Et1,t2(i, j, k, m) = d n n−2X s=2 (n − 1 − s)E1,1+s(i, j, k, m) (D.14) 52 = dP(E1 = (i, j)) n−2X s=2 n − 1 − s n P(E1+s = (k, m) | Z2 = j) − P(E1+s = (k, m)) = dP(E1 = (i, j))P(Z2+s = m | Z1+s = k) × n−2X s=2 n − 1 − s n P(Z1+s = k | Z2 =...

work page 2033