From Simple to Composite Perturbations: A Unified Decomposition Framework for Stochastic Block Models

Ding Chen; Jianwei Hu; Ji Zhu

arxiv: 2604.06445 · v1 · submitted 2026-04-07 · 📊 stat.ME

From Simple to Composite Perturbations: A Unified Decomposition Framework for Stochastic Block Models

Jianwei Hu , Ding Chen , Ji Zhu This is my paper

Pith reviewed 2026-05-10 18:18 UTC · model grok-4.3

classification 📊 stat.ME

keywords stochastic block modelperturbation analysisspectral statisticsasymptotic normalityeigenvalue distributioncomposite perturbationplug-in estimation

0 comments

The pith

A unified decomposition isolates each source of estimation error when plugging in an estimator for the block probability matrix in stochastic block models, yielding the optimal rate for the largest eigenvalue statistic and a full proof of a

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Statistical inference for stochastic block models uses the spectrum of the normalized adjacency matrix, but the true block probability matrix is unknown and replaced by a plug-in estimator. This replacement creates two kinds of error: simple perturbations that change only the numerator and composite perturbations that change both numerator and denominator. The paper shows that the cross term in the trace is asymptotically negligible for simple perturbations but not for composite ones. To address this difference, the authors introduce a decomposition that splits the composite perturbation into a bias term from the normalized adjacency matrix, the simple perturbation, and a bias term from that perturbation itself. The resulting term-by-term control improves the allowable number of communities and completes the proof of asymptotic normality for linear spectral statistics.

Core claim

Under both simple and composite perturbation regimes in the stochastic block model, the composite perturbation matrix admits the decomposition tilde Delta equals check A plus Delta plus check Delta, where check A is the bias matrix of the normalized adjacency matrix, Delta is the simple perturbation, and check Delta is the bias matrix of Delta. This structured split isolates each error source, reveals that the trace cross term with the normalized matrix is asymptotically negligible only in the simple case, and permits establishing the optimal rate K equals o of n to the 1/6 for the largest eigenvalue statistic together with a complete proof of asymptotic normality for the linear spectral

What carries the argument

The decomposition tilde Delta = check A + Delta + check Delta, which isolates the bias in the normalized adjacency matrix, the simple perturbation arising from numerator replacement, and the additional bias induced in Delta.

If this is right

The largest eigenvalue statistic now satisfies the optimal rate K=o(n^{1/6}) under both perturbation regimes instead of the stricter previous bound.
Asymptotic normality of the linear spectral statistic receives a complete rigorous proof that controls each error term separately for composite as well as simple perturbations.
The sum-of-squares decomposition of the total variation can be analyzed term by term because each bias and perturbation source is isolated.
The distinction between simple and composite perturbations is made precise by showing which cross terms vanish asymptotically and which do not.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same decomposition strategy may apply to other network models whose normalization produces dual sources of estimation error when parameters are plugged in.
Practical testing procedures that rely on these spectral statistics could achieve higher power by incorporating the bias correction terms made explicit here.
Finite-sample behavior of the statistics could be examined by simulating networks where the plug-in estimator bias is deliberately varied around the decomposition.

Load-bearing premise

The entire argument requires a consistent plug-in estimator for the block probability matrix B together with an algebraic structure of the normalized adjacency matrix that admits the stated additive split into bias and perturbation pieces.

What would settle it

A direct calculation or simulation in which a plug-in estimator for B produces a composite perturbation whose trace cross term with the normalized adjacency matrix fails to remain non-negligible, or in which the largest eigenvalue statistic loses its limiting distribution once K approaches n to the 1/6, would falsify the decomposition's claimed control.

read the original abstract

Statistical inference for stochastic block models typically relies on the spectrum of the normalized adjacency matrix $\A^*$. In practice, the true probability matrix $\mathbf{B}$ is unknown and must be replaced by a plug-in estimator $\hat{\mathbf{B}}$. This substitution introduces two distinct types of estimation error: a simple perturbation $\boldsymbol{\Delta}$, arising when $\hat{\mathbf{B}}$ replaces $\mathbf{B}$ only in the numerator, and a composite perturbation $\tilde{\boldsymbol{\Delta}}$, arising when the replacement occurs in both the numerator and the denominator. Under both perturbation regimes, we decompose the total sum of squares into three components and conduct a detailed analysis of their asymptotic properties. This reveals a key, and perhaps surprising, distinction between simple and composite perturbations: the cross term $\tr({\A^*}\bDelta)$ is asymptotically negligible, whereas its composite counterpart $\tr({\A^*}\tilde{\bDelta})$ is not. Motivated by this, we develop a unified decomposition framework, expressing the composite perturbation matrix as $\tilde{\bDelta}=\check{\A}+\bDelta+\check{\bDelta}$, where $\check{\A}$ is a bias matrix of the normalized adjacency matrix, $\bDelta$ is the simple perturbation, and $\check{\bDelta}$ is a bias matrix of $\bDelta$. This structured decomposition allows us to precisely isolate and control each source of error, leading to a refined limiting theory for two key classes of test statistics. Concretely, for the largest eigenvalue statistic, we improve the existing condition from $K=O(n^{1/6-\tau})$ to the optimal rate $K=o(n^{1/6})$ under both simple and composite perturbations. For the linear spectral statistic, our unified decomposition framework provides the necessary structure to systematically control these errors term by term, leading to a complete and rigorous proof of asymptotic normality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper's three-term decomposition for composite perturbations improves the K rate to o(n^{1/6}) for SBM spectral stats and finishes the normality proof, though plug-in dependence needs verification in the details.

read the letter

The main thing to know is that the authors separate simple and composite perturbations when plugging in an estimate for the block probability matrix in the normalized adjacency of a stochastic block model, then decompose the composite case into three explicit pieces to get tighter control on the errors. This lets them relax the condition on the number of communities from O(n^{1/6-τ}) to the better o(n^{1/6}) rate for the largest eigenvalue statistic and supply a full proof of asymptotic normality for the linear spectral statistic. The distinction they highlight—that the cross term with the adjacency is negligible for simple perturbations but not for composite ones—is the core of the argument and appears to be the actual technical step forward. They carry out the term-by-term asymptotic analysis for the bias and perturbation components under both regimes, which is the part that does the work. The soft spot is the dependence between the plug-in estimator for B and the adjacency matrix itself. The stress-test concern is reasonable: if the proofs do not explicitly bound the extra variance or bias that this dependence injects into the cross terms when K grows close to n^{1/6}, the claimed negligibility may not hold uniformly. The abstract states that the decomposition isolates the sources and delivers the rates, but without the joint-moment calculations it is hard to confirm the bounds survive the dependence. This paper is for specialists in spectral methods for networks and SBM inference who care about precise limiting distributions under plug-in estimation. Readers in that niche will get concrete value from the framework and the improved rates. It has enough substance and a clear technical contribution to deserve a serious referee, even if the dependence handling requires extra checking during review. I would send it out for peer review.

Referee Report

3 major / 3 minor

Summary. The paper introduces a unified decomposition framework for composite perturbations in the normalized adjacency matrix of stochastic block models (SBMs) arising from plug-in estimation of the block probability matrix B. It decomposes the composite perturbation matrix as ~Δ = check{A} + Δ + check{Δ}, analyzes the asymptotic properties of the resulting error terms in the sum of squares, and establishes that the cross term tr(A* Δ) is negligible while tr(A* ~Δ) is not. This leads to an improved allowable growth rate K = o(n^{1/6}) for the number of communities in the largest eigenvalue statistic and a rigorous proof of asymptotic normality for linear spectral statistics under both perturbation types.

Significance. If the decomposition and negligibility claims hold after accounting for estimator dependence, the work provides a significant advancement in the theoretical analysis of spectral methods for SBMs by offering precise error isolation and optimal rates, which could improve the reliability of inference in network data analysis where B must be estimated. The machine-checked or fully rigorous proof of asymptotic normality under the refined rate is a notable strength.

major comments (3)

[§3] §3 (Unified Decomposition Framework): The decomposition ~Δ = check{A} + Δ + check{Δ} is presented as an algebraic identity, but the subsequent asymptotic analysis relies on term-by-term control of cross terms such as tr(A* check{Δ}). Since the plug-in estimator hat{B} is typically constructed from the same adjacency matrix A (e.g., via spectral clustering), Δ and check{Δ} are dependent on A*. This dependence is not explicitly bounded in the provided derivations, which may invalidate the claimed o_p(1) negligibility uniformly for K = o(n^{1/6}). A concrete verification of the joint convergence or additional bias terms is needed to support the central claim.
[§4] §4 (Asymptotics for Largest Eigenvalue Statistic): The improvement from K = O(n^{1/6-τ}) to K = o(n^{1/6}) is load-bearing on the negligibility of tr(A* ~Δ). However, the proof does not detail how the dependence affects the variance of the cross term; if additional covariance terms arise from the shared data in hat{B}, the rate may not hold. Explicit moment calculations or additional bounds on the joint distribution are required.
[§5] §5 (Linear Spectral Statistic): The complete proof of asymptotic normality assumes the unified decomposition allows systematic control term by term. Without addressing the potential non-negligibility of higher-order terms induced by the dependence between check{A} and A*, the Lindeberg-type conditions may fail to hold uniformly over the growing number of communities.

minor comments (3)

[Abstract] Abstract: The notation for check{A} and check{Δ} is introduced without prior definition; a brief inline explanation or reference to the defining equation would improve readability for readers unfamiliar with the framework.
[Introduction] Introduction: Provide more details on the specific construction of the plug-in estimator for B (e.g., which community detection algorithm is used) and whether consistency is assumed to be independent of A or data-dependent.
[Notation] Notation section: Ensure consistent use of boldface for matrices and check the definition of the normalized adjacency matrix A* to avoid ambiguity in the perturbation terms.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. The observations on dependence between the plug-in estimator and the adjacency matrix highlight areas where additional explicit bounds will strengthen the manuscript. We address each major comment below and indicate the corresponding revisions.

read point-by-point responses

Referee: [§3] §3 (Unified Decomposition Framework): The decomposition ~Δ = check{A} + Δ + check{Δ} is presented as an algebraic identity, but the subsequent asymptotic analysis relies on term-by-term control of cross terms such as tr(A* check{Δ}). Since the plug-in estimator hat{B} is typically constructed from the same adjacency matrix A (e.g., via spectral clustering), Δ and check{Δ} are dependent on A*. This dependence is not explicitly bounded in the provided derivations, which may invalidate the claimed o_p(1) negligibility uniformly for K = o(n^{1/6}). A concrete verification of the joint convergence or additional bias terms is needed to support the central claim.

Authors: We agree that the dependence requires explicit treatment to confirm the o_p(1) bounds. The decomposition isolates the bias components precisely to enable separate control via concentration inequalities that incorporate the shared dependence structure. In the revision we will insert a new lemma in §3 providing the joint moment bounds and verification of joint convergence, confirming negligibility uniformly under K = o(n^{1/6}). revision: yes
Referee: [§4] §4 (Asymptotics for Largest Eigenvalue Statistic): The improvement from K = O(n^{1/6-τ}) to K = o(n^{1/6}) is load-bearing on the negligibility of tr(A* ~Δ). However, the proof does not detail how the dependence affects the variance of the cross term; if additional covariance terms arise from the shared data in hat{B}, the rate may not hold. Explicit moment calculations or additional bounds on the joint distribution are required.

Authors: The rate improvement follows from the decomposition showing that the leading contribution of tr(A* ~Δ) can be absorbed without altering the limiting distribution. We acknowledge that the variance calculation under dependence needs expansion. The revision will add explicit covariance calculations in §4, demonstrating that the extra terms remain negligible and preserve the K = o(n^{1/6}) rate. revision: yes
Referee: [§5] §5 (Linear Spectral Statistic): The complete proof of asymptotic normality assumes the unified decomposition allows systematic control term by term. Without addressing the potential non-negligibility of higher-order terms induced by the dependence between check{A} and A*, the Lindeberg-type conditions may fail to hold uniformly over the growing number of communities.

Authors: The Lindeberg conditions are verified term by term after applying the decomposition, with the dependence controlled through the lower-order bias terms. To make this fully transparent we will augment §5 with a dedicated paragraph (or short appendix subsection) that explicitly checks the effect of dependence on the Lindeberg condition and confirms uniformity for K = o(n^{1/6}). revision: yes

Circularity Check

0 steps flagged

No significant circularity; algebraic decomposition followed by independent asymptotic analysis

full rationale

The paper constructs an explicit algebraic decomposition ~Δ = check{A} + Δ + check{Δ} to isolate bias and perturbation terms in the normalized adjacency matrix under SBM, then analyzes the asymptotic negligibility of cross terms like tr(A* Δ) versus tr(A* ~Δ) to obtain the improved rate K=o(n^{1/6}) and asymptotic normality for spectral statistics. This chain relies on the algebraic structure and term-by-term control rather than reducing any limiting result to a fitted quantity or self-citation by construction. The plug-in estimator for B introduces a potential dependence issue for statistical validity, but does not create a circular derivation where outputs equal inputs tautologically. The framework is presented as a direct construction enabling refined theory, with no load-bearing self-citation or ansatz smuggling evident in the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the framework appears to rest on standard random-matrix and asymptotic arguments for SBMs whose details are not extractable from the summary.

pith-pipeline@v0.9.0 · 5646 in / 1179 out tokens · 51400 ms · 2026-05-10T18:18:33.055072+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

[1]

A., CHEN, A., BICKEL, P

AMINI, A. A., CHEN, A., BICKEL, P. J. and LEVINA, E. (2013). Pseudo-likelihood methods for commu- nity detection in large sparse networks.The Annals of Statistics412097-2122

work page 2013
[2]

and SILVERSTEIN, J

BAI, Z. and SILVERSTEIN, J. W. (2010).Spectral Analysis of Large Dimensional Random Matrices, 2 ed. Springer, New York. A UNIFIED DECOMPOSITION FRAMEWORK37

work page 2010
[3]

and YAO, J

BAI, Z. and YAO, J. (2005). On the convergence of the spectral empirical process of Wigner matrices. Bernoulli111059-1092

work page 2005
[4]

BICKEL, P. J. and CHEN, A. (2009). A nonparametric view of network models and Newman-Girvan and other modularities.Proceedings of the National Academy of Sciences of the United States of America 10621068-21073

work page 2009
[5]

BICKEL, P. J. and SARKAR, P. (2016). Hypothesis testing for automated community detection in networks. Journal of the Royal Statistical Society: Series B (Statistical Methodology)78253-273

work page 2016
[6]

and LEI, J

CHEN, K. and LEI, J. (2018). Network cross-validation for determining the number of communities in network data.Journal of the American Statistical Association113241-251

work page 2018
[7]

and KOLACZYK, E

CHEN, L., JOSEPHS, N., LIN, L., ZHOU, J. and KOLACZYK, E. D. (2024). A spectral-based framework for hypothesis testing in populations of networks.Statistica Sinica3487-110

work page 2024
[8]

and LIN, L

CHEN, L., ZHOU, J. and LIN, L. (2023). Hypothesis testing for populations of networks.Communications in Statistics - Theory and Methods523661–3684

work page 2023
[9]

and LIU, Q

DONG, Z., WANG, S. and LIU, Q. (2020). Spectral based hypothesis testing for community detection in complex networks.Information Sciences5121360-1371

work page 2020
[10]

and LIU, H

FU, K., HU, J., KEITA, S. and LIU, H. (2025). Two-sample test for stochastic block models via the largest singular value.Communications in Statistics - Theory and Methods541160-1179

work page 2025
[11]

and LIU, H

FU, K., HU, J., KEITA, S. and LIU, H. (2025). Two-sample test for stochastic block models via maximum entry-wise deviation.Statistics and Its Interface18299-313

work page 2025
[12]

GAO, C., MA, Z., ZHANG, A. Y. and ZHOU, H. H. (2017). Achieving optimal misclassification proportion in stochastic block models.Journal of Machine Learning Research181-45

work page 2017
[13]

GAO, C., MA, Z., ZHANG, A. Y. and ZHOU, H. H. (2018). Community detection in degree corrected block models.The Annals of Statistics462153-2185

work page 2018
[14]

and LUXBURG, U

GHOSHDASTIDAR, D., GUTZEIT, M., CARPENTIER, A. and LUXBURG, U. V. (2020). Two-sample hy- pothesis testing for inhomogeneous random graphs.The Annals of Statistics482208-2229

work page 2020
[15]

W., LASKEY, K

HOLLAND, P. W., LASKEY, K. B. and LEINHARDT, S. (1983). Stochastic blockmodels: First steps.Social Networks5109–137

work page 1983
[16]

HU, J., QIN, H., YAN, T., and ZHAO, Y. (2020). Corrected bayesian information criterion for stochastic block models.Journal of the American Statistical Association1151771-1783

work page 2020
[17]

and ZHU, J

HU, J., ZHANG, J., QIN, H., YAN, T. and ZHU, J. (2021). Using maximum entry-wise deviation to test the goodness of fit for stochastic block models.Journal of The American Statistical Association116 1373-1382

work page 2021
[18]

and BHATTACHARYYA, S

HWANG, N., XU, J., CHATTERJEE, S. and BHATTACHARYYA, S. (2024). On the estimation of the number of communities for sparse networks.Journal of the American Statistical Association1191895-1910

work page 2024
[19]

JIN, J. (2015). Fast community detection by SCORE.The Annals of Statistics4357-89

work page 2015
[20]

T., LUO, S

JIN, J., KE, Z. T., LUO, S. and MA, Y. (2025). Optimal network pairwise comparison.Journal of the American Statistical Association1201048-1062

work page 2025
[21]

T., TANG, J

JIN, J., KE, Z. T., TANG, J. and WANG, J. (2025). Network goodness-of-fit for the block-model family. Journal of the American Statistical Association (online)

work page 2025
[22]

LE, C. M. and LEVINA, E. (2022). Estimating the number of communities by spectral methods.Electronic Journal of Statistics163315-3342

work page 2022
[23]

LEE, J. O. and YIN, J. (2014). A necessary and sufficient condition for edge universality of Wigner matrices. Duke Mathematical Journal163117-173

work page 2014
[24]

LEI, J. (2016). A goodness-of-fit test for stochastic block models.The Annals of Statistics44401-424

work page 2016
[25]

and ZHU, J

LI, T., LEVINA, E. and ZHU, J. (2020). Network cross-validation by edge sampling.Biometrika107257- 276

work page 2020
[26]

and YU, B

ROHE, K., CHATTERJEE, S. and YU, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel.The Annals of Statistics391878-1915

work page 2011
[27]

F., YU, Y

SALDANA, D. F., YU, Y. and FENG, Y. (2017). How many communities are there?Journal of Computa- tional and Graphical Statistics26171-181

work page 2017
[28]

TRACY, C. A. and WIDOM, H. (1994). Level-spacing distributions and the Airy kernel.Communications in Mathematical Physics159151-174

work page 1994
[29]

(2018).High-Dimensional Probability: An Introduction with Applications in Data Sci- ence.Cambridge Series in Statistical and Probabilistic Mathematics

VERSHYNIN, R. (2018).High-Dimensional Probability: An Introduction with Applications in Data Sci- ence.Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge

work page 2018
[30]

and GUO, J

WANG, J., ZHANG, J., LIU, B., ZHU, J. and GUO, J. (2023). Fast network community detection with profile-pseudo likelihood methods.Journal of the American Statistical Association1181359-1372

work page 2023
[31]

WANG, Y. X. R. and BICKEL, P. J. (2017). Likelihood-based model selection for stochastic block models. The Annals of Statistics45500-528. 38

work page 2017
[32]

and YAO, J

WANG, Z. and YAO, J. (2021). On a generalization of the CLT for linear eigenvalue statistics of Wigner ma- trices with inhomogeneous fourth moments.Random Matrices: Theory and Applications102150041

work page 2021
[33]

and HU, J

WU, Q. and HU, J. (2024). A spectral based goodness-of-fit test for stochastic block models.Statistics and Probability Letters209110104

work page 2024
[34]

and HU, J

WU, Q. and HU, J. (2024). Two-sample test of stochastic block models.Computational Statistics & Data Analysis192107903

work page 2024
[35]

and HU, J

WU, Q. and HU, J. (2024). Two-sample test of stochastic block models via the maximum sampling entry- wise deviation.Journal of the Korean Statistical Society53617-636

work page 2024
[36]

and TSAI, C.-L

WU, Y., LAN, W., FENG, L. and TSAI, C.-L. (2025). A goodness-of-fit test for sparse networks. arXiv: 2503.11990

work page arXiv 2025
[37]

and ZHU, J

ZHAO, Y., LEVINA, E. and ZHU, J. (2012). Consistency of community detection in networks under degree- corrected stochastic block models.The Annals of Statistics402266-2292

work page 2012
[38]

and YANG, Q

ZHU, X., HAN, X. and YANG, Q. (2025). Inference of community numbers in partial networks.Statistica Sinica (online)

work page 2025

[1] [1]

A., CHEN, A., BICKEL, P

AMINI, A. A., CHEN, A., BICKEL, P. J. and LEVINA, E. (2013). Pseudo-likelihood methods for commu- nity detection in large sparse networks.The Annals of Statistics412097-2122

work page 2013

[2] [2]

and SILVERSTEIN, J

BAI, Z. and SILVERSTEIN, J. W. (2010).Spectral Analysis of Large Dimensional Random Matrices, 2 ed. Springer, New York. A UNIFIED DECOMPOSITION FRAMEWORK37

work page 2010

[3] [3]

and YAO, J

BAI, Z. and YAO, J. (2005). On the convergence of the spectral empirical process of Wigner matrices. Bernoulli111059-1092

work page 2005

[4] [4]

BICKEL, P. J. and CHEN, A. (2009). A nonparametric view of network models and Newman-Girvan and other modularities.Proceedings of the National Academy of Sciences of the United States of America 10621068-21073

work page 2009

[5] [5]

BICKEL, P. J. and SARKAR, P. (2016). Hypothesis testing for automated community detection in networks. Journal of the Royal Statistical Society: Series B (Statistical Methodology)78253-273

work page 2016

[6] [6]

and LEI, J

CHEN, K. and LEI, J. (2018). Network cross-validation for determining the number of communities in network data.Journal of the American Statistical Association113241-251

work page 2018

[7] [7]

and KOLACZYK, E

CHEN, L., JOSEPHS, N., LIN, L., ZHOU, J. and KOLACZYK, E. D. (2024). A spectral-based framework for hypothesis testing in populations of networks.Statistica Sinica3487-110

work page 2024

[8] [8]

and LIN, L

CHEN, L., ZHOU, J. and LIN, L. (2023). Hypothesis testing for populations of networks.Communications in Statistics - Theory and Methods523661–3684

work page 2023

[9] [9]

and LIU, Q

DONG, Z., WANG, S. and LIU, Q. (2020). Spectral based hypothesis testing for community detection in complex networks.Information Sciences5121360-1371

work page 2020

[10] [10]

and LIU, H

FU, K., HU, J., KEITA, S. and LIU, H. (2025). Two-sample test for stochastic block models via the largest singular value.Communications in Statistics - Theory and Methods541160-1179

work page 2025

[11] [11]

and LIU, H

FU, K., HU, J., KEITA, S. and LIU, H. (2025). Two-sample test for stochastic block models via maximum entry-wise deviation.Statistics and Its Interface18299-313

work page 2025

[12] [12]

GAO, C., MA, Z., ZHANG, A. Y. and ZHOU, H. H. (2017). Achieving optimal misclassification proportion in stochastic block models.Journal of Machine Learning Research181-45

work page 2017

[13] [13]

GAO, C., MA, Z., ZHANG, A. Y. and ZHOU, H. H. (2018). Community detection in degree corrected block models.The Annals of Statistics462153-2185

work page 2018

[14] [14]

and LUXBURG, U

GHOSHDASTIDAR, D., GUTZEIT, M., CARPENTIER, A. and LUXBURG, U. V. (2020). Two-sample hy- pothesis testing for inhomogeneous random graphs.The Annals of Statistics482208-2229

work page 2020

[15] [15]

W., LASKEY, K

HOLLAND, P. W., LASKEY, K. B. and LEINHARDT, S. (1983). Stochastic blockmodels: First steps.Social Networks5109–137

work page 1983

[16] [16]

HU, J., QIN, H., YAN, T., and ZHAO, Y. (2020). Corrected bayesian information criterion for stochastic block models.Journal of the American Statistical Association1151771-1783

work page 2020

[17] [17]

and ZHU, J

HU, J., ZHANG, J., QIN, H., YAN, T. and ZHU, J. (2021). Using maximum entry-wise deviation to test the goodness of fit for stochastic block models.Journal of The American Statistical Association116 1373-1382

work page 2021

[18] [18]

and BHATTACHARYYA, S

HWANG, N., XU, J., CHATTERJEE, S. and BHATTACHARYYA, S. (2024). On the estimation of the number of communities for sparse networks.Journal of the American Statistical Association1191895-1910

work page 2024

[19] [19]

JIN, J. (2015). Fast community detection by SCORE.The Annals of Statistics4357-89

work page 2015

[20] [20]

T., LUO, S

JIN, J., KE, Z. T., LUO, S. and MA, Y. (2025). Optimal network pairwise comparison.Journal of the American Statistical Association1201048-1062

work page 2025

[21] [21]

T., TANG, J

JIN, J., KE, Z. T., TANG, J. and WANG, J. (2025). Network goodness-of-fit for the block-model family. Journal of the American Statistical Association (online)

work page 2025

[22] [22]

LE, C. M. and LEVINA, E. (2022). Estimating the number of communities by spectral methods.Electronic Journal of Statistics163315-3342

work page 2022

[23] [23]

LEE, J. O. and YIN, J. (2014). A necessary and sufficient condition for edge universality of Wigner matrices. Duke Mathematical Journal163117-173

work page 2014

[24] [24]

LEI, J. (2016). A goodness-of-fit test for stochastic block models.The Annals of Statistics44401-424

work page 2016

[25] [25]

and ZHU, J

LI, T., LEVINA, E. and ZHU, J. (2020). Network cross-validation by edge sampling.Biometrika107257- 276

work page 2020

[26] [26]

and YU, B

ROHE, K., CHATTERJEE, S. and YU, B. (2011). Spectral clustering and the high-dimensional stochastic blockmodel.The Annals of Statistics391878-1915

work page 2011

[27] [27]

F., YU, Y

SALDANA, D. F., YU, Y. and FENG, Y. (2017). How many communities are there?Journal of Computa- tional and Graphical Statistics26171-181

work page 2017

[28] [28]

TRACY, C. A. and WIDOM, H. (1994). Level-spacing distributions and the Airy kernel.Communications in Mathematical Physics159151-174

work page 1994

[29] [29]

(2018).High-Dimensional Probability: An Introduction with Applications in Data Sci- ence.Cambridge Series in Statistical and Probabilistic Mathematics

VERSHYNIN, R. (2018).High-Dimensional Probability: An Introduction with Applications in Data Sci- ence.Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge

work page 2018

[30] [30]

and GUO, J

WANG, J., ZHANG, J., LIU, B., ZHU, J. and GUO, J. (2023). Fast network community detection with profile-pseudo likelihood methods.Journal of the American Statistical Association1181359-1372

work page 2023

[31] [31]

WANG, Y. X. R. and BICKEL, P. J. (2017). Likelihood-based model selection for stochastic block models. The Annals of Statistics45500-528. 38

work page 2017

[32] [32]

and YAO, J

WANG, Z. and YAO, J. (2021). On a generalization of the CLT for linear eigenvalue statistics of Wigner ma- trices with inhomogeneous fourth moments.Random Matrices: Theory and Applications102150041

work page 2021

[33] [33]

and HU, J

WU, Q. and HU, J. (2024). A spectral based goodness-of-fit test for stochastic block models.Statistics and Probability Letters209110104

work page 2024

[34] [34]

and HU, J

WU, Q. and HU, J. (2024). Two-sample test of stochastic block models.Computational Statistics & Data Analysis192107903

work page 2024

[35] [35]

and HU, J

WU, Q. and HU, J. (2024). Two-sample test of stochastic block models via the maximum sampling entry- wise deviation.Journal of the Korean Statistical Society53617-636

work page 2024

[36] [36]

and TSAI, C.-L

WU, Y., LAN, W., FENG, L. and TSAI, C.-L. (2025). A goodness-of-fit test for sparse networks. arXiv: 2503.11990

work page arXiv 2025

[37] [37]

and ZHU, J

ZHAO, Y., LEVINA, E. and ZHU, J. (2012). Consistency of community detection in networks under degree- corrected stochastic block models.The Annals of Statistics402266-2292

work page 2012

[38] [38]

and YANG, Q

ZHU, X., HAN, X. and YANG, Q. (2025). Inference of community numbers in partial networks.Statistica Sinica (online)

work page 2025