pith. sign in

arxiv: 1906.09639 · v1 · pith:JTHWLGGQnew · submitted 2019-06-23 · 🧮 math.ST · stat.TH

Asymptotic joint distribution of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model

Pith reviewed 2026-05-25 17:32 UTC · model grok-4.3

classification 🧮 math.ST stat.TH
keywords spiked population modelextreme eigenvaluestrace statisticsample covariance matrixjoint limiting distributionJohnson-Graybill testshigh-dimensional statisticsproportional growth regime
0
0 comments X

The pith

The joint limiting distribution of extreme eigenvalues and trace is derived for the generalized spiked population model with proportional growth.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes the joint asymptotic distribution of the extreme eigenvalues and the trace of the sample covariance matrix when the population covariance follows a generalized spiked model. A reader would care because these quantities underpin tests for the presence of signals or structure in high-dimensional multivariate data. The result is applied to Johnson-Graybill-type tests after adding a higher-order correction that reduces finite-sample bias. The proof proceeds by establishing joint convergence for the associated extreme spectral processes and linear spectral statistics.

Core claim

In the generalized spiked population model with a fixed number of spikes, under the proportional growth regime where dimension p and sample size n both tend to infinity with p/n approaching a constant, the suitably centered and scaled vector of extreme eigenvalues together with the trace converges in distribution to a multivariate normal limit whose covariance structure is explicitly determined from the model parameters.

What carries the argument

The joint asymptotic behavior of two classes of spectral processes, one for the extreme eigenvalues and one for the linear spectral statistics (trace).

If this is right

  • The joint distribution supplies the covariance terms needed for higher-order bias correction in Johnson-Graybill-type tests.
  • The tests for signals become more accurate in finite samples once the dependence between the extreme eigenvalues and the trace is accounted for.
  • The same joint convergence applies to any fixed number of the largest eigenvalues together with the trace.
  • The approach extends classical marginal limit results by treating the extreme and trace statistics simultaneously.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same technique could be used to obtain joint limits involving the trace and other smooth functionals of the spectrum.
  • In factor models or PCA applications the corrected critical values might reduce over- or under-detection of signals when p and n are comparable.
  • Numerical verification of the convergence rate would require generating data exactly under the spiked model and checking the empirical joint distribution against the theoretical normal.

Load-bearing premise

The data exactly follow the generalized spiked population model with a fixed number of spikes and the dimension and sample size grow proportionally as stated.

What would settle it

Monte Carlo simulations drawn from the generalized spiked model with known parameters would show that the normalized extreme eigenvalues and trace fail to jointly approach the predicted multivariate normal distribution.

read the original abstract

This paper studies the joint limiting behavior of extreme eigenvalues and trace of large sample covariance matrix in a generalized spiked population model, where the asymptotic regime is such that the dimension and sample size grow proportionally. The form of the joint limiting distribution is applied to conduct Johnson-Graybill-type tests, a family of approaches testing for signals in a statistical model. For this, higher order correction is further made, helping alleviate the impact of finite-sample bias. The proof rests on determining the joint asymptotic behavior of two classes of spectral processes, corresponding to the extreme and linear spectral statistics respectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper derives the joint limiting distribution of the extreme eigenvalues and the trace of the sample covariance matrix under a generalized spiked population model in the proportional growth regime (p/n → γ). The derivation proceeds via the joint asymptotics of extreme spectral processes and linear spectral statistics. The limiting joint law is then applied to Johnson-Graybill-type tests for the presence of signals, with an additional higher-order correction term introduced to mitigate finite-sample bias.

Significance. If the joint limiting result holds, the work supplies a technically useful extension of the spiked-model literature by coupling extreme-eigenvalue and trace statistics, which directly improves the calibration of signal-detection procedures. The explicit higher-order bias correction is a practical contribution that addresses a common limitation of first-order asymptotic approximations in moderate-dimensional settings.

minor comments (3)
  1. The abstract states that the proof rests on joint asymptotics of two classes of spectral processes but supplies no outline of the key steps or error bounds; a brief roadmap in §2 or §3 would help readers verify that the stated joint limit follows from the model assumptions.
  2. Notation for the generalized spiked model (population eigenvalues, spike locations, and the limiting ratio γ) should be collected in a single display early in the paper to avoid repeated re-definition.
  3. In the application section, the higher-order correction term is introduced without an explicit statement of the order of the remainder; adding this would clarify the improvement over the first-order approximation.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. No major comments appear in the report, so there are no specific points requiring point-by-point rebuttal. We will incorporate any minor editorial or presentational improvements in the revised version.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained asymptotic analysis

full rationale

The paper derives the joint limiting distribution of extreme eigenvalues and trace under the generalized spiked population model via joint asymptotics of extreme and linear spectral processes in the proportional regime. This is a standard first-principles random matrix theory argument with no reduction of any claimed result to fitted parameters, self-definitions, or load-bearing self-citations. The model assumptions and proof strategy are externally consistent with the literature on spiked covariance models; the result does not reduce to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit list of fitted parameters, background axioms, or new postulated entities; the generalized spiked model is invoked but its precise assumptions are not enumerated.

pith-pipeline@v0.9.0 · 5622 in / 1043 out tokens · 29275 ms · 2026-05-25T17:32:29.046059+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 1 internal anchor

  1. [1]

    Bai, Z., Ding, X. (2012). Estimation of spiked eigenvalues in spiked models. Random Matrices: Theory and Applications , 1(2), 1150011

  2. [2]

    Bai, Z., Silverstein, J. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. The Annals of Probability , 32(1A), 553-605

  3. [3]

    Bai, Z., Yao, J. (2008). Central limit theorems for eigenvalues in a spiked population model. Annales de l'Institut Henri Poincar\'e, Probabilit\'es et Statistiques , 44(3), 447-474

  4. [4]

    Bai, Z., Yao, J. (2012). On sample eigenvalues in a generalized spiked population model. Journal of Multivariate Analysis , 106, 167-177

  5. [5]

    Baik, J., Silverstein, J. W. (2006). Eigenvalues of large sample covariance matrices of spiked population models. Journal of Multivariate Analysis , 97(6), 1382-1408

  6. [6]

    Baik, J., Arous, G., P\' e ch\' e , S. (2005). Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability , 33(5), 1643-1697

  7. [7]

    Bhattacharjee, M., and Bose, A. (2016). Large sample behaviour of high dimensional autocovariance matrices. The Annals of Statistics , 44(2), 598-628

  8. [8]

    Bianchi, P., Debbah, M., Maida, M., Najim, J. (2011). Performance of statistical tests for single-source detection using random matrix theory. IEEE Transactions on Information Theory , 57(4), 2400-2419

  9. [9]

    (2015) CLT for linear spectral statistics of normalized sample covariance matrices with the dimension much larger than the sample size

    Chen, B., Pan, G. (2015) CLT for linear spectral statistics of normalized sample covariance matrices with the dimension much larger than the sample size. Bernoulli , 21(2), 1089-1133

  10. [10]

    Choi, Y., Taylor, J., and Tibshirani, R. (2017). Selecting the number of principal components: Estimation of the true rank of a noisy matrix. The Annals of Statistics , 45(6), 2590-2617

  11. [11]

    Chow, T., Teugels, J. (1978). The sum and the maximum of iid random variables. In Proceedings of the 2nd Prague Symposium on Asymptotic Statistics , 81-92

  12. [12]

    A., Heiny, J., Mikosch, T., and Xie, X

    Davis, R. A., Heiny, J., Mikosch, T., and Xie, X. (2016). Extreme value analysis for the sample autocovariance matrices of heavy-tailed multivariate time series. Extremes , 19(3), 517-547

  13. [13]

    Deo, R. (2016). On the Tracy–Widom approximation of studentized extreme eigenvalues of Wishart matrices. Journal of Multivariate Analysis , 147, 265-272

  14. [14]

    Hsing, T. (1995). A note on the asymptotic independence of the sum and maximum of strongly mixing stationary random variables. The Annals of Probability , 23(2), 938-947

  15. [15]

    Johnson, D., Graybill, F. (1972). An analysis of a two-way model with interaction and no replication. Journal of the American Statistical Association , 67(340), 862-868

  16. [16]

    Johnstone, I. (2001). On the distribution of the largest eigenvalue in principal components analysis. The Annals of Statistics , 29(2), 295-327

  17. [17]

    Knowles, A., Yin, J. (2017). Anisotropic local laws for random matrices. Probability Theory and Related Fields , 169(1), 257-352

  18. [18]

    Kritchman, S., Nadler, B. (2008). Determining the number of components in a factor model from limited noisy data. Chemometrics and Intelligent Laboratory Systems , 94(1), 19-32

  19. [19]

    Ma, Z. (2012). Accuracy of the Tracy-Widom limits for the extreme eigenvalues in white Wishart matrices. Bernoulli , 18(1), 322-359

  20. [20]

    Nadler, B. (2011). On the distribution of the ratio of the largest eigenvalue to the trace of a Wishart matrix. Journal of Multivariate Analysis , 102(2), 363-371

  21. [21]

    J., and Hallin, M

    Onatski, A., Moreira, M. J., and Hallin, M. (2013). Asymptotic power of sphericity tests for high-dimensional data. The Annals of Statistics , 41(3), 1204-1231

  22. [22]

    Paul, D. (2007). Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica , 17(4), 1617-1642

  23. [23]

    Paul, D., Aue, A. (2014). Random matrix theory in statistics: A review. Journal of Statistical Planning and Inference , 150, 1-29

  24. [24]

    Silverstein, J., Choi, S. (1995). Analysis of the limiting spectral distribution of large dimensional random matrices. Journal of Multivariate Analysis , 54(2), 295-309

  25. [25]

    Wang, W., Fan, J. (2017). Asymptotics of empirical eigenstructure for high dimensional spiked covariance model. The Annals of Statistics , 45(3), 1342-1374

  26. [26]

    W., Yao, J

    Wang, Q., Silverstein, J. W., Yao, J. (2014a). A note on the CLT of the LSS for sample covariance matrix from a spiked population model. Journal of Multivariate Analysis , 130, 194-207

  27. [27]

    Wang, Q., Su, Z., Yao, J. (2014b). Joint CLT for several random sesquilinear forms with applications to large-dimensional spiked population models. Electronic Journal of Probability , 19(103), 1-28

  28. [28]

    Yao, J., Zheng, S., Bai. Z. (2015). Large Sample Covariance Matrices and High-Dimensional Data Analysis. Cambridge University Press

  29. [29]

    Zheng, S. (2012). Central limit theorems for linear spectral statistics of large dimensional F-matrices. Annales de l'Institut Henri Poincar\'e, Probabilit\'es et Statistiques , 48(2), 444-476

  30. [30]

    Zheng, S., Bai, Z., Yao, J. (2015). Substitution principle for CLT of linear spectral statistics of high-dimensional sample covariance matrices with applications to hypothesis testing. The Annals of Statistics , 43(2), 546-591

  31. [31]

    Zheng, S., Bai, Z., Yao, J., Zhu, H. (2016). CLT for linear spectral statistics of large dimensional sample covariance matrices with dependent data. preprint arXiv:1708.03749