pith. sign in

arxiv: 2507.05084 · v2 · submitted 2025-07-07 · 💻 cs.LG · stat.ML

Distribution-dependent Generalization Bounds for Tuning Linear Regression Across Tasks

Pith reviewed 2026-05-19 06:18 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords generalization boundsmulti-task learninglinear regressionregularization tuningsub-Gaussian distributionsdistribution-dependent boundshigh-dimensional datavalidation loss
0
0 comments X

The pith

Under common distributional assumptions, tuning regularization hyperparameters in multi-task linear regression yields generalization bounds that do not worsen with feature dimension.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops generalization bounds for the validation loss when tuning L1 and L2 regularization coefficients across multiple related tasks in linear regression. These bounds depend on the data distribution and improve as the distribution becomes nicer, such as when instances are i.i.d. sub-Gaussian. This contrasts with prior uniform bounds that necessarily degrade as the feature dimension d increases. The results apply to ridge, lasso, and elastic net, and extend to a generalized ridge regression that incorporates an estimate of the mean. A sympathetic reader would care because this allows for effective hyperparameter tuning in high-dimensional settings without the bounds blowing up.

Core claim

The central claim is that distribution-dependent generalization bounds for tuning the L1 and L2 coefficients in multi-task linear regression do not get worse with increasing feature dimension d when instances within each task are i.i.d. draws from sub-Gaussian or similar distributions, and these bounds are much sharper than those from prior work for very large d. The bounds are obtained for the validation loss and also extended to a generalized form of ridge regression using mean estimates of the ground truth.

What carries the argument

Distribution-dependent generalization bounds on the validation loss for hyperparameter tuning, derived using concentration properties of sub-Gaussian random variables across tasks.

If this is right

  • The generalization error bounds stay controlled even for arbitrarily large feature dimensions.
  • Performance guarantees are tighter than uniform bounds when data is well-behaved.
  • Results cover standard regularization methods including ridge, lasso, and elastic net.
  • Tighter bounds are possible for generalized ridge regression by estimating the mean of the target distribution.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach could enable scaling multi-task learning to very high-dimensional problems in practice without bound degradation.
  • Similar distribution-dependent analyses might apply to hyperparameter tuning in other models beyond linear regression.
  • Practitioners could leverage these bounds for more reliable regularization choices in high-d regimes with structured data.

Load-bearing premise

The instances within each task must be i.i.d. draws from sub-Gaussian or analogous well-studied distribution classes.

What would settle it

Empirical observation that the generalization error of the tuned parameters increases with feature dimension d on data satisfying the i.i.d. sub-Gaussian assumption would disprove the bounds.

read the original abstract

Modern regression problems often involve high-dimensional data and a careful tuning of the regularization hyperparameters is crucial to avoid overly complex models that may overfit the training data while guaranteeing desirable properties like effective variable selection. We study the recently introduced direction of tuning regularization hyperparameters in linear regression across multiple related tasks. We obtain distribution-dependent bounds on the generalization error for the validation loss when tuning the L1 and L2 coefficients, including ridge, lasso and the elastic net. In contrast, prior work develops bounds that apply uniformly to all distributions, but such bounds necessarily degrade with feature dimension, d. While these bounds are shown to be tight for worst-case distributions, our bounds improve with the "niceness" of the data distribution. Concretely, we show that under additional assumptions that instances within each task are i.i.d. draws from broad well-studied classes of distributions including sub-Gaussians, our generalization bounds do not get worse with increasing d, and are much sharper than prior work for very large d. We also extend our results to a generalization of ridge regression, where we achieve tighter bounds that take into account an estimate of the mean of the ground truth distribution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript derives distribution-dependent generalization bounds for the validation loss when tuning L1 and L2 regularization hyperparameters (including ridge, lasso, and elastic net) in a multi-task linear regression setting. Under the explicit assumption that instances within each task are i.i.d. draws from sub-Gaussian or similar well-studied distribution classes, the bounds remain independent of feature dimension d and are sharper than prior uniform bounds for large d. The results are extended to a generalized ridge regression that incorporates an estimate of the mean of the ground truth distribution.

Significance. If the central derivations hold, the work offers a meaningful advance by replacing worst-case uniform bounds (which necessarily degrade with d) with sharper, distribution-dependent guarantees that exploit common niceness properties of real data. This is particularly relevant for practical high-dimensional multi-task regression, where hyperparameter tuning must balance overfitting and variable selection without overly pessimistic dimension dependence.

minor comments (2)
  1. [Introduction / Problem Setup] The abstract and introduction refer to 'broad well-studied classes of distributions including sub-Gaussians' without listing the precise classes for which the theorems are proved; this should be stated explicitly in the problem setup section.
  2. [Preliminaries] In the multi-task notation, the symbols for the number of tasks T and samples per task n are introduced without a dedicated table or clear cross-reference to the single-task case; this creates minor ambiguity when comparing to prior single-task bounds.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive review and recommendation for minor revision. The referee's summary accurately reflects the paper's focus on deriving distribution-dependent generalization bounds for tuning L1/L2 regularization in multi-task linear regression, which remain independent of dimension d under sub-Gaussian i.i.d. assumptions and improve upon uniform bounds for large d.

Circularity Check

0 steps flagged

No significant circularity; bounds derived from explicit distributional assumptions

full rationale

The paper derives distribution-dependent generalization bounds for tuning L1/L2 regularization in multi-task linear regression. The central argument invokes the explicit assumption that instances within each task are i.i.d. draws from sub-Gaussian or similar well-studied distribution classes; this directly yields bounds independent of dimension d that improve over uniform worst-case bounds for large d. No load-bearing step reduces by construction to a fitted parameter, self-citation chain, or renamed input. The derivation remains self-contained with independent content supplied by the stated distributional niceness and standard concentration tools.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard statistical assumptions about data distributions rather than new free parameters or invented entities.

axioms (1)
  • domain assumption Instances within each task are i.i.d. draws from sub-Gaussian or similar well-studied distribution classes
    Invoked in the abstract as the additional assumption enabling bounds that do not worsen with increasing d.

pith-pipeline@v0.9.0 · 5741 in / 1228 out tokens · 74555 ms · 2026-05-19T06:18:02.655096+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages

  1. [1]

    Adversarial learning guarantees for linear hypotheses and neural networks

    Pranjal Awasthi, Natalie Frank, and Mehryar Mohri. Adversarial learning guarantees for linear hypotheses and neural networks. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 431--441. PMLR, 13--18 Jul 2020

  2. [2]

    Data-Driven Algorithm Design (book chapter)

    Maria-Florina Balcan. Data-Driven Algorithm Design (book chapter). In Beyond Worst-Case Analysis of Algorithms, Tim Roughgarden (Ed). Cambridge University Press , 2020

  3. [3]

    Data driven semi-supervised learning

    Maria-Florina Balcan and Dravyansh Sharma. Data driven semi-supervised learning. Advances in Neural Information Processing Systems (NeurIPS), 34: 0 14782--14794, 2021

  4. [4]

    Learning accurate and interpretable decision trees

    Maria-Florina Balcan and Dravyansh Sharma. Learning accurate and interpretable decision trees. In Uncertainty in Artificial Intelligence, pages 288--307. PMLR, 2024

  5. [5]

    Learning-theoretic foundations of algorithm configuration for combinatorial partitioning problems

    Maria-Florina Balcan, Vaishnavh Nagarajan, Ellen Vitercik, and Colin White. Learning-theoretic foundations of algorithm configuration for combinatorial partitioning problems. In Conference on Learning Theory, pages 213--274. PMLR, 2017

  6. [6]

    Dispersion for data-driven algorithm design, online learning, and private optimization

    Maria-Florina Balcan, Travis Dick, and Ellen Vitercik. Dispersion for data-driven algorithm design, online learning, and private optimization. In 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), pages 603--614. IEEE, 2018 a

  7. [7]

    Data-driven clustering via parameterized L loyd's families

    Maria-Florina Balcan, Travis Dick, and Colin White. Data-driven clustering via parameterized L loyd's families. Advances in Neural Information Processing Systems (NeurIPS), 31, 2018 b

  8. [8]

    A general theory of sample complexity for multi-item profit maximization

    Maria-Florina Balcan, Tuomas Sandholm, and Ellen Vitercik. A general theory of sample complexity for multi-item profit maximization. In Proceedings of the 2018 ACM Conference on Economics and Computation, pages 173--174, 2018 c

  9. [9]

    Learning to link

    Maria-Florina Balcan, Travis Dick, and Manuel Lang. Learning to link. In International Conference on Learning Representation (ICLR), 2020 a

  10. [10]

    Learning piecewise L ipschitz functions in changing environments

    Maria-Florina Balcan, Travis Dick, and Dravyansh Sharma. Learning piecewise L ipschitz functions in changing environments. In International Conference on Artificial Intelligence and Statistics, pages 3567--3577. PMLR, 2020 b

  11. [11]

    How much data is sufficient to learn high-performing algorithms? generalization guarantees for data-driven algorithm design

    Maria-Florina Balcan, Dan DeBlasio, Travis Dick, Carl Kingsford, Tuomas Sandholm, and Ellen Vitercik. How much data is sufficient to learn high-performing algorithms? generalization guarantees for data-driven algorithm design. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), pages 919--932, 2021 a

  12. [12]

    Learning-to-learn non-convex piecewise- L ipschitz functions

    Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, and Ameet Talwalkar. Learning-to-learn non-convex piecewise- L ipschitz functions. Advances in Neural Information Processing Systems, 34: 0 15056--15069, 2021 b

  13. [13]

    Provably tuning the ElasticNet across instances

    Maria-Florina Balcan, Misha Khodak, Dravyansh Sharma, and Ameet Talwalkar. Provably tuning the ElasticNet across instances. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors, Advances in Neural Information Processing Systems, volume 35, pages 27769--27782. Curran Associates, Inc., 2022 a

  14. [14]

    Structural analysis of branch-and-cut and the learnability of G omory mixed integer cuts

    Maria-Florina Balcan, Siddharth Prasad, Tuomas Sandholm, and Ellen Vitercik. Structural analysis of branch-and-cut and the learnability of G omory mixed integer cuts. Advances in Neural Information Processing Systems (NeurIPS), 35: 0 33890--33903, 2022 b

  15. [15]

    An analysis of robustness of non- L ipschitz networks

    Maria-Florina Balcan, Avrim Blum, Dravyansh Sharma, and Hongyang Zhang. An analysis of robustness of non- L ipschitz networks. Journal of Machine Learning Research, 24 0 (98): 0 1--43, 2023 a

  16. [16]

    New bounds for hyperparameter tuning of regression problems across instances

    Maria-Florina Balcan, Anh Nguyen, and Dravyansh Sharma. New bounds for hyperparameter tuning of regression problems across instances. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 80066--80078. Curran Associates, Inc., 2023 b

  17. [17]

    Generalization guarantees for multi-item profit maximization: Pricing, auctions, and randomized mechanisms

    Maria-Florina Balcan, Tuomas Sandholm, and Ellen Vitercik. Generalization guarantees for multi-item profit maximization: Pricing, auctions, and randomized mechanisms. Operations Research, 73 0 (2): 0 648--663, 2023 c

  18. [18]

    Learning to branch: Generalization guarantees and limits of data-independent discretization

    Maria-Florina Balcan, Travis Dick, Tuomas Sandholm, and Ellen Vitercik. Learning to branch: Generalization guarantees and limits of data-independent discretization. Journal of the ACM (JACM), 2024 a

  19. [19]

    Accelerating ERM for data-driven algorithm design using output-sensitive techniques

    Maria-Florina Balcan, Christopher Seiler, and Dravyansh Sharma. Accelerating ERM for data-driven algorithm design using output-sensitive techniques. Advances in Neural Information Processing Systems, 37: 0 72648--72687, 2024 b

  20. [20]

    Algorithm configuration for structured P faffian settings

    Maria-Florina Balcan, Anh Tuan Nguyen, and Dravyansh Sharma. Algorithm configuration for structured P faffian settings. Transactions of Machine Learning Research (TMLR), 2025 a

  21. [21]

    Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function

    Maria-Florina Balcan, Anh Tuan Nguyen, and Dravyansh Sharma. Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function. arXiv preprint arXiv:2501.13734, 2025 b

  22. [22]

    Local R ademacher complexities

    Peter Bartlett, Olivier Bousquet, and Shahar Mendelson. Local R ademacher complexities. The Annals of Statistics, 33 0 (4), August 2005. ISSN 0090-5364. doi:10.1214/009053605000000282

  23. [23]

    Generalization bounds for data-driven numerical linear algebra

    Peter Bartlett, Piotr Indyk, and Tal Wagner. Generalization bounds for data-driven numerical linear algebra. In Conference on Learning Theory (COLT), pages 2013--2040. PMLR, 2022

  24. [24]

    Learning complexity of simulated annealing

    Avrim Blum, Chen Dan, and Saeed Seddighin. Learning complexity of simulated annealing. In International Conference on Artificial Intelligence and Statistics (AISTATS) , pages 1540--1548. PMLR, 2021

  25. [25]

    Learning cut generating functions for integer programming

    Hongyu Cheng and Amitabh Basu. Learning cut generating functions for integer programming. Advances in Neural Information Processing Systems, 37: 0 61455--61480, 2024

  26. [26]

    Tuning algorithmic and architectural hyperparameters in graph-based semi-supervised learning with provable guarantees

    Ally Yalei Du, Eric Huang, and Dravyansh Sharma. Tuning algorithmic and architectural hyperparameters in graph-based semi-supervised learning with provable guarantees. In The 41st Conference on Uncertainty in Artificial Intelligence, 2025

  27. [27]

    Two Modeling Strategies for Empirical Bayes Estimation

    Bradley Efron. Two Modeling Strategies for Empirical Bayes Estimation . Statistical Science, 29 0 (2): 0 285 -- 301, 2014. doi:10.1214/13-STS455

  28. [28]

    Recovery of exact sparse representations in the presence of bounded noise

    Jean-Jacques Fuchs. Recovery of exact sparse representations in the presence of bounded noise. IEEE Transactions on Information Theory, 51 0 (10): 0 3601--3608, 2005

  29. [29]

    A simulation study of some ridge estimators

    Diane Galarneau Gibbons. A simulation study of some ridge estimators. Journal of the American Statistical Association, 76 0 (373): 0 131--139, 1981

  30. [30]

    Generalized cross-validation as a method for choosing a good ridge parameter

    Gene H Golub, Michael Heath, and Grace Wahba. Generalized cross-validation as a method for choosing a good ridge parameter. Technometrics, 21 0 (2): 0 215--223, 1979

  31. [31]

    The elements of statistical learning: data mining, inference, and prediction, volume 2

    Trevor Hastie, Robert Tibshirani, Jerome H Friedman, and Jerome H Friedman. The elements of statistical learning: data mining, inference, and prediction, volume 2. Springer, 2009

  32. [32]

    On the R ademacher series

    Pawe Hitczenko and Stanis aw Kwapie \' n . On the R ademacher series. In J rgen Hoffmann-J rgensen, James Kuelbs, and Michael B. Marcus, editors, Probability in Banach Spaces, 9, pages 31--36, Boston, MA, 1994. Birkh \"a user Boston. ISBN 978-1-4612-0253-0

  33. [33]

    Ridge regression: applications to nonorthogonal problems

    Arthur E Hoerl and Robert W Kennard. Ridge regression: applications to nonorthogonal problems. Technometrics, 12 0 (1): 0 69--82, 1970

  34. [34]

    Teh, and Yihong Wu

    Soham Jana, Yury Polyanskiy, Anzo Z. Teh, and Yihong Wu. Empirical Bayes via ERM and Rademacher complexities: the P oisson model. In Gergely Neu and Lorenzo Rosasco, editors, Proceedings of Thirty Sixth Conference on Learning Theory, volume 195 of Proceedings of Machine Learning Research, pages 5199--5235. PMLR, 12--15 Jul 2023

  35. [35]

    Sample complexity of posted pricing for a single item

    Billy Jin, Thomas Kesselheim, Will Ma, and Sahil Singla. Sample complexity of posted pricing for a single item. In Advances in Neural Information Processing Systems, 2024

  36. [36]

    Learning to relax: Setting solver parameters across a sequence of linear system instances

    Mikhail Khodak, Edmond Chow, Maria-Florina Balcan, and Ameet Talwalkar. Learning to relax: Setting solver parameters across a sequence of linear system instances. The Twelfth International Conference on Learning Representations (ICLR), 2024

  37. [37]

    A flexible empirical B ayes approach to multiple linear regression and connections with penalized regression

    Youngseok Kim, Wei Wang, Peter Carbonetto, and Matthew Stephens. A flexible empirical B ayes approach to multiple linear regression and connections with penalized regression. Journal of Machine Learning Research, 25 0 (185): 0 1--59, 2024

  38. [38]

    Concentration in unbounded metric spaces and algorithmic stability

    Aryeh Kontorovich. Concentration in unbounded metric spaces and algorithmic stability. In Eric P. Xing and Tony Jebara, editors, Proceedings of the 31st International Conference on Machine Learning, volume 32 of Proceedings of Machine Learning Research, pages 28--36, Bejing, China, 22--24 Jun 2014. PMLR

  39. [39]

    Learning with square loss: Localization through offset rademacher complexity

    Tengyuan Liang, Alexander Rakhlin, and Karthik Sridharan. Learning with square loss: Localization through offset rademacher complexity. In Peter Grünwald, Elad Hazan, and Satyen Kale, editors, Proceedings of The 28th Conference on Learning Theory, volume 40 of Proceedings of Machine Learning Research, pages 1260--1285, Paris, France, 03--06 Jul 2015. PMLR

  40. [40]

    Complexity analysis of the lasso regularization path

    Julien Mairal and Bin Yu. Complexity analysis of the lasso regularization path. International Conference on Machine Learning, 2012

  41. [41]

    The benefit of multitask representation learning

    Andreas Maurer, Massimiliano Pontil, and Bernardino Romera-Paredes. The benefit of multitask representation learning. Journal of Machine Learning Research, 17 0 (81): 0 1--32, 2016

  42. [42]

    Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices

    Jaouad Mourtada. Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices . The Annals of Statistics, 50 0 (4): 0 2157 -- 2178, 2022. doi:10.1214/22-AOS2181

  43. [43]

    The B ayesian lasso

    Trevor Park and George Casella. The B ayesian lasso. Journal of the American Statistical Association, 103 0 (482): 0 681--686, 2008. doi:10.1198/016214508000000337

  44. [44]

    Excess risk bounds for multitask learning with trace norm regularization

    Massimiliano Pontil and Andreas Maurer. Excess risk bounds for multitask learning with trace norm regularization. In Shai Shalev-Shwartz and Ingo Steinwart, editors, Proceedings of the 26th Annual Conference on Learning Theory, volume 30 of Proceedings of Machine Learning Research, pages 55--76, Princeton, NJ, USA, 12--14 Jun 2013. PMLR

  45. [45]

    Samuel Robbins, Herbert E. (ed. Kotz and Norman L.) Johnson. An Empirical B ayes Approach to Statistics , pages 388--394. Springer New York, New York, NY, 1992. doi:10.1007/978-1-4612-0919-5_26

  46. [46]

    More PAC-B ayes bounds: From bounded losses, to losses with general tail behaviors, to anytime validity

    Borja Rodr \'i guez-G \'a lvez, Ragnar Thobaben, and Mikael Skoglund. More PAC-B ayes bounds: From bounded losses, to losses with general tail behaviors, to anytime validity. Journal of Machine Learning Research, 25 0 (110): 0 1--43, 2024

  47. [47]

    Generalization bound and learning methods for data-driven projections in linear programming

    Shinsaku Sakaue and Taihei Oki. Generalization bound and learning methods for data-driven projections in linear programming. Advances in Neural Information Processing Systems, 37: 0 12825--12846, 2024

  48. [48]

    Understanding Machine Learning: From Theory to Algorithms

    Shai Shalev-Shwartz and Shai Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, USA, 2014. ISBN 1107057132

  49. [49]

    No internal regret with non-convex loss functions

    Dravyansh Sharma. No internal regret with non-convex loss functions. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 14919--14927, 2024

  50. [50]

    Efficiently learning the graph for semi-supervised learning

    Dravyansh Sharma and Maxwell Jones. Efficiently learning the graph for semi-supervised learning. In Uncertainty in Artificial Intelligence, pages 1900--1910. PMLR, 2023

  51. [51]

    Offline-to-online hyperparameter transfer for stochastic bandits

    Dravyansh Sharma and Arun Suggala. Offline-to-online hyperparameter transfer for stochastic bandits. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 20362--20370, 2025

  52. [52]

    Empirical B ayes estimation: When does g -modeling beat f -modeling in theory (and in practice)?, 2024

    Yandi Shen and Yihong Wu. Empirical B ayes estimation: When does g -modeling beat f -modeling in theory (and in practice)?, 2024

  53. [53]

    The lasso problem and uniqueness

    Ryan J Tibshirani. The lasso problem and uniqueness. Electronic Journal of statistics, 7: 0 1456--1490, 2013

  54. [54]

    Solutions of ill-posed problems

    Andrey Nikolayevich Tikhonov. Solutions of ill-posed problems. VH Winston and Sons, 1977

  55. [55]

    van Wieringen

    Wessel N. van Wieringen. Lecture notes on ridge regression. arXiv preprint arXiv:2410.10572, 2023

  56. [56]

    High-dimensional probability: An introduction with applications in data science, volume 47

    Roman Vershynin. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge university press, 2018

  57. [57]

    All of Statistics

    Larry Wasserman. All of Statistics. Springer, New York, 2010

  58. [58]

    The convergence rates of empirical B ayes estimation in a multiple linear regression model

    Laisheng Wei and Shunpu Zhang. The convergence rates of empirical B ayes estimation in a multiple linear regression model. Annals of the Institute of Statistical Mathematics, 47 0 (1): 0 81--97, January 1995. ISSN 0020-3157. doi:10.1007/BF00773413

  59. [59]

    Lower bounds on the smallest eigenvalue of a sample covariance matrix

    Pavel Yaskov. Lower bounds on the smallest eigenvalue of a sample covariance matrix. Electronic Communications in Probability, 19 0 (none): 0 1 -- 10, 2014. doi:10.1214/ECP.v19-3807

  60. [60]

    The superiority of empirical B ayes estimator of parameters in linear model

    Weiping Zhang, Laisheng Wei, and Yaning Yang. The superiority of empirical B ayes estimator of parameters in linear model. Statistics & Probability Letters, 72 0 (1): 0 43--50, 2005. ISSN 0167-7152. doi:https://doi.org/10.1016/j.spl.2004.12.001