Penalized Likelihood Methods for Modeling Count Data

Akihito Kamata; Cornelis J. Potgieter; Minh Thu Bui

arxiv: 2109.14010 · v4 · submitted 2021-09-28 · 📊 stat.ME · stat.AP

Minh Thu Bui, Cornelis J. Potgieter, Akihito Kamata

Pith reviewed 2026-05-24 13:44 UTC · model grok-4.3

keywords: penalized likelihood · count data models · binomial distribution · zero-inflated models · beta-binomial · oral reading fluency · passage difficulty estimation · mean squared error

The pith

Penalized likelihood methods produce large reductions in mean squared error for estimating passage difficulty in count data models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to improve estimation of passage difficulty in oral reading fluency assessments by using penalized likelihood methods on count data models for words read incorrectly. Three models are considered: the binomial, zero-inflated binomial, and beta-binomial, each with parameters that define passage difficulty. Two penalty types are introduced to shrink estimates either toward zero or toward each other. Simulations demonstrate substantial decreases in mean squared error compared to standard maximum likelihood estimation when sample sizes per passage are moderate. This matters because better parameter estimates lead to more reliable measurement of reading skills in children using data from ten different passages.

Core claim

The central claim is that penalized likelihood estimation for the binomial, zero-inflated binomial, and beta-binomial models, applied to words-read-incorrectly (WRI) scores, yields large reductions in mean squared error for passage difficulty parameters relative to unpenalized maximum likelihood. The claim is supported by simulation, and the method is then applied to the motivating oral reading fluency dataset from fourth-grade students.
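For readers who want the three count models in concrete form, the following is a minimal sketch of their probability mass functions in their textbook parameterizations. The paper's own parameterization and its definition of passage difficulty are not given on this page, so the function names, the argument names (`pi0`, `a`, `b`), and the example values are assumptions made for illustration only.

```python
import math


def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1.0 - p) ** (n - x)


def zib_pmf(x: int, n: int, p: float, pi0: float) -> float:
    """Zero-inflated binomial: with probability pi0 the count is a structural zero."""
    base = (1.0 - pi0) * binom_pmf(x, n, p)
    return pi0 + base if x == 0 else base


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function, computed through log-gamma."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_pmf(x: int, n: int, a: float, b: float) -> float:
    """Beta-binomial: p ~ Beta(a, b), then X | p ~ Binomial(n, p)."""
    return math.comb(n, x) * math.exp(log_beta(x + a, n - x + b) - log_beta(a, b))


n = 50  # hypothetical passage length in words
models = [
    ("binomial", lambda x: binom_pmf(x, n, 0.05)),
    ("zero-inflated binomial", lambda x: zib_pmf(x, n, 0.05, 0.2)),
    ("beta-binomial", lambda x: betabinom_pmf(x, n, 1.0, 19.0)),
]
for name, pmf in models:
    total = sum(pmf(x) for x in range(n + 1))
    print(f"{name}: total probability = {total:.6f}")
```

Each function should sum to one over x = 0, ..., n, which is a quick sanity check before any of them is placed inside a likelihood.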

What carries the argument

The argument rests on penalized likelihood estimation: penalties shrink model parameters closer to zero or closer to one another so that passage difficulty, a function of the underlying count-model parameters, is estimated efficiently.
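To make the shrink-toward-zero idea concrete, here is a minimal sketch for a single passage under the plain binomial model. It assumes a squared penalty `lam * p**2`, which is an illustrative stand-in and not necessarily any of the paper's penalties (Pen1 to Pen4); the function name `penalized_binomial_fit` and the example counts are hypothetical.

```python
def penalized_binomial_fit(errors: int, words: int, lam: float, tol: float = 1e-12) -> float:
    """Maximize errors*log(p) + (words - errors)*log(1 - p) - lam*p**2 over p.

    For lam >= 0 and errors > 0 the objective is strictly concave, so its
    derivative is decreasing and the maximizer can be found by bisection
    on the sign of the derivative.
    """
    if errors == 0:
        return 0.0  # derivative is negative everywhere, so the maximum sits at p = 0

    def score(p: float) -> float:
        # Derivative of the penalized log-likelihood with respect to p.
        return errors / p - (words - errors) / (1.0 - p) - 2.0 * lam * p

    lo, hi = tol, 1.0 - tol
    if score(hi) >= 0:
        return hi  # only possible when errors == words and lam is small
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical passage: 12 words read incorrectly out of 400 attempted.
for lam in (0.0, 50.0, 500.0):
    print(lam, penalized_binomial_fit(12, 400, lam))
```

With `lam = 0` the routine recovers the ordinary maximum likelihood estimate `errors / words`; larger values of `lam` pull the estimate toward zero.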

If this is right

Penalized estimates achieve lower mean squared error than maximum likelihood in the simulation study.
The methods are applied to the real ORF data for improved analysis.
Both penalty functions serve distinct goals in regularization of the parameter estimates.
Moderate sample sizes per passage benefit from the shrinkage approach.
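The second goal, shrinking estimates toward one another, can be illustrated with the pairwise form quoted elsewhere on this page, PenL2(p) = sum_i sum_j (p_i - p_j)^2. The sketch below checks a standard algebraic identity: this pairwise sum equals 2I times the sum of squared deviations from the mean, so penalizing pairwise differences is the same as penalizing spread around the average. The function names and the example probabilities are hypothetical.

```python
def pairwise_penalty(p):
    """sum_i sum_j (p_i - p_j)^2 over all ordered pairs."""
    return sum((pi - pj) ** 2 for pi in p for pj in p)


def centered_penalty(p):
    """2 * I * sum_i (p_i - mean(p))^2, algebraically equal to the pairwise form."""
    mean = sum(p) / len(p)
    return 2 * len(p) * sum((pi - mean) ** 2 for pi in p)


p = [0.02, 0.03, 0.05, 0.04]  # hypothetical success probabilities
print(pairwise_penalty(p), centered_penalty(p))
print(pairwise_penalty([0.03] * 4))  # identical values incur no penalty
```

The penalty vanishes only when all probabilities coincide, which is why a large tuning parameter would drive the estimates toward a common value rather than toward zero.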

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The penalized approach may extend usefully to other grouped count data problems where parameters vary across groups like passages.
Exploring the impact of different penalty parameters could optimize performance further in similar settings.
Such methods might enhance fairness in educational assessments by providing more stable difficulty estimates.

Load-bearing premise

The simulation design and penalty choices accurately capture the dependence structure and variability present in the real oral reading fluency count data with moderate per-passage sample sizes.

What would settle it

Finding no meaningful reduction in mean squared error, or even higher error, in a new simulation or dataset with similar structure and sample sizes would indicate the penalized methods do not deliver the claimed improvements.
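A check of this kind can be sketched as a small Monte Carlo experiment. The code below is not the paper's simulation design: it draws true success probabilities from a scaled Beta distribution with hypothetical shape values, uses a fixed number of trials per passage, and substitutes simple linear shrinkage toward the pooled proportion for the penalized estimator. It only shows how a ratio of summed squared deviations would be computed.

```python
import random


def simulate_ratio(weight: float, n_reps: int = 200, n_passages: int = 10,
                   n_trials: int = 100, scale: float = 0.2, seed: int = 1) -> float:
    """Monte Carlo ratio: SSD of a shrunken estimator over SSD of the MLE.

    Values below 1 mean the shrunken estimator had smaller total squared error.
    """
    rng = random.Random(seed)
    ssd_mle = 0.0
    ssd_shrunk = 0.0
    for _ in range(n_reps):
        # True probabilities from a scaled Beta (shape values are hypothetical).
        truth = [scale * rng.betavariate(2.0, 2.0) for _ in range(n_passages)]
        # One binomial count per passage, simulated as a sum of Bernoulli draws.
        counts = [sum(rng.random() < p for _ in range(n_trials)) for p in truth]
        mle = [x / n_trials for x in counts]
        pooled = sum(mle) / n_passages
        # Stand-in estimator: linear shrinkage of each MLE toward the pooled mean.
        shrunk = [(1.0 - weight) * m + weight * pooled for m in mle]
        ssd_mle += sum((m - p) ** 2 for m, p in zip(mle, truth))
        ssd_shrunk += sum((s - p) ** 2 for s, p in zip(shrunk, truth))
    return ssd_shrunk / ssd_mle


for w in (0.0, 0.2, 0.5):
    print(w, simulate_ratio(w))
```

A ratio that stays at or above 1 across realistic settings (varying passage lengths, realistic spread in difficulty) would count against the claimed improvement; a weight of zero reproduces the MLE and gives a ratio of exactly 1.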

Figures

Figures reproduced from arXiv: 2109.14010 by Akihito Kamata, Cornelis J. Potgieter, Minh Thu Bui.

**Figure 1.** L1 norm (left) and L2 norm (right) penalty functions for J = 2 binomial success probabilities. Note that as Pen2(p) ≤ Pen1(p) for all p ∈ [0, 1]^I, the L1 norm will more aggressively shrink success probabilities to 0 than the L2 norm. Due to the resemblance of the L1 norm to the commonly-used lasso penalty in regression, it should be pointed out that its application here will not result in shrinkage estim…

**Figure 2.** Penalty functions Pen3 (left) and Pen4 (right), respectively unbounded from above and below, for I = 2 binomial success probabilities.

**Figure 3.** Schematic representation of four different penalized estimators shrinking p̃ closer to 0. All four of the penalized solutions above correspond to some notion of success probabilities being “close to 0” or “not too large.”

**Figure 4.** When considering shrinkage to 0, we chose scaling parameters (

**Figure 4.** Success probability distributions considered in the simulation study. Summarized in the tables below are the Monte Carlo estimates of the MSE ratios. For the kth sample X_k, let p_k = (p_{k,1}, ..., p_{k,10}) denote the true success probabilities simulated from a specified scaled Beta distribution. Let p̂_k denote the MLE and let p̃_k denote a penalized estimator found using VFCV. Define Sum of Squared Deviations…

**Figure 5.** WRI proportions for the ten passages.

**Figure 6.** Beta-binomial parameter estimates under mean shrinkage and full shrinkage. Dashed line indicates optimal shrinkage. Scale value to improve full shrinkage plot readability is ε = e^(−10). For the interested reader,

**Figure 7.** Empirical and penalized model-based pmf and cdf comparisons for the Passage 2 data. 6. Conclusions. The goal of this project was defining and exploring penalized parameter estimators of passage difficulty from independent multivariate count data. WRI scores realized by 508 students during an ORF assessment motivated the work and these data were analyzed in Section 5. The simulation results presented show th…

Original abstract

The paper considers parameter estimation in count data models using penalized likelihood methods. The motivating data consists of multiple independent count variables with a moderate sample size per variable. The data were collected during the assessment of oral reading fluency (ORF) in school-aged children. A sample of fourth-grade students were given one of ten available passages to read with these differing in length and difficulty. The observed number of words read incorrectly (WRI) is used to measure ORF. Three models are considered for WRI scores, namely the binomial, the zero-inflated binomial, and the beta-binomial. We aim to efficiently estimate passage difficulty, a quantity expressed as a function of the underlying model parameters. Two types of penalty functions are considered for penalized likelihood with respective goals of shrinking parameter estimates closer to zero or closer to one another. A simulation study evaluates the efficacy of the shrinkage estimates using Mean Square Error (MSE) as metric. Big reductions in MSE relative to unpenalized maximum likelihood are observed. The paper concludes with an analysis of the motivating ORF data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

Applies standard penalized likelihood to binomial count models for ORF passage difficulty and reports MSE gains in simulations, but the sim design may not match the real data's passage heterogeneity.

The paper takes penalized likelihood, which is already standard, and applies it to binomial, zero-inflated binomial, and beta-binomial models to estimate passage difficulty from oral reading fluency counts. The setup has ten passages that vary in length and difficulty, with moderate samples per passage, and the goal is to shrink the parameter estimates either toward zero or toward each other.

Simulations show clear MSE reductions compared with ordinary maximum likelihood, which is the main empirical result they highlight. That part is straightforward and directly addresses the motivating data problem. The real-data analysis at the end gives a concrete illustration of how the method works on the ORF scores.

The soft spot is the simulation design. The real passages differ in trial counts and difficulty levels, so if the generated data use fixed or overly homogeneous trial sizes and parameter spreads, the reported MSE gains could overstate what happens with the actual structure. The abstract does not spell out the data-generation details, which leaves that link unverified. The penalty tuning parameters are free and need to be chosen, but that is expected.

This is useful for people working on count-data estimation in educational assessment or similar applied settings where sample sizes per unit are moderate and some shrinkage makes sense. It is not a new theoretical framework, so the contribution is the targeted application and the numerical comparison.

I would send it to peer review because the motivation is clear, the models are standard, and the claim is falsifiable once the simulation details are checked. A referee can ask for more on how the simulations replicate the varying passage lengths.

Referee Report

3 major / 2 minor

Summary. The manuscript develops penalized likelihood methods for parameter estimation in binomial, zero-inflated binomial, and beta-binomial models for count data. Motivated by oral reading fluency (ORF) data from 10 passages of varying lengths and difficulties administered to fourth-grade students, the paper considers two penalty types—one shrinking estimates toward zero and one shrinking them toward each other across passages—to improve estimation of passage difficulty (a function of model parameters) under moderate per-passage sample sizes. A simulation study reports large MSE reductions relative to unpenalized maximum likelihood, followed by an application to the real ORF data.

Significance. If the reported MSE gains hold under data-generating processes that match the real-data heterogeneity in passage lengths and difficulties, the penalized estimators would offer a practical way to borrow strength across multiple count variables with limited samples per variable. This could be useful in educational assessment and other settings with grouped count data. The manuscript does not provide machine-checked proofs or reproducible code, but the simulation-based evaluation of two distinct penalty goals is a clear strength if the design is shown to be realistic.

major comments (3)

[Simulation study] Simulation study section: The data-generating process for the simulation is not described with respect to heterogeneity in the number of trials (passage lengths). The real ORF data involve 10 passages that differ in length and difficulty with moderate per-passage sample sizes; if the simulations instead use fixed or homogeneous trial sizes, the observed MSE reductions cannot be taken as evidence that the penalized estimators will improve estimation of passage difficulty in the motivating application.
[§3] Penalty formulation (throughout §3): The 'shrink to each other' penalty is introduced with the goal of shrinking parameter estimates closer to one another, but the exact functional form (including how the distance across the 10 passages is measured and how the tuning parameter is selected) is not stated as an explicit equation. This prevents verification that the penalty is well-defined for the binomial/ZIB/beta-binomial parameters and that the reported MSE gains are not an artifact of a particular tuning choice.
[Abstract] Abstract and simulation results: The claim of 'big reductions in MSE' is presented without any numerical values, tables, or figures in the abstract and without equations for the penalized objective or the MSE metric. Because the central empirical claim rests entirely on these unreported simulation details, the strength of evidence for the method's efficacy cannot be assessed from the provided information.

minor comments (2)

[Abstract] The abstract states that three models are considered but does not list the specific parameterizations (e.g., link functions or zero-inflation probability) used for passage difficulty; adding one sentence would improve clarity.
[§3] Notation for the penalty tuning parameters should be introduced once and used consistently; the current description leaves the reader to infer whether a single tuning parameter or separate ones per model are employed.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important areas for clarification. We address each major comment below and will revise the manuscript accordingly where needed to improve transparency and alignment with the motivating application.

Referee: [Simulation study] Simulation study section: The data-generating process for the simulation is not described with respect to heterogeneity in the number of trials (passage lengths). The real ORF data involve 10 passages that differ in length and difficulty with moderate per-passage sample sizes; if the simulations instead use fixed or homogeneous trial sizes, the observed MSE reductions cannot be taken as evidence that the penalized estimators will improve estimation of passage difficulty in the motivating application.

Authors: We agree that explicit description of heterogeneity in trial sizes is essential for linking the simulations to the real ORF data. In the revised version, we will expand the simulation study section to detail the data-generating process, including sampling passage lengths from the empirical distribution of the 10 passages in the motivating dataset (with their varying difficulties and lengths) while maintaining moderate per-passage sample sizes. This will ensure the MSE comparisons directly support applicability to the application.
Revision: yes

Referee: [§3] Penalty formulation (throughout §3): The 'shrink to each other' penalty is introduced with the goal of shrinking parameter estimates closer to one another, but the exact functional form (including how the distance across the 10 passages is measured and how the tuning parameter is selected) is not stated as an explicit equation. This prevents verification that the penalty is well-defined for the binomial/ZIB/beta-binomial parameters and that the reported MSE gains are not an artifact of a particular tuning choice.

Authors: We will revise §3 to include the explicit mathematical formulations for both penalty types as equations. This will specify the distance metric (e.g., a sum of squared differences or L2 norm across the 10 passage-specific parameters) and the procedure for selecting the tuning parameter (via cross-validation or an information criterion adapted for the penalized likelihood). These additions will confirm the penalties are well-defined for the binomial, ZIB, and beta-binomial models.
Revision: yes

Referee: [Abstract] Abstract and simulation results: The claim of 'big reductions in MSE' is presented without any numerical values, tables, or figures in the abstract and without equations for the penalized objective or the MSE metric. Because the central empirical claim rests entirely on these unreported simulation details, the strength of evidence for the method's efficacy cannot be assessed from the provided information.

Authors: The abstract has strict length constraints that preclude equations, tables, or detailed metrics, which are instead provided in the body (penalized objective in §3, MSE definition and results in §4). To strengthen the abstract, we will revise it to include a concise quantitative statement on the MSE reductions (e.g., reporting approximate percentage improvements from the simulations) while maintaining brevity, and we will ensure the simulation section explicitly defines the MSE metric.
Revision: partial

Circularity Check

0 steps flagged

No significant circularity; simulation-based evaluation is independent of fitted parameters

full rationale

The paper defines penalized likelihood estimators for binomial/ZIB/beta-binomial models on count data, applies two penalty types (shrink-to-zero or shrink-to-each-other), and evaluates them via separate simulation studies that generate data under the models and compute MSE against known true parameters. These MSE comparisons are external to any single fit and do not reduce by construction to the penalized estimates themselves. The real-data ORF analysis is presented as an application after the simulations; no self-citation chain, uniqueness theorem, or ansatz smuggling is invoked to justify the central claims. The derivation and evaluation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 0 axioms · 0 invented entities

The approach rests on the standard likelihoods of the binomial, zero-inflated binomial, and beta-binomial distributions plus the choice of two penalty functions whose tuning parameters are not described in the abstract.

free parameters (1)

penalty tuning parameters
Strength of shrinkage penalties must be chosen or tuned; abstract does not specify how this is done.
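One common way to choose such a tuning parameter is V-fold cross-validation, which one of the figure captions on this page mentions as VFCV. The sketch below is an assumption-laden illustration and not the paper's procedure: it reuses a squared shrink-toward-zero penalty on binomial success probabilities, assigns students to folds by index, and scores each candidate value by held-out binomial log-likelihood. All names (`fit`, `loglik`, `choose_lambda`) and the data are hypothetical.

```python
import math


def fit(errors, words, lam, tol=1e-10):
    """Penalized binomial estimate: maximize loglik - lam * p**2 by bisection."""
    def score(p):
        return errors / p - (words - errors) / (1.0 - p) - 2.0 * lam * p

    lo, hi = tol, 1.0 - tol
    if score(lo) <= 0:
        return lo  # kept strictly positive so held-out log-likelihoods stay finite
    if score(hi) >= 0:
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def loglik(errors, words, p):
    """Binomial log-likelihood without the constant binomial coefficient."""
    return errors * math.log(p) + (words - errors) * math.log(1.0 - p)


def choose_lambda(passages, grid, n_folds=5):
    """passages: one list per passage of (errors, words) pairs, one pair per student."""
    best_lam, best_score = None, -math.inf
    for lam in grid:
        held_out = 0.0
        for records in passages:
            for fold in range(n_folds):
                train = [r for i, r in enumerate(records) if i % n_folds != fold]
                valid = [r for i, r in enumerate(records) if i % n_folds == fold]
                p = fit(sum(e for e, _ in train), sum(w for _, w in train), lam)
                held_out += sum(loglik(e, w, p) for e, w in valid)
        if held_out > best_score:
            best_lam, best_score = lam, held_out
    return best_lam


# Two hypothetical passages with five students each.
passages = [
    [(3, 60), (1, 60), (4, 60), (0, 60), (2, 60)],
    [(6, 80), (9, 80), (7, 80), (5, 80), (8, 80)],
]
print(choose_lambda(passages, grid=[0.0, 1.0, 10.0, 100.0]))
```

A single tuning parameter is shared across passages here; whether the paper uses one parameter or several is exactly the point the ledger flags as unspecified.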

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "Two types of penalty functions are considered for penalized likelihood with respective goals of shrinking parameter estimates closer to zero or closer to one another... PenL2(p) = sum_i sum_j (p_i - p_j)^2"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "A simulation study evaluates the efficacy of the shrinkage estimates using Mean Square Error (MSE) as metric. Big reductions in MSE relative to unpenalized maximum likelihood are observed."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

[1] Agresti, A. and Hitchcock, D. B. (2005). Bayesian inference for categorical data analysis. Statistical Methods and Applications, 14(3):297–330.
[2] Allington, R. L. (1983). Fluency: The neglected reading goal. The Reading Teacher, 36(6):556–561.
[3] Arlot, S. and Celisse, A. (2010). A survey of cross-validation procedures for model selection. Statistics Surveys, 4:40–79.
[4] Baklizi, A. and Abu Dayyeh, W. (2003). Shrinkage estimation of P(Y < X) in the exponential case. Communications in Statistics - Simulation and Computation, 32(1):31–42.
[5] Chernoff, H. and Lehmann, E. (1954). The use of maximum likelihood estimates in χ² tests for goodness of fit. The Annals of Mathematical Statistics, pages 579–586.
[6] Datta, J. and Dunson, D. B. (2016). Bayesian inference on quasi-sparse count data. Biometrika, 103(4):971–983.
[7] DiSalle, K. and Rasinski, T. (2017). Impact of short-term intense fluency instruction on students’ reading achievement: A classroom-based, teacher-initiated research study. Journal of Teacher Action Research, 3(2):1–13.
[8] Efron, B. and Morris, C. (1973). Stein's estimation rule and its competitors—an empirical Bayes approach. Journal of the American Statistical Association, 68(341):117–130.
[9] Fuchs, L. S., Fuchs, D., Hosp, M. K., and Jenkins, J. R. (2001). Oral reading fluency as an indicator of reading competence: A theoretical, empirical, and historical analysis. Scientific Studies of Reading, 5(3):239–256.
[10] Geisser, S. (1975). The predictive sample reuse method with applications. Journal of the American Statistical Association, 70(350):320–328.
[11] Gruber, M. H. (2017). Improving efficiency by shrinkage: the James-Stein and ridge regression estimators. Routledge.
[12] Hansen, B. E. (2016). Efficient shrinkage in parametric models. Journal of Econometrics, 190(1):115–132.
[13] Hasbrouck, J. and Tindal, G. A. (2006). Oral reading fluency norms: A valuable assessment tool for reading teachers. The Reading Teacher, 59(7):636–644.
[14] Hastie, T., Tibshirani, R., and Wainwright, M. (2019). Statistical learning with sparsity: the lasso and generalizations. Chapman and Hall/CRC.
[15] Hoerl, A. E. and Kennard, R. W. (1970). Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 12(1):55–67.
[16] Jani, P. (1991). A class of shrinkage estimators for the scale parameter of the exponential distribution. IEEE Transactions on Reliability, 40(1):68–70.
[17] Johns, J. L. and Lunn, M. K. (1983). The informal reading inventory: 1910–1980. Literacy Research and Instruction, 23(1):8–19.
[18] Lemmer, H. (1981a). From ordinary to Bayesian shrinkage estimators. South African Statistical Journal, 15(1):57–72.
[19] Lemmer, H. (1981b). Note on shrinkage estimators for the binomial distribution. Communications in Statistics - Theory and Methods, 10(10):1017–1027.
[20] Månsson, K. (2013). Developing a Liu estimator for the negative binomial regression model: method and application. Journal of Statistical Computation and Simulation, 83(9):1773–1780.
[21] Miura Wayman, M., Wallace, T., Wiley, H. I., Tichá, R., and Espin, C. A. (2007). Literature synthesis on curriculum-based measurement in reading. The Journal of Special Education, 41(2):85–120.
[22] Pandey, M. and Upadhyay, S. (1985). Bayes shrinkage estimators of Weibull parameters. IEEE Transactions on Reliability, 34(5):491–494.
[23] Park, T. and Casella, G. (2008). The Bayesian lasso. Journal of the American Statistical Association, 103(482):681–686.
[24] Polson, N. G. and Sokolov, V. (2019). Bayesian regularization: From Tikhonov to horseshoe. Wiley Interdisciplinary Reviews: Computational Statistics, 11(4):e1463.
[25] Qasim, M., Kibria, B., Månsson, K., and Sjölander, P. (2020). A new Poisson Liu regression estimator: method and application. Journal of Applied Statistics, 47(12):2258–2271.
[26] Samuels, S. J. (1988). Decoding and automaticity: Helping poor readers become automatic at word recognition. The Reading Teacher, 41(8):756–760.
[27] Schreiber, P. A. (1991). Understanding prosody's role in reading acquisition. Theory Into Practice, 30(3):158–164.
[28] Shinn, M. R., Knutson, N., Good III, R. H., Tilly III, W. D., and Collins, V. L. (1992). Curriculum-based measurement of oral reading fluency: A confirmatory analysis of its relation to reading. School Psychology Review, 21(3):459–479.
[29] Stein, C. et al. (1956). Inadmissibility of the usual estimator for the mean of a multivariate normal distribution. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics. The Regents of the University of California.
[30] Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1):267–288.
[31] Zandi, Z., Bevrani, H., and Arabi Belaghi, R. (2021). Using shrinkage strategies to estimate fixed effects in zero-inflated negative binomial mixed model. Communications in Statistics - Simulation and Computation, pages 1–22.
