From Poisson Observations to Fitted Negative Binomial Distribution
Pith reviewed 2026-05-24 02:39 UTC · model grok-4.3
The pith
When data is generated by a Poisson process, a newly parameterized negative binomial distribution recovers the true Poisson parameters consistently.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce an extended negative binomial distribution via a new parameterization that encompasses the Poisson distribution as a limit case. They provide a new maximum likelihood estimation algorithm and prove that, under Poisson observations, the estimated parameters of this extended distribution converge to those of the true Poisson distribution.
What carries the argument
The extended negative binomial distribution with a new parameterization that includes the Poisson as a special class, together with the associated maximum likelihood estimation algorithm.
Load-bearing premise
The new parameterization of the negative binomial includes the Poisson distribution as a special limit case and the maximum likelihood algorithm converges when the data is exactly Poisson.
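The limit behind this premise can be sketched with the standard mean-dispersion form of the negative binomial. The symbols ν (dispersion), μ (mean), and p = ν/(ν+μ) are the usual textbook ones and are an assumption here; the paper's extended parameterization itself is not reproduced in this summary.

```latex
\begin{align*}
f_{\mathrm{NB}}(y;\nu,\mu)
  &= \frac{\Gamma(\nu+y)}{\Gamma(\nu)\,y!}
     \left(\frac{\nu}{\nu+\mu}\right)^{\nu}
     \left(\frac{\mu}{\nu+\mu}\right)^{y} \\
  &= \frac{\mu^{y}}{y!}\,
     \left[\prod_{k=0}^{y-1}\frac{\nu+k}{\nu+\mu}\right]
     \left(1+\frac{\mu}{\nu}\right)^{-\nu}
  \;\xrightarrow[\;\nu\to\infty\;]{}\;
     \frac{\mu^{y}e^{-\mu}}{y!},
\end{align*}
% since each factor of the product tends to 1 and (1+\mu/\nu)^{-\nu} \to e^{-\mu}.
% Equivalently, p = \nu/(\nu+\mu) \to 1 and the variance \mu + \mu^{2}/\nu \to \mu.
```

In this form the Poisson sits at ν = ∞ (p = 1), outside the ordinary parameter space, which is why the premise depends on a parameterization that includes that boundary point.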
What would settle it
Simulate data from a Poisson distribution with a known rate, apply the new estimation procedure, and check whether the fitted negative binomial parameters approach the Poisson values as sample size grows.
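A minimal sketch of such a check, using only the Python standard library. It is not the paper's algorithm: a plain grid search over the dispersion ν stands in for the new estimation procedure, the cap `nu_max = 1e6` and all function names are hypothetical, and the fit uses the standard mean-dispersion form of the negative binomial, in which the maximum likelihood estimate of the mean is the sample mean.

```python
import math
import random
from collections import Counter


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def nb_profile_loglik(nu, counts, n, mean):
    """NB log-likelihood at (nu, mu=mean), without the constant -sum(log y!)."""
    total = 0.0
    for y, freq in counts.items():
        # log Gamma(nu + y) - log Gamma(nu) = sum_{k=0}^{y-1} log(nu + k)
        total += freq * sum(math.log(nu + k) for k in range(y))
    total -= n * nu * math.log1p(mean / nu)            # n * nu * log(nu / (nu + mean))
    total += n * mean * math.log(mean / (nu + mean))   # sum(y) * log(mean / (nu + mean))
    return total


def fit_nb_grid(data, nu_max=1e6, grid_size=400):
    """Grid-search stand-in for an NB fit; assumes the sample mean is positive."""
    n = len(data)
    mean = sum(data) / n          # MLE of the mean for any fixed nu
    counts = Counter(data)
    lo, hi = math.log(0.1), math.log(nu_max)
    grid = [math.exp(lo + (hi - lo) * i / (grid_size - 1)) for i in range(grid_size)]
    nu_hat = max(grid, key=lambda nu: nb_profile_loglik(nu, counts, n, mean))
    p_hat = nu_hat / (nu_hat + mean)   # p -> 1 corresponds to the Poisson limit
    return nu_hat, mean, p_hat


rng = random.Random(0)
lam = 10.0
results = {}
for n in (200, 5000):
    data = [sample_poisson(lam, rng) for _ in range(n)]
    results[n] = fit_nb_grid(data)
    nu_hat, mu_hat, p_hat = results[n]
    print(f"n={n}: nu_hat={nu_hat:.1f}, mu_hat={mu_hat:.3f}, p_hat={p_hat:.4f}")
```

If the claim holds, `mu_hat` should settle near the true rate and `p_hat` should move toward 1 as `n` grows, with `nu_hat` either very large or pinned at the cap. A fuller check would repeat this over many seeds and rates and use the paper's own algorithm in place of the grid.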
Original abstract
The negative binomial distribution has been widely used as a more flexible model than the Poisson distribution for count data. However, when the true data-generating process is Poisson, it is often challenging to distinguish it from a negative binomial distribution with extreme parameter values, and existing maximum likelihood estimation procedures for the negative binomial distribution may fail or produce unstable estimates. To address this issue, we develop a new algorithm for computing the maximum likelihood estimate of negative binomial parameters, which is more efficient and more accurate than existing methods. We further extend negative binomial distributions with a new parameterization to cover Poisson distributions as a special class. We provide theoretical justifications showing that, when applied to Poisson data, the estimated parameters of the extended negative binomial distribution can consistently recover the true Poisson distribution.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops a new algorithm for computing the maximum likelihood estimates of negative binomial parameters, claimed to be more efficient and accurate than existing methods. It introduces an extended parameterization of the negative binomial distribution that includes the Poisson distribution as a special (limit) case. Theoretical justifications are provided to show that, for Poisson-generated data, the fitted parameters of this extended model consistently recover the true Poisson distribution.
Significance. If the consistency result holds, the work provides a unified modeling framework that avoids instability in negative binomial fits to Poisson-like count data, which is a frequent practical issue. The new MLE algorithm and boundary-inclusive parameterization could improve robustness in statistical applications involving count data.
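The kind of instability at stake can be illustrated with a small sketch. It uses the textbook method-of-moments dispersion estimate, ν = ȳ² / (s² − ȳ), purely as an easy-to-compute stand-in; it is not claimed to be one of the procedures the manuscript evaluates, and the function names and simulation settings are hypothetical.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def moment_dispersion(data):
    """Method-of-moments estimate of nu, or None if the sample is not overdispersed."""
    n = len(data)
    mean = sum(data) / n
    var = sum((y - mean) ** 2 for y in data) / (n - 1)
    if var <= mean:
        return None  # no finite positive solution exists
    return mean * mean / (var - mean)


rng = random.Random(1)
lam, n, replicates = 10.0, 100, 200
estimates = [
    moment_dispersion([sample_poisson(lam, rng) for _ in range(n)])
    for _ in range(replicates)
]
undefined = sum(e is None for e in estimates)
finite = sorted(e for e in estimates if e is not None)
print(f"no finite estimate in {undefined} of {replicates} Poisson samples")
if finite:
    print(f"finite estimates range from {finite[0]:.1f} to {finite[-1]:.1f}")
```

Because a Poisson sample is about as likely to look underdispersed as overdispersed, a sizeable share of samples has no finite positive dispersion estimate, and the finite ones vary widely. A parameterization that treats the Poisson case as an admissible point is meant to remove exactly this failure mode.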
major comments (1)
- [Abstract] Abstract: The central consistency claim requires that the extended negative binomial parameterization includes the Poisson as the limit when the dispersion parameter tends to infinity. Standard MLE consistency theorems assume an interior point with positive definite Fisher information; the manuscript must supply an explicit boundary-case analysis (e.g., showing the dispersion estimator diverges to infinity in probability while the mean estimator converges to the true Poisson rate) rather than invoking interior-point results.
Simulated Author's Rebuttal
We thank the referee for the detailed review and for highlighting the need for a rigorous boundary analysis in the consistency proof. We address the major comment below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [Abstract] Abstract: The central consistency claim requires that the extended negative binomial parameterization includes the Poisson as the limit when the dispersion parameter tends to infinity. Standard MLE consistency theorems assume an interior point with positive definite Fisher information; the manuscript must supply an explicit boundary-case analysis (e.g., showing the dispersion estimator diverges to infinity in probability while the mean estimator converges to the true Poisson rate) rather than invoking interior-point results.
  Authors: We agree that the consistency result for the boundary case (dispersion parameter tending to infinity) requires an explicit analysis beyond standard interior-point MLE theorems. In the revised manuscript we will add a dedicated subsection in the theoretical results section that establishes the required boundary behavior: under Poisson-generated data, the MLE of the dispersion parameter diverges to infinity in probability while the mean-parameter estimator converges in probability to the true Poisson rate. This will be proved directly from the extended parameterization and the form of the log-likelihood, without relying on interior-point assumptions.
  Revision: yes
Circularity Check
No circularity; consistency claim offered as independent theoretical result
Full rationale
The abstract describes an extension of the negative binomial parameterization that includes the Poisson distribution as a special (limit) case, followed by a separate claim of providing theoretical justifications for consistent MLE recovery of the true Poisson parameters on Poisson data. No quoted equations, self-citations, or steps in the given material reduce this consistency result to a fitted input by construction, a renamed known pattern, or a load-bearing self-citation chain. The derivation is presented as self-contained, with the theoretical justification treated as additional content rather than tautological with the parameterization choice itself.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Standard regularity conditions hold for the MLE to be consistent under the extended parameterization when data is Poisson.
invented entities (1)
- Extended negative binomial parameterization (no independent evidence)