
arxiv: 2404.07457 · v4 · submitted 2024-04-11 · math.ST · stat.CO · stat.TH

From Poisson Observations to Fitted Negative Binomial Distribution

Pith reviewed 2026-05-24 02:39 UTC · model grok-4.3

classification: math.ST · stat.CO · stat.TH
keywords: negative binomial distribution · Poisson distribution · maximum likelihood estimation · parameter consistency · count data · overdispersion · parameterization

The pith

When the data are truly Poisson, a newly parameterized negative binomial distribution consistently recovers the true Poisson distribution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a new algorithm for computing maximum likelihood estimates of negative binomial parameters, which the authors report to be more efficient and more accurate than existing methods. It extends the negative binomial family with a new parameterization that includes the Poisson distribution as a special case. Theoretical results show that fitting this extended model to Poisson data yields estimates that consistently recover the original Poisson distribution. This addresses the problem that standard negative binomial fits may fail or become unstable when the data are actually Poisson. A sympathetic reader would care because many count data sets are close to Poisson, yet modelers want the flexibility of the negative binomial without numerical breakdowns.

Core claim

The authors introduce an extended negative binomial distribution via a new parameterization that encompasses the Poisson distribution as a limit case. They provide a new maximum likelihood estimation algorithm and prove that, under Poisson observations, the estimated parameters of this extended distribution converge to those of the true Poisson distribution.

What carries the argument

The extended negative binomial distribution with a new parameterization that includes the Poisson as a special class, together with the associated maximum likelihood estimation algorithm.

Load-bearing premise

The new parameterization of the negative binomial includes the Poisson distribution as a special limit case and the maximum likelihood algorithm converges when the data is exactly Poisson.
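
For orientation, the classical limit behind this premise can be written out. The sketch below assumes the standard NB(ν, p) probability mass function with mean ν(1−p)/p and fixes the mean at a Poisson rate λ by setting p = ν/(ν+λ). The paper's own extended parameterization is not spelled out in the abstract and may be organized differently.

```latex
f_{\mathrm{NB}(\nu,p)}(k)
  = \frac{\Gamma(\nu+k)}{\Gamma(\nu)\,k!}\,p^{\nu}(1-p)^{k},
  \qquad k = 0, 1, 2, \ldots
% With p = \nu/(\nu+\lambda), so that the mean \nu(1-p)/p equals \lambda:
f_{\mathrm{NB}(\nu,p)}(k)
  = \frac{\lambda^{k}}{k!}
    \cdot \frac{\Gamma(\nu+k)}{\Gamma(\nu)\,(\nu+\lambda)^{k}}
    \cdot \Bigl(1+\frac{\lambda}{\nu}\Bigr)^{-\nu}
  \;\longrightarrow\; \frac{\lambda^{k}}{k!}\,e^{-\lambda}
  \qquad (\nu \to \infty).
```

The middle factor tends to 1 and the last factor tends to e^{−λ}, so Poisson(λ) sits at ν = ∞ (equivalently p = 1), outside the usual open parameter space of the negative binomial.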

What would settle it

Simulate data from a Poisson distribution with a known rate, apply the new estimation procedure, and check whether, as the sample size grows, the fitted mean approaches the known rate and the fitted overdispersion vanishes, so that the fitted distribution approaches the Poisson.
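
A minimal sketch of such a check follows. It is not the paper's algorithm, which the abstract does not describe. It assumes the classical NB(ν, p) form with mean μ = ν(1−p)/p, profiles out the mean (its maximum likelihood estimate is the sample mean), and hands the remaining one-dimensional problem in log ν to a generic bounded optimizer from SciPy. The function names `nb_profile_loglik`, `fit_nb_profile`, and `poisson_recovery_demo`, and the cap `nu_max`, are illustrative choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import gammaln


def nb_profile_loglik(log_nu, y):
    """Average NB log-likelihood with the mean fixed at ybar, as a function of log(nu)."""
    nu = np.exp(log_nu)
    ybar = y.mean()  # assumed > 0, i.e. the sample is not all zeros
    return np.mean(
        gammaln(nu + y) - gammaln(nu) - gammaln(y + 1.0)
        - nu * np.log1p(ybar / nu)        # nu * log(p) with p = nu / (nu + ybar)
        + y * np.log(ybar / (nu + ybar))  # y * log(1 - p)
    )


def fit_nb_profile(y, nu_min=1e-3, nu_max=1e5):
    """Return (mu_hat, nu_hat); nu_hat near nu_max signals a Poisson-like fit.

    nu_max is an arbitrary cap chosen for this sketch, not a value from the paper.
    """
    y = np.asarray(y, dtype=float)
    res = minimize_scalar(
        lambda t: -nb_profile_loglik(t, y),
        bounds=(np.log(nu_min), np.log(nu_max)),
        method="bounded",
    )
    return float(y.mean()), float(np.exp(res.x))


def poisson_recovery_demo(lam=10.0, sizes=(100, 1000, 10000, 100000), seed=0):
    """Fit the NB to Poisson(lam) samples of growing size."""
    rng = np.random.default_rng(seed)
    rows = []
    for n in sizes:
        y = rng.poisson(lam, size=n)
        mu_hat, nu_hat = fit_nb_profile(y)
        rows.append((n, mu_hat, nu_hat))
    return rows


if __name__ == "__main__":
    for n, mu_hat, nu_hat in poisson_recovery_demo():
        # Fitted NB variance is mu + mu^2 / nu, so 1 + mu / nu is the
        # variance-to-mean ratio; it should approach 1 for Poisson data.
        print(n, round(mu_hat, 3), round(nu_hat, 1), round(1 + mu_hat / nu_hat, 4))
```

In this sketch a fitted ν that lands at the cap is read as "the fit wants the Poisson boundary". An unconstrained optimizer has no such stopping point, which is one way the instability described in the abstract can show up in practice.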

Original abstract

The negative binomial distribution has been widely used as a more flexible model than the Poisson distribution for count data. However, when the true data-generating process is Poisson, it is often challenging to distinguish it from a negative binomial distribution with extreme parameter values, and existing maximum likelihood estimation procedures for the negative binomial distribution may fail or produce unstable estimates. To address this issue, we develop a new algorithm for computing the maximum likelihood estimate of negative binomial parameters, which is more efficient and more accurate than existing methods. We further extend negative binomial distributions with a new parameterization to cover Poisson distributions as a special class. We provide theoretical justifications showing that, when applied to Poisson data, the estimated parameters of the extended negative binomial distribution can consistently recover the true Poisson distribution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript develops a new algorithm for computing the maximum likelihood estimates of negative binomial parameters, claimed to be more efficient and accurate than existing methods. It introduces an extended parameterization of the negative binomial distribution that includes the Poisson distribution as a special (limit) case. Theoretical justifications are provided to show that, for Poisson-generated data, the fitted parameters of this extended model consistently recover the true Poisson distribution.

Significance. If the consistency result holds, the work provides a unified modeling framework that avoids instability in negative binomial fits to Poisson-like count data, which is a frequent practical issue. The new MLE algorithm and boundary-inclusive parameterization could improve robustness in statistical applications involving count data.

major comments (1)
  1. [Abstract] The central consistency claim requires that the extended negative binomial parameterization includes the Poisson as the limit when the dispersion parameter tends to infinity. Standard MLE consistency theorems assume an interior point with positive definite Fisher information; the manuscript must supply an explicit boundary-case analysis (e.g., showing the dispersion estimator diverges to infinity in probability while the mean estimator converges to the true Poisson rate) rather than invoking interior-point results.
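
To make the requested boundary-case analysis concrete, here is a sketch of why the dispersion estimate can run off to infinity. It assumes the classical NB(ν, p) form with mean μ = ν(1−p)/p, writes Ȳ_n for the sample mean and S_n² for the divide-by-n sample variance, and uses Ψ for the digamma function. The paper's own notation and proof may differ.

```latex
% Profiling out p gives \hat p(\nu) = \nu/(\nu + \bar Y_n), hence \hat\mu = \bar Y_n,
% and the derivative in \nu of the average profile log-likelihood is
g(\nu) = \frac{1}{n}\sum_{i=1}^{n}\bigl[\Psi(\nu + Y_i) - \Psi(\nu)\bigr]
         - \log\!\Bigl(1 + \frac{\bar Y_n}{\nu}\Bigr).
% Using \Psi(\nu + y) - \Psi(\nu) = \sum_{k=0}^{y-1} \frac{1}{\nu + k}
% and expanding both terms in powers of 1/\nu as \nu \to \infty:
g(\nu) = \frac{\bar Y_n - S_n^{2}}{2\nu^{2}} + o(\nu^{-2}),
\qquad
S_n^{2} = \frac{1}{n}\sum_{i=1}^{n}\bigl(Y_i - \bar Y_n\bigr)^{2}.
```

When S_n² < Ȳ_n the profile likelihood is still increasing for all sufficiently large ν, so a finite maximizer is not guaranteed. For Poisson data S_n² − Ȳ_n tends to 0 and can take either sign at any finite n, which is why the estimator sits at or near the boundary rather than at an interior point.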

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and for highlighting the need for a rigorous boundary analysis in the consistency proof. We address the major comment below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] The central consistency claim requires that the extended negative binomial parameterization includes the Poisson as the limit when the dispersion parameter tends to infinity. Standard MLE consistency theorems assume an interior point with positive definite Fisher information; the manuscript must supply an explicit boundary-case analysis (e.g., showing the dispersion estimator diverges to infinity in probability while the mean estimator converges to the true Poisson rate) rather than invoking interior-point results.

    Authors: We agree that the consistency result for the boundary case (dispersion parameter tending to infinity) requires an explicit analysis beyond standard interior-point MLE theorems. In the revised manuscript we will add a dedicated subsection in the theoretical results section that establishes the required boundary behavior: under Poisson-generated data, the MLE of the dispersion parameter diverges to infinity in probability while the mean-parameter estimator converges in probability to the true Poisson rate. This will be proved directly from the extended parameterization and the form of the log-likelihood, without relying on interior-point assumptions.

    Revision: yes

Circularity Check

0 steps flagged

No circularity; consistency claim offered as independent theoretical result

Full rationale

The abstract describes an extension of the negative binomial parameterization that includes the Poisson distribution as a special (limit) case, followed by a separate claim of providing theoretical justifications for consistent MLE recovery of the true Poisson parameters on Poisson data. No quoted equations, self-citations, or steps in the given material reduce this consistency result to a fitted input by construction, a renamed known pattern, or a load-bearing self-citation chain. The derivation is presented as self-contained, with the theoretical justification treated as additional content rather than tautological with the parameterization choice itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Only the abstract is available, so the ledger is necessarily incomplete. The central claim rests on standard regularity conditions for MLE consistency (a domain assumption) and on the modeling choice that the extended parameterization covers the Poisson distribution (ad hoc to the paper). No free parameters are explicitly named beyond the usual NB parameters, and the only invented entity is the extended parameterization itself.

axioms (1)
  • domain assumption: Standard regularity conditions hold for the MLE to be consistent under the extended parameterization when the data are Poisson.
    Invoked implicitly by the consistency claim in the abstract.
invented entities (1)
  • Extended negative binomial parameterization (no independent evidence)
    Purpose: to include Poisson distributions as a special class.
    Introduced in the abstract to address the fitting instability; no independent evidence provided beyond the consistency statement.


