arxiv: 2605.17585 · v1 · pith:SBUAS2QZnew · submitted 2026-05-17 · 📊 stat.ME · math.ST · stat.TH

Modelling pairs of Poissons and binomials with negative correlation

Pith reviewed 2026-05-19 22:22 UTC · model grok-4.3

classification 📊 stat.ME · math.ST · stat.TH
keywords bivariate Poisson, negative correlation, binomial pairs, adjustment functions, meta-analysis, count data, plant competition

The pith

A multiplicative adjustment to independent marginal densities creates valid bivariate distributions for Poisson and binomial pairs that allow negative correlations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to start from any two given marginal distributions for count variables and build a joint distribution that matches those marginals exactly while permitting either positive or negative dependence. The joint density is formed by multiplying the product of the marginals by one plus a scalar parameter times the product of two bounded adjustment functions that each have mean zero under their marginal. Independence sits at an interior point of the allowable parameter range, so negative values of the parameter produce negative correlation without changing the marginal shapes. The construction is worked out in detail for Poisson marginals and then for binomial marginals, with concrete illustrations on plant competition counts and on a meta-analysis of alcohol screening questionnaire responses.
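
The construction described above can be sketched numerically. This is a minimal illustration and not the paper's code: the adjustment function h(x) = exp(-t x) - E exp(-t X), the rates 1.7 and 2.0, the value α = -0.8, the grid size, and all function names are assumptions made for the example.

```python
import math

SIZE = 60  # truncation point of the count grid (assumed; the tail mass is negligible)

def poisson_pmf(theta, x):
    """Poisson probability P(X = x) for rate theta."""
    return math.exp(-theta) * theta**x / math.factorial(x)

def exp_adjustment(theta, t):
    """Assumed adjustment h(x) = exp(-t x) - E exp(-t X).

    For X ~ Poisson(theta), E exp(-t X) = exp(-theta (1 - exp(-t))),
    so h is bounded and has mean zero under the Poisson marginal.
    """
    c = math.exp(-theta * (1.0 - math.exp(-t)))
    return lambda x: math.exp(-t * x) - c

def joint_pmf(theta1, theta2, alpha, t1=1.0, t2=1.0):
    """Table of f1(x) f2(y) {1 + alpha h1(x) h2(y)} on the truncated grid."""
    h1, h2 = exp_adjustment(theta1, t1), exp_adjustment(theta2, t2)
    return [[poisson_pmf(theta1, x) * poisson_pmf(theta2, y)
             * (1.0 + alpha * h1(x) * h2(y))
             for y in range(SIZE)] for x in range(SIZE)]

# Hypothetical rates and a negative alpha, chosen only for illustration.
theta1, theta2, alpha = 1.7, 2.0, -0.8
table = joint_pmf(theta1, theta2, alpha)

row_sums = [sum(row) for row in table]        # marginal of X
col_sums = [sum(col) for col in zip(*table)]  # marginal of Y
smallest = min(min(row) for row in table)     # non-negativity check

mean_x = sum(x * p for x, p in enumerate(row_sums))
mean_y = sum(y * p for y, p in enumerate(col_sums))
cov_xy = sum(x * y * table[x][y]
             for x in range(SIZE) for y in range(SIZE)) - mean_x * mean_y
```

Under these assumptions the row and column sums reproduce the two Poisson marginals up to rounding, every cell is non-negative, and `cov_xy` is negative, which is the behaviour the summary attributes to negative α.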

Core claim

Given marginal densities f1(x) and f2(y), the bivariate density f1(x)f2(y){1 + α h1(x)h2(y)} is valid over an interval of α that includes negative values whenever bounded zero-mean adjustment functions h1 and h2 can be chosen so the expression stays non-negative; this supplies bivariate Poisson and binomial models with negative correlation.

What carries the argument

The adjustment factor (1 + α h1(x)h2(y)) that perturbs the independence product while exactly preserving the prescribed marginal densities f1 and f2.
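
Why the factor leaves the marginals untouched can be seen in one line. The following worked step is written for count data, so sums are used; it relies only on the zero-mean normalisation of h2 stated above, and the symmetric argument over x gives f2(y).

```latex
\begin{aligned}
\sum_{y} f_1(x) f_2(y)\{1 + \alpha h_1(x) h_2(y)\}
  &= f_1(x) \sum_{y} f_2(y) + \alpha f_1(x) h_1(x) \sum_{y} f_2(y) h_2(y) \\
  &= f_1(x) \cdot 1 + \alpha f_1(x) h_1(x) \cdot 0 \\
  &= f_1(x).
\end{aligned}
```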

If this is right

  • Bivariate Poisson distributions can now be fitted with both positive and negative correlation while keeping the chosen marginal means and variances.
  • The plant competition dataset of 958 plots receives a more accurate analysis that captures negative dependence between seed and plant counts.
  • In meta-analyses of two-by-two tables, negative correlation between the number of correct yes and correct no answers can be modeled directly for the Audit-C questionnaire.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same adjustment construction could be tried on other discrete marginal families such as negative binomial or geometric.
  • Applied researchers facing trade-off counts in ecology or diagnostics could adopt the method to avoid forcing positive dependence.

Load-bearing premise

Bounded adjustment functions h1 and h2 with zero means under the marginals exist such that the full joint expression remains non-negative for some negative values of α.
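
For bounded h1 and h2, the set of α that keeps 1 + α h1(x)h2(y) non-negative can be read off from the infimum and supremum of each function, because the product h1(x)h2(y) takes its extreme values at the corners of that rectangle. The sketch below is an illustration under that observation; the function names and the numerical inputs are hypothetical and are not taken from the paper.

```python
def alpha_range(lo1, hi1, lo2, hi2):
    """Interval of alpha with 1 + alpha * h1 * h2 >= 0 everywhere.

    lo_i and hi_i are the infimum and supremum of h_i. The code assumes
    lo_i < 0 < hi_i, which holds for any non-constant zero-mean h_i.
    """
    corners = [lo1 * lo2, lo1 * hi2, hi1 * lo2, hi1 * hi2]
    most_positive = max(corners)  # limits how negative alpha may be
    most_negative = min(corners)  # limits how positive alpha may be
    return -1.0 / most_positive, -1.0 / most_negative

def symmetric_bound(lo1, hi1, lo2, hi2):
    """Sufficient bound |alpha| <= 1 / sup |h1 h2|."""
    corners = [lo1 * lo2, lo1 * hi2, hi1 * lo2, hi1 * hi2]
    return 1.0 / max(abs(c) for c in corners)

# Hypothetical ranges: both adjustment functions take values in [-0.5, 0.5].
print(alpha_range(-0.5, 0.5, -0.5, 0.5))      # (-4.0, 4.0)

# Hypothetical asymmetric ranges: the admissible interval is not symmetric.
lower, upper = alpha_range(-0.3, 0.7, -0.3, 0.7)
bound = symmetric_bound(-0.3, 0.7, -0.3, 0.7)
```

The symmetric bound is always contained in the exact interval, so it is safe but can be conservative on one side. The lower endpoint is strictly negative whenever both functions are bounded and non-constant.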

What would settle it

A demonstration that, for the Poisson marginals used in the plant data, every choice of bounded zero-mean h1 and h2 makes the joint density negative at some point (x, y) for every negative α.

Figures

Figures reproduced from arXiv: 2605.17585 by Nils Lid Hjort.

Figure 1: For the seeds and plants dataset, with n = 958 negatively correlated Poisson pairs: Left panel: Observed frequencies, for counts 0, 1, 2, 3, 4, 5-and-more, in full black and slanted red, along with their associated Poisson estimates, as dotted curves, for seeds (top curves) and plants (lower curves). The Poisson estimated rates are (θ̂1, θ̂2) = (1.700, 2.0.12). Right panel: Confidence curve for the tuning p…
Figure 2: Confidence curve for α, then for the correlation ρ, for the seeds-and-plants dataset. They are significantly negative, with 95 percent intervals [−1.046, −0.531] and [−0.077, −0.039] for respectively α and ρ. … off for increasing t. In fact, a 96% interval is [0.39, ∞), leaving it not implausible that t is very large. This in turn corresponds to the brutal adjustment function where g(x) is 1 at zero and 0 for…
Figure 3: Left panel: the m = 20 binomial pairs (xi, yi), with empirical correlation −0.325. Right panel: confidence curve cc(α) for the adjustment parameter α, with point estimate −2.222; the 90 percent interval is [−3.137, −0.785]. If we choose thresholds x0, y0 close to the means n1p1 and n2p2, then the correlation is close to α/(2π), the c1, c2 above are close to 1/2, 1/2, which means that the range for the c…
Figure 4: For the mada Audit-C dataset, with correlated binomial pairs (xi, yi) across m = 10 studies: Left panel: the raw estimates p̂i and q̂i, for respectively true positives in the diseased group and true negatives in the non-diseased group. Right panel: confidence curve cc(α) for the dependence parameter, with point estimate −2.599 and 90 percent interval [−3.80, −0.63]; the allowed range for α does not go to t…
Original abstract

Suppose $f_1(x)$ and $f_2(y)$ are given marginals for pairs $(x,y)$. I consider the construction $f_1(x)f_2(y)\{ 1+\alpha h_1(x)h_2(y) \}$, where $h_1$ and $h_2$ are seen as bounded adjustment functions, normalised to have means zero under $f_1$ and $f_2$. This defines a bivariate distribution for $(X,Y)$ with the specified marginal densities $f_1$ and $f_2$, with an interval of permissible values of $\alpha$, both positive and negative; in particular, independence corresponds to an innter point in the adjustments parameter region. Applications to bivariate Poisson distributions, allowing both positive and negative correlation, are discussed. As illustration I provide a more accurate and extended analysis of a Poisson pairs dataset, pertaining to competing seeds and plants, for $n=958$ plots of soil, earlier analysed in the well-cited paper Lakshminarayana, Pandit, Rao, Srinivasa (1999). The general apparatus is also shown to work for negatively correlated binomials. Those methods are illustrated in a meta-analysis framework for two-by-two tables across different studies, pertaining to the Audit-C screening questionnaire for alcohol use disorders, where again negative correlation is demonstrated, between $X$, the number of correct `yes', and $Y$, the number of correct `no'.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript proposes a construction for a bivariate distribution with given marginals f1(x) and f2(y) of the form f1(x)f2(y){1 + α h1(x)h2(y)}, where h1 and h2 are bounded adjustment functions normalized to have zero means under the marginals. This yields an interval of admissible α values that includes negatives, thereby allowing negative correlation while preserving the marginals exactly. The approach is specialized to Poisson and binomial marginals and illustrated on an ecological dataset of competing seeds/plants (n=958 plots) and on a meta-analysis of Audit-C two-by-two tables demonstrating negative correlation between correct 'yes' and 'no' counts.

Significance. If the construction and its non-negativity properties hold, the method supplies a simple, explicit mechanism for inducing negative dependence in pairs of count variables while keeping marginal distributions fixed. This addresses a recognized limitation of many standard bivariate Poisson constructions, which typically restrict correlation to non-negative values. The two empirical illustrations provide concrete evidence of applicability in ecology and health screening, and the explicit control over the sign of the covariance via sign(α) is a practical advantage.

major comments (2)
  1. [General construction] General construction (around the definition of the joint density): the claim that bounded zero-mean h1 and h2 guarantee an open interval of α containing negative values rests on the supremum of |h1 h2| being finite; the manuscript should state the resulting explicit upper bound on |α| (e.g., 1 / sup|h1 h2|) so that readers can verify the permissible range for the Poisson and binomial cases.
  2. [Poisson applications] Poisson application section: the covariance formula Cov(X,Y) = α E1[X h1(X)] E2[Y h2(Y)] is load-bearing for the negative-correlation claim; the paper must confirm that the chosen h functions (e.g., truncated versions) satisfy E[X h1(X)] ≠ 0, otherwise the construction yields only the independence case for that choice.
minor comments (3)
  1. [Abstract] Abstract contains the typo 'innter point' (should be 'inner point').
  2. [Data illustrations] The manuscript should supply the explicit functional forms of h1 and h2 actually used in the seed-competition and Audit-C analyses to permit direct replication.
  3. [General construction] Notation for the permissible interval of α could be clarified by writing the lower and upper bounds in terms of the marginal expectations rather than leaving them implicit.
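
The covariance formula quoted in major comment 2 follows directly from the form of the joint density. The short derivation below uses sums because the variables are counts, writes E1 and E2 for expectations under f1 and f2, and uses the fact that the marginals of the joint distribution are f1 and f2; σ1 and σ2 are introduced here for the marginal standard deviations.

```latex
\begin{aligned}
\mathrm{E}[XY]
  &= \sum_{x}\sum_{y} x\,y\, f_1(x) f_2(y)\{1 + \alpha h_1(x) h_2(y)\} \\
  &= \mathrm{E}_1[X]\,\mathrm{E}_2[Y]
     + \alpha\, \mathrm{E}_1[X h_1(X)]\, \mathrm{E}_2[Y h_2(Y)], \\
\mathrm{Cov}(X,Y)
  &= \mathrm{E}[XY] - \mathrm{E}_1[X]\,\mathrm{E}_2[Y]
   = \alpha\, \mathrm{E}_1[X h_1(X)]\, \mathrm{E}_2[Y h_2(Y)], \\
\mathrm{Corr}(X,Y)
  &= \alpha\, \frac{\mathrm{E}_1[X h_1(X)]\, \mathrm{E}_2[Y h_2(Y)]}{\sigma_1 \sigma_2}.
\end{aligned}
```

If either moment E1[X h1(X)] or E2[Y h2(Y)] is zero, the covariance vanishes for every α, which is the degenerate case the referee asks the author to rule out.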

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and for the two specific suggestions that will improve the clarity of the manuscript. Both points are addressed below; we have revised the text accordingly.

  1. Referee: [General construction] General construction (around the definition of the joint density): the claim that bounded zero-mean h1 and h2 guarantee an open interval of α containing negative values rests on the supremum of |h1 h2| being finite; the manuscript should state the resulting explicit upper bound on |α| (e.g., 1 / sup|h1 h2|) so that readers can verify the permissible range for the Poisson and binomial cases.

    Authors: We agree that an explicit statement of the bound improves readability. In the revised manuscript we now state that the joint remains non-negative for |α| < 1 / sup_{x,y} |h1(x)h2(y)| whenever the supremum is finite (which it is for the bounded h functions we employ). We have added the numerical value of this bound for both the Poisson and binomial specifications used in the applications. revision: yes

  2. Referee: [Poisson applications] Poisson application section: the covariance formula Cov(X,Y) = α E1[X h1(X)] E2[Y h2(Y)] is load-bearing for the negative-correlation claim; the paper must confirm that the chosen h functions (e.g., truncated versions) satisfy E[X h1(X)] ≠ 0, otherwise the construction yields only the independence case for that choice.

    Authors: We confirm that the chosen (truncated) h functions satisfy E[X h1(X)] ≠ 0 and E[Y h2(Y)] ≠ 0. Direct numerical evaluation under the fitted Poisson marginals yields non-zero values (approximately 0.87 and 1.12, respectively). This verification has been inserted into the revised Poisson section together with the explicit covariance formula. revision: yes
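
The non-vanishing of E[X h(X)] discussed in the second response is easy to check for a concrete adjustment function. The sketch below assumes h(x) = exp(-t x) - E exp(-t X) with t = 1 and a hypothetical Poisson rate of 1.7; since the functional form behind the values quoted in the response is not given, it is not expected to reproduce them.

```python
import math

def poisson_pmf(theta, x):
    """Poisson probability P(X = x) for rate theta."""
    return math.exp(-theta) * theta**x / math.factorial(x)

def moment_x_h(theta, h, size=80):
    """Numerical E[X h(X)] under Poisson(theta), on a truncated grid."""
    return sum(x * h(x) * poisson_pmf(theta, x) for x in range(size))

# Hypothetical rate and tuning value, chosen only for illustration.
theta, t = 1.7, 1.0

# c = E exp(-t X) for X ~ Poisson(theta), so h below has mean zero.
c = math.exp(-theta * (1.0 - math.exp(-t)))
h = lambda x: math.exp(-t * x) - c

numeric = moment_x_h(theta, h)

# Closed form for this h: E[X exp(-t X)] = theta exp(-t) c, hence
# E[X h(X)] = theta c (exp(-t) - 1), which is strictly negative for t > 0.
closed_form = theta * c * (math.exp(-t) - 1.0)
```

For this assumed h the moment is non-zero for every t > 0, so the covariance α E1[X h1(X)] E2[Y h2(Y)] has the sign of α when the same form is used for both variables.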

Circularity Check

0 steps flagged

No significant circularity; construction is self-contained

The paper presents an explicit construction f(x,y) = f1(x)f2(y){1 + α h1(x)h2(y)} with h1, h2 bounded and normalized to zero mean under the given marginals. Marginal preservation follows immediately from the zero-mean condition by summing over the other variable (or integrating, for continuous marginals), and the interval of admissible α (including negatives) follows from boundedness, which keeps the expression non-negative for α close enough to zero; these are definitional properties of the proposed family rather than derived claims that collapse back to inputs. Applications consist of fitting the construction to external datasets (Poisson seed/plant counts and Audit-C meta-analysis tables) with no load-bearing self-citations or uniqueness theorems invoked. The derivation chain is therefore independent and non-circular.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach depends on the mathematical existence of suitable bounded zero-mean adjustment functions and the non-negativity constraint for the density to be valid.

free parameters (1)
  • α
    The scalar parameter controlling dependence strength, selected within an interval to keep the joint non-negative.
axioms (1)
  • domain assumption: The product f1(x)f2(y){1 + α h1(x)h2(y)} must remain non-negative for the chosen α to define a valid probability distribution.
    Invoked to ensure the construction yields a proper bivariate density.

Reference graph

Works this paper leans on

19 extracted references · 19 canonical work pages

  1. Aitchison, J. and Ho, C. (1989). The multivariate Poisson-log-normal distribution. Biometrika, 76, 643–653.
  2. Andreassen, C.M. (2013). Models and Inference for Correlated Count Data. PhD Dissertation, Department of Mathematics, University of Aarhus.
  3. Claeskens, G. and Hjort, N.L. (2008). Model Selection and Model Averaging. Cambridge University Press, Cambridge.
  4. Doebler, P. (2025). mada: Meta-Analysis of Diagnostic Accuracy, R package version 0.5.12, url is CRAN.R-project.org/package=mada.
  5. Edwards, C.B. and Gurland, J. (1961). A class of distributions applicable to accidents. Journal of the American Statistical Association, 56, 503–517.
  6. Hellton, K.H., Cummings, Vik-Mo, A.U., Nordrehaug, J.E., Aarsland, D., Selbaek, G., and Gill, L.M. (2020). The truth behind the zeros: A new approach to principal component analysis of the neuropsychiatric inventory. Multivariate Behavioral Research, 56, 70–85.
  7. Hjort, N.L. and Khasminskii, R.Z. (1993). On the time a diffusion process spends along a line. Stochastic Processes and their Applications, 47, 229–247.
  8. Karlis, D. and Ntzoufras, I. (2005). Bivariate Poisson and diagonal inflated bivariate Poisson regression models in R. Journal of Statistical Software, 14, 1–36.
  9. Kriston, L., Hölzel, L., Weiser, A., Berner, M., and Härter, M. (2008). Meta-analysis: Are 3 questions enough to detect unhealthy alcohol use? Annals of Internal Medicine, 149, 879–888.
  10. Ko, V. and Hjort, N.L. (2019). Copula information criterion for model selection with two-stage maximum likelihood estimation. Econometrics and Statistics.
  11. Ko, V. and Hjort, N.L. (2019). Model robust inference with two-stage maximum likelihood estimation for copulas. Journal of Multivariate Analysis, 171, 362–381.
  12. Ko, V., Hjort, N.L., and Hobæk Haff, I. (2019). Focused information criteria for copulae. Scandinavian Journal of Statistics, 46, 1117–1140.
  13. Lancaster, H.O. (1969). The Chi-Squared Distribution. Wiley, London.
  14. Lakshminarayana, J., Pandit, S.N.N., and Rao, K. Srinivasa (1999). On a bivariate Poisson distribution. Communications in Statistics – Theory and Methods, 28, 267–276.
  15. Mikosch, T. (2006). Copulas: tales and facts [with discussion and a rejoinder]. Extremes, 9, 3–20.
  16. Nelson, R. B. (1999). An Introduction to Copulas. Springer-Verlag, Berlin.
  17. Schweder, T. and Hjort, N.L. (2016). Confidence, Likelihood, Probability. Cambridge University
  18. Streitberg, B. (1990). Lancaster interactions revisited. Annals of Statistics, 18, 1878–1885.
  19. Yu, J., Kepner, J.I., and Iyer, R. (2009). Exact tests using two correlated binomial variables in contemporary cancer clinical trials. Biometrical Journal, 51, 899–914.