Composite likelihood inference for the Poisson log-normal model
Pith reviewed 2026-05-24 04:04 UTC · model grok-4.3
The pith
Composite likelihood inside the EM algorithm recovers the asymptotic efficiency of full maximum likelihood for the Poisson log-normal model.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors first present a Monte Carlo EM algorithm that targets maximum-likelihood estimators for the Poisson log-normal model but is computationally feasible only in low-dimensional latent spaces. They then introduce a novel procedure that embeds composite-likelihood and importance-sampling estimates inside the same EM framework. The resulting estimators retain the asymptotic properties of full maximum likelihood while bypassing the intractable high-dimensional integrals, thereby enabling reliable parameter estimation, confidence intervals, and hypothesis testing on data sets of moderate size.
What carries the argument
The composite-likelihood plus importance-sampling estimator that replaces the intractable conditional expectation inside the EM algorithm for the Poisson log-normal model.
If this is right
- The estimators retain the asymptotic efficiency and normality of maximum-likelihood estimators.
- Confidence intervals and hypothesis tests can be constructed directly from the composite-likelihood EM output.
- Computation remains feasible for data sets whose latent dimension would render ordinary Monte Carlo EM prohibitive.
- The method can be applied to identify significant covariates and residual correlations in multivariate count data.
Where Pith is reading between the lines
- The same substitution strategy could be tested on other latent-variable count models whose full conditional is intractable.
- Refinements to the importance-sampling proposal might further reduce Monte Carlo variance without altering the asymptotic guarantees.
- Likelihood-ratio statistics derived from the composite estimator could support model-comparison procedures that inherit the same large-sample justification.
Load-bearing premise
Replacing the intractable conditional expectation in the EM algorithm with a composite-likelihood plus importance-sampling estimator does not destroy the asymptotic equivalence to full maximum likelihood.
What would settle it
A Monte Carlo experiment in which the proposed estimators fail to achieve the same asymptotic variance as the true maximum-likelihood estimator, or in which the nominal coverage rate of the associated confidence intervals deviates systematically from the claimed level as sample size grows.
Figures
read the original abstract
The Poisson log-normal model is a latent variable model that provides a generic framework for the analysis of multivariate count data. Inferring its parameters can be a daunting task since the conditional distribution of the latent variables given the observed ones is intractable. For this model, variational approaches are the golden standard solution as they prove to be computationally efficient but lack theoretical guarantees on the estimates. Sampling-based solutions are quite the opposite. We first define a Monte Carlo EM algorithm that can achieve maximum likelihood estimators, but that is computationally efficient only for low-dimensional latent spaces. We then propose a novel inference procedure combining the EM framework with composite likelihood and importance sampling estimates. The algorithm preserves the desirable asymptotic properties of maximum likelihood estimators while circumventing the high-dimensional integration bottleneck, thus maintaining computational feasibility for moderately large datasets. This approach enables grounded parameter estimation, confidence intervals, and hypothesis testing. Application to the Barents Sea fish dataset demonstrates the algorithm capacity to identify significant environmental effects and residual interspecies correlations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a Monte Carlo EM algorithm for maximum likelihood estimation in the Poisson log-normal model for multivariate count data, then proposes a novel variant that replaces the intractable E-step conditional expectation with a composite-likelihood estimator combined with importance sampling. The central claim is that this approximation preserves the asymptotic properties of the full MLE while remaining computationally feasible for moderately high-dimensional latent spaces, enabling confidence intervals and hypothesis tests; the method is illustrated on the Barents Sea fish dataset.
Significance. If the asymptotic equivalence holds, the approach would supply a theoretically grounded alternative to variational methods for latent-variable count models, allowing valid inference on environmental effects and residual correlations in moderately large datasets where full MCEM is intractable.
major comments (2)
- [Abstract] Abstract (paragraph on the novel inference procedure): the assertion that the composite-likelihood + importance-sampling replacement for the E-step 'preserves the desirable asymptotic properties of maximum likelihood estimators' is presented without any derivation, bias/variance analysis, or rate conditions showing that the Godambe information mismatch and Monte Carlo error vanish sufficiently fast for the overall estimator to remain asymptotically equivalent to exact MCEM or direct MLE. This is the load-bearing step for the central claim.
- No section supplies simulation evidence or theoretical bounds demonstrating that the fixed point and asymptotic distribution of the proposed EM sequence coincide with those of the exact MLE; the abstract's claim therefore cannot be assessed from the supplied text.
minor comments (1)
- [Abstract] The abstract refers to 'grounded parameter estimation, confidence intervals, and hypothesis testing' but does not indicate which information matrix (observed, expected, or sandwich) is used for standard errors.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for providing constructive feedback. We address the major comments point by point below.
read point-by-point responses
-
Referee: [Abstract] Abstract (paragraph on the novel inference procedure): the assertion that the composite-likelihood + importance-sampling replacement for the E-step 'preserves the desirable asymptotic properties of maximum likelihood estimators' is presented without any derivation, bias/variance analysis, or rate conditions showing that the Godambe information mismatch and Monte Carlo error vanish sufficiently fast for the overall estimator to remain asymptotically equivalent to exact MCEM or direct MLE. This is the load-bearing step for the central claim.
Authors: We acknowledge that the manuscript does not provide a detailed derivation or analysis of the asymptotic properties in the abstract or main text. The claim is motivated by the asymptotic theory of composite likelihood estimators, which are known to be consistent and asymptotically normal under regularity conditions, combined with the convergence of importance sampling. However, to address the referee's concern, we will revise the paper to include a new theoretical subsection that derives the conditions under which the proposed estimator is asymptotically equivalent to the MLE, including analysis of the Godambe information and Monte Carlo error rates. revision: yes
-
Referee: [—] No section supplies simulation evidence or theoretical bounds demonstrating that the fixed point and asymptotic distribution of the proposed EM sequence coincide with those of the exact MLE; the abstract's claim therefore cannot be assessed from the supplied text.
Authors: The referee is correct that the current manuscript lacks both simulation evidence and explicit theoretical bounds comparing the proposed method to exact MCEM. We agree this is necessary to support the central claim. In the revised version, we will add simulation studies on low-dimensional cases where exact MCEM is feasible to verify that the fixed points and asymptotic distributions coincide, as well as theoretical bounds on the approximation error. revision: yes
Circularity Check
No significant circularity; derivation is a new algorithmic construction
full rationale
The paper defines a Monte Carlo EM algorithm and then a composite-likelihood + importance-sampling variant. The central claim that the procedure 'preserves the desirable asymptotic properties of maximum likelihood estimators' is asserted as a consequence of the construction rather than shown to reduce to a fitted parameter or self-citation by any equation. No self-definitional loops, fitted-input-as-prediction, or load-bearing self-citations appear in the supplied text. The derivation chain is therefore self-contained against external benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
novel inference procedure combining the EM framework with composite likelihood and importance sampling estimates. The algorithm preserves the desirable asymptotic properties of maximum likelihood estimators
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
CL-EM algorithm … yields a sequence such that cℓ_θ(h+1)(Y) ≥ cℓ_θ(h)(Y)
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
S. Agapiou, O. Papaspiliopoulos, D. Sanz-Alonso, and A. M. Stuart. Importance Sampling: Intrinsic Dimension and Computational Cost . Statistical Science, 32 0 (3): 0 405--431, 2017
work page 2017
-
[2]
J. Aitchison and C. Ho. The multivariate P oisson-log normal distribution. Biometrika, 76 0 (4): 0 643--653, 1989
work page 1989
-
[3]
F. Bartolucci, F. Chiaromonte, P. Kuruppumullage Don, and B. Lindsay. Composite likelihood inference in a discrete latent variable model for two-way ``clustering-by-segmentation'' problems. Journal of Computational and Graphical Statistics, 26 0 (2): 0 388--402, 2017
work page 2017
-
[4]
J. Besag. Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society: Series B (Methodological), 36 0 (2): 0 192--225, 1974
work page 1974
-
[5]
M. Bevilacqua and C. Gaetan. Comparing composite likelihood methods based on pairs for spatial gaussian random fields. Statistics and Computing, 25: 0 877--892, 2015
work page 2015
-
[6]
D. M. Blei , A. Kucukelbir , and J. D. McAuliffe . Variational inference: A review for statisticians . Journal of the American Statistical Association, 112 0 (518): 0 859--877, 2017
work page 2017
-
[7]
J. G. Booth and J. P. Hobert. Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm . Journal of the Royal Statistical Society: Series B (Statistical Methodology), 61 0 (1): 0 265--285, 1999
work page 1999
-
[8]
R. A. Boyles . On the Convergence of the EM Algorithm . Journal of the Royal Statistical Society: Series B (Methodological), 45 0 (1): 0 47--50, 1983
work page 1983
-
[9]
O. Capp \'e , R. Douc, A. Guillin, J.-M. Marin, and C. P. Robert. Adaptive importance sampling in general mixture classes. Statistics and Computing, 18 0 (4): 0 447--459, 2008
work page 2008
-
[10]
S. Chatterjee and P. Diaconis. The sample size required in importance sampling . The Annals of Applied Probability, 28 0 (2): 0 1099--1135, 2018
work page 2018
-
[11]
J. Chiquet, M. Mariadassou, and S. Robin. Variational inference for probabilistic Poisson PCA . The Annals of Applied Statistics, 12 0 (4): 0 2674--2698, 2018
work page 2018
-
[12]
J. Chiquet, M. Mariadassou, and S. Robin. Variational inference for sparse network reconstruction from count data. In K. Chaudhuri and R. Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 1162--1171, California, USA, 09--15 Jun 2019. PMLR
work page 2019
-
[13]
J. Chiquet, M. Mariadassou, and S. Robin. The P oisson-Lognormal Model as a Versatile Framework for the Joint Analysis of Species Abundances . Frontiers in Ecology and Evolution, 9: 0 188, 2021
work page 2021
-
[14]
J.-M. Cornuet, J.-M. Marin, A. Mira, and C. P. Robert. Adaptive multiple importance sampling. Scandinavian Journal of Statistics, 39 0 (4): 0 798--812, 2012
work page 2012
- [15]
- [16]
-
[17]
A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum Likelihood from Incomplete Data via the EM algorithm . Journal of the Royal Statistical Society: Series B, 39: 0 1--38, 1977
work page 1977
-
[18]
B. Efron and C. Stein. The jackknife estimate of variance. The Annals of Statistics, pages 586--596, 1981
work page 1981
-
[19]
J. Eidsvik, B. A. Shaby, B. J. Reich, M. Wheeler, and J. Niemi. Estimation and prediction in spatial models with block composite likelihoods. Journal of Computational and Graphical Statistics, 23 0 (2): 0 295--315, 2014
work page 2014
-
[20]
G. Fort and E. Moulines. Convergence of the Monte Carlo expectation maximization for curved exponential families . The Annals of Statistics, 31 0 (4): 0 1220 -- 1259, 2003
work page 2003
-
[21]
M. Fossheim, E. M. Nilssen, and M. Aschan. Fish assemblages in the B arents S ea. Marine Biology Research, 2 0 (4): 0 260--269, 2006
work page 2006
- [22]
- [23]
-
[24]
T. S. Jaakkola and M. I. Jordan. B ayesian parameter estimation via variational methods. Statistics and Computing, 10 0 (1): 0 25--37, 2000
work page 2000
-
[25]
A. Korba and F. Portier. Adaptive Importance Sampling meets Mirror Descent : a Bias-variance Tradeoff . In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pages 11503--11527, Virtual Conference, 2022. PMLR
work page 2022
-
[26]
R. A. Levine and G. Casella. Implementations of the Monte Carlo EM Algorithm . Journal of Computational and Graphical Statistics, 10 0 (3): 0 422--439, 2001
work page 2001
-
[27]
B. Lindsay. Composite likelihood methods. Contemporary mathematics, 80 0 (1): 0 221--239, 1988
work page 1988
-
[28]
T. A. Louis. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society. Series B, pages 226--233, 1982
work page 1982
-
[29]
C. E. McCulloch. Maximum Likelihood Algorithms for Generalized Linear Mixed Models . Journal of the American Statistical Association, 92 0 (437): 0 162--170, 1997
work page 1997
-
[30]
L. Middleton, G. Deligiannidis, A. Doucet, and P. E. Jacob. Unbiased Smoothing using Particle Independent Metropolis-Hastings . In K. Chaudhuri and M. Sugiyama, editors, Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pages 2378--2387, Naha, Okinawa...
work page 2019
-
[31]
R: A Language and Environment for Statistical Computing
R Core Team . R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2015. URL https://www.R-project.org/
work page 2015
-
[32]
A. W. v. d. Vaart. Asymptotic statistics, volume 27 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, UK, 1998
work page 1998
-
[33]
C. Varin and P. Vidoni. A note on composite likelihood inference and model selection. Biometrika, 92 0 (3): 0 519--528, 2005
work page 2005
- [34]
- [35]
-
[36]
M. J. Wainwright and M. I. Jordan. Graphical Models, Exponential Families, and Variational Inference . Found. Trends Mach. Learn., 1 0 (1--2): 0 1--305, 2008
work page 2008
-
[37]
G. C. G. Wei and M. A. Tanner. A M onte C arlo I mplementation of the EM A lgorithm and the P oor M an's D ata A ugmentation A lgorithms. Journal of the American Statistical Association, 85 0 (411): 0 699--704, 1990
work page 1990
-
[38]
C. Wu. On the convergence properties of the EM algorithm. The Annals of Statistics, 11 0 (1): 0 95--103, 1983
work page 1983
-
[39]
X. Xu, N. Reid, and L. Xu. Note on information bias and efficiency of composite likelihood. Technical Report 1612.06967, arXiv, 2016
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[40]
Y. Zhao and H. Joe. Composite likelihood estimation in multivariate data analysis. Canadian Journal of Statistics, 33 0 (3): 0 335--356, 2005
work page 2005
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.