Posterior Contraction Rates for Gaussian Cox Processes with Non-identically Distributed Data

David S. Leslie; James A. Grant

arXiv: 1906.08799 · v2 · pith:D3SMZ644new · submitted 2019-06-20 · 🧮 math.ST · stat.TH

Pith reviewed 2026-05-25 18:56 UTC · model grok-4.3

keywords posterior contraction rates · Gaussian Cox processes · non-identically distributed data · non-homogeneous Poisson processes · sigmoidal Gaussian Cox process · quadratic Gaussian Cox process · Bayesian nonparametric inference

The pith

Gaussian Cox process models yield posterior contraction rates for non-identically distributed Poisson process data at finite sample sizes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that Bayesian posterior estimates for the rate function in non-homogeneous Poisson processes contract around the true value at explicit rates when the observations are non-identically distributed. It studies this for the Sigmoidal Gaussian Cox Process, where the rate is a logistic transformation of a Gaussian process, and the Quadratic Gaussian Cox Process, where it is a quadratic transformation. The contraction rates are derived for both the concentration of posterior mass and the size of the balls containing it, and these hold for certain finite numbers of observations rather than only in the limit. This matters because many real datasets involve observations made under varying conditions that prevent identical distribution, yet reliable uncertainty quantification is still needed.

Core claim

For the Sigmoidal Gaussian Cox Process and the Quadratic Gaussian Cox Process, the posterior distribution on the rate function λ contracts at certain rates to the true λ even when the data consist of non-identically distributed realizations from Poisson processes with intensities that are transformations of λ, and these rates apply for finite numbers of observations with appropriate hyperparameter choices.

What carries the argument

The Sigmoidal Gaussian Cox Process and Quadratic Gaussian Cox Process priors on the rate function λ of a non-homogeneous Poisson process, which allow derivation of explicit posterior contraction rates under non-iid sampling.
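
As a minimal sketch of the two priors named above, written in notation that is assumed here rather than taken from the paper (the upper bound $\lambda^{*}$, the mean function $m$ and the kernel $k$ are placeholder symbols, and the paper's exact parametrisation may differ):

```latex
% Sigmoidal Gaussian Cox Process: scaled logistic transform of a Gaussian process g
\lambda(s) = \lambda^{*}\,\sigma\bigl(g(s)\bigr), \qquad
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
g \sim \mathcal{GP}(m, k)

% Quadratic Gaussian Cox Process: square of a Gaussian process g
\lambda(s) = g(s)^{2}, \qquad
g \sim \mathcal{GP}(m, k)
```

Under this parametrisation the logistic link keeps the rate inside $(0, \lambda^{*})$, while the quadratic link only guarantees that the rate is non-negative.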

If this is right

The width of the balls containing most posterior mass shrinks at the derived rates.
Posterior mass outside these balls also shrinks at the derived rates.
The results apply directly to finite sample sizes for specific prior choices.
Non-identical distributions are handled without requiring identical distribution assumptions.
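
The first two consequences can be read against the usual shape of a contraction statement. The display below is a generic template, not the paper's theorem: the metric $d$, the constant $M$, the radius $\varepsilon_n$, the tail bound $\delta_n$ and the data symbols $N^{(i)}$ are placeholders, since the source gives neither the metric nor the explicit rates.

```latex
\mathbb{E}_{\lambda_0}\,
\Pi\bigl(\lambda : d(\lambda, \lambda_0) \ge M\varepsilon_n \,\big|\, N^{(1)}, \dots, N^{(n)}\bigr)
\le \delta_n
```

Here $\varepsilon_n$ plays the role of the ball width in the first item and $\delta_n$ the posterior mass outside the ball in the second. A finite-sample result would state both for every $n$ meeting explicit conditions, rather than only as $n \to \infty$.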

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar contraction results might be obtainable for other smooth transformations of Gaussian processes beyond logistic and quadratic.
The finite-sample nature of the results could allow direct comparison with simulation studies for small datasets.
This framework might extend to other point process models where detectability varies across observations.

Load-bearing premise

The observed events come from non-identically distributed Poisson processes whose intensity functions are obtained by applying a fixed transformation to a common underlying rate function λ.
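
To make the premise concrete, here is a minimal Python sketch of non-identically distributed Poisson process data generated from one common rate function. It is an illustration under stated assumptions, not the paper's data model: the rate `lam`, the choice of transformation (scaling by a per-observation detection probability `p`) and all function names are hypothetical, and the sampler uses standard thinning of a homogeneous process.

```python
import math
import random

def simulate_poisson_process(rate_fn, rate_max, rng, t_end=1.0):
    """Sample event times on [0, t_end) from a Poisson process by thinning.

    Assumes 0 <= rate_fn(t) <= rate_max for all t in [0, t_end).
    """
    events = []
    t = rng.expovariate(rate_max)
    while t < t_end:
        # Keep a candidate point with probability rate_fn(t) / rate_max.
        if rng.random() < rate_fn(t) / rate_max:
            events.append(t)
        t += rng.expovariate(rate_max)
    return events

def lam(t):
    """Hypothetical common rate function on [0, 1]; bounded above by 30."""
    return 20.0 + 10.0 * math.sin(2.0 * math.pi * t)

def simulate_non_iid(n, rng):
    """Draw n realisations; realisation i has rate p_i * lam(t).

    Scaling by a detection probability p_i is an illustrative choice of
    transformation, not one taken from the paper.
    """
    data = []
    for _ in range(n):
        p = rng.uniform(0.3, 1.0)  # hypothetical per-observation detectability
        events = simulate_poisson_process(lambda t, p=p: p * lam(t), 30.0, rng)
        data.append((p, events))
    return data

rng = random.Random(0)
data = simulate_non_iid(200, rng)
```

Each element of `data` pairs the scaling `p` with the event times it produced, so the realisations share `lam` but are not identically distributed.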

What would settle it

A simulation study with a known true λ would settle it: generate a finite number of non-iid Poisson process realizations and measure how much posterior mass falls outside the predicted contraction ball. If that mass fails to shrink at the stated rate for sample sizes that meet the paper's conditions, the rate claim is falsified.
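
The shape of such an experiment can be prototyped in a few lines. The sketch below deliberately swaps the Gaussian Cox process prior for a conjugate Gamma prior on a piecewise-constant rate, so that the posterior can be sampled exactly with the Python standard library. The true rate, the scalings `c`, the radius and the prior parameters are all illustrative assumptions, so it shows how the check is organised rather than testing the paper's rates.

```python
import math
import random

def poisson(mean, rng):
    """Poisson draw: count unit-rate exponential arrivals before `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count

def mass_outside_ball(n, radius, rng, n_bins=10, n_draws=500):
    """Posterior mass at RMS distance > radius from the true bin heights."""
    # Hypothetical true rate, piecewise constant on n_bins equal bins of [0, 1].
    lam_true = [8.0 + 3.0 * math.sin(2.0 * math.pi * (k + 0.5) / n_bins)
                for k in range(n_bins)]
    counts = [0] * n_bins
    exposure = 0.0
    for _ in range(n):
        # Assumed transformation: realisation i has rate c * lam_true, c known.
        c = rng.uniform(0.5, 1.0)
        exposure += c / n_bins
        for k in range(n_bins):
            counts[k] += poisson(c * lam_true[k] / n_bins, rng)
    # Conjugate Gamma(shape=a, rate=b) prior on each bin height (not a GP prior).
    a, b = 1.0, 0.1
    outside = 0
    for _ in range(n_draws):
        sq = 0.0
        for k in range(n_bins):
            # gammavariate takes shape and scale, so scale = 1 / posterior rate.
            draw = rng.gammavariate(a + counts[k], 1.0 / (b + exposure))
            sq += (draw - lam_true[k]) ** 2
        if math.sqrt(sq / n_bins) > radius:
            outside += 1
    return outside / n_draws

rng = random.Random(1)
masses = {n: mass_outside_ball(n, radius=1.0, rng=rng) for n in (2, 20, 200, 2000)}
```

Plotting `masses` against `n` for a radius that shrinks with `n` would be the analogue of comparing the empirical tail mass with a predicted bound; a fixed radius is used here only to keep the sketch short.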

Original abstract

This paper considers the posterior contraction of non-parametric Bayesian inference on non-homogeneous Poisson processes. We consider the quality of inference on a rate function $\lambda$, given non-identically distributed realisations, whose rates are transformations of $\lambda$. Such data arises frequently in practice due, for instance, to the challenges of making observations with limited resources or the effects of weather on detectability of events. We derive contraction rates for the posterior estimates arising from the Sigmoidal Gaussian Cox Process and Quadratic Gaussian Cox Process models. These are popular models where $\lambda$ is modelled as a logistic and quadratic transformation of a Gaussian Process respectively. Our work extends beyond existing analyses in several regards. Firstly, we consider non-identically distributed data, previously unstudied in the Poisson process setting. Secondly, we consider the Quadratic Gaussian Cox Process model, of which there was previously little theoretical understanding. Thirdly, we provide rates on the shrinkage of both the width of balls around the true $\lambda$ in which the posterior mass is concentrated and on the shrinkage of posterior mass outside these balls - usually only the former is explicitly given. Finally, our results hold for certain finite numbers of observations, rather than only asymptotically, and we relate particular choices of hyperparameter/prior to these results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper derives posterior contraction rates for nonparametric Bayesian inference on the intensity function λ of non-homogeneous Poisson processes under non-identically distributed observations. It treats the Sigmoidal Gaussian Cox Process (logistic link) and Quadratic Gaussian Cox Process (quadratic link) models, providing explicit rates both for the radius of balls containing most posterior mass around the true λ and for the decay of posterior mass outside those balls. The results are stated to hold for finite sample sizes under suitable hyperparameter choices rather than only in the large-n limit, extending prior work on iid Poisson process data.

Significance. If the finite-n claims are established without implicit large-n thresholds, the work would meaningfully extend the literature by handling non-iid data (common in applications with limited resources or detectability effects), by supplying the first substantial theory for the quadratic model, and by giving both concentration and tail bounds. The explicit linkage of hyperparameter choices to finite-n validity is a further strength.

major comments (2)

[Abstract] The central claim, that the contraction rates hold for certain finite numbers of observations rather than only asymptotically, is load-bearing for the paper's novelty. Standard nonparametric Bayesian arguments rely on entropy integrals, prior mass on sieves, and the existence of tests whose validity thresholds depend on n; the abstract gives no indication that the proofs supply fully explicit non-asymptotic constants or avoid an implicit n ≫ 1 step.
[Abstract] The non-identically distributed data model is defined via transformations of λ, but the precise form of these transformations (especially under the quadratic link) must be shown to preserve the entropy and testing conditions used in the contraction proofs; without this, the extension from the iid to the non-iid case remains unverified.

minor comments (1)

[Abstract] The abstract mentions relating hyperparameter choices to the finite-n results, but the main text should include an explicit table or corollary listing the admissible hyperparameter ranges for each model.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address each major comment below and will revise the manuscript accordingly to improve clarity.

Referee: [Abstract] The central claim, that the contraction rates hold for certain finite numbers of observations rather than only asymptotically, is load-bearing for the paper's novelty. Standard nonparametric Bayesian arguments rely on entropy integrals, prior mass on sieves, and the existence of tests whose validity thresholds depend on n; the abstract gives no indication that the proofs supply fully explicit non-asymptotic constants or avoid an implicit n ≫ 1 step.

Authors: We agree that the abstract should more explicitly convey the non-asymptotic character of the results. The theorems derive explicit (though symbolic) lower bounds on n in terms of the prior hyperparameters, the covering numbers of the GP, and the constants appearing in the test functions; the contraction statements hold for every finite n satisfying these inequalities, without passage to a limit. The constants are not numerically instantiated because they depend on the unknown true intensity, which is standard. We will revise the abstract to reference these explicit finite-n conditions from the theorems. revision: yes

Referee: [Abstract] The non-identically distributed data model is defined via transformations of λ, but the precise form of these transformations (especially under the quadratic link) must be shown to preserve the entropy and testing conditions used in the contraction proofs; without this, the extension from the iid to the non-iid case remains unverified.

Authors: The logistic and quadratic links are globally Lipschitz and map into a bounded interval; this implies that the Hellinger distance between the induced Poisson measures is equivalent (up to constants) to the L2 distance on the underlying GP, so the entropy integrals carry over directly. The same sieves and test constructions used in the iid case remain valid after a uniform adjustment for the varying intensities. We will insert a short lemma in the appendix that records these preservation arguments explicitly for both links. revision: yes

Circularity Check

0 steps flagged

No circularity: direct theoretical derivation of contraction rates from model assumptions

The paper derives posterior contraction rates for the Sigmoidal Gaussian Cox Process and Quadratic Gaussian Cox Process under non-identically distributed observations by applying standard nonparametric Bayesian techniques (entropy integrals, prior mass, tests) to the given models. The abstract and description present these as extensions of existing analyses with explicit finite-n statements under stated conditions; no steps reduce by construction to fitted inputs, self-definitions, or load-bearing self-citations. The central claims remain independent of the paper's own outputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Only the abstract is available; specific free parameters, axioms, and entities cannot be extracted in detail. The paper references choices of hyperparameter/prior and standard properties of Gaussian and Poisson processes.

free parameters (1)

hyperparameters of the prior
The abstract states that results relate particular choices of hyperparameter/prior to the contraction rates.

axioms (1)

[standard math] Standard properties of Gaussian processes and Poisson processes under the given transformations
Invoked to model λ as logistic or quadratic transformation of a GP and to derive posterior behavior.
