Bayesian discovery of species in multiple areas

Alessandro Colombi; Federico Camerlenghi; Lucia Paci; Raffaele Argiento

arxiv: 2502.04122 · v3 · submitted 2025-02-06 · 📊 stat.ME


Pith reviewed 2026-05-23 03:54 UTC · model grok-4.3

keywords: species sampling · Bayesian nonparametrics · heterogeneous populations · distinct species · shared species · predictive distributions · sample size determination · ecological statistics

The pith

Bayesian nonparametric priors on two heterogeneous areas yield exact distributions for observed and predicted counts of distinct and shared species.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework for species sampling when observations come from two distinct areas rather than a single homogeneous population. It derives the full distributional theory that describes the numbers of distinct and shared species in any observed sample. The same theory supplies exact predictive distributions for the numbers of unseen distinct and shared species that would appear in one or more additional samples of arbitrary sizes. These predictions also support calculations that determine how large a future sample must be to detect a target number of species. Readers care because the approach handles realistic ecological settings where environments differ and moves beyond frequentist methods restricted to single-step forecasts.

Core claim

By modeling species abundances in each of the two areas with Bayesian nonparametric priors that preserve exchangeability within areas and induce a joint predictive structure across areas, the authors obtain the exact joint distribution of the counts of distinct species in each area and the count of species shared between areas for any finite observed sample. The same construction directly supplies the predictive distributions for the corresponding counts in future samples drawn from the same two areas, for any choice of future sample sizes.
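
The in-sample quantities named above are simple functions of the two observed samples. A minimal sketch, assuming each sample is a list of species labels (one label per sampled individual); the function name and toy data are hypothetical, and the code only computes the observed statistics, not their distribution under the paper's prior.

```python
def in_sample_counts(area1, area2):
    """Count distinct species per area, overall, and shared between areas."""
    species1, species2 = set(area1), set(area2)
    return {
        "distinct_area1": len(species1),
        "distinct_area2": len(species2),
        "distinct_total": len(species1 | species2),
        "shared": len(species1 & species2),
    }


# Hypothetical toy data: one species label per sampled individual.
area1 = ["a", "a", "b", "c", "c", "c"]
area2 = ["b", "d", "d", "e"]
counts = in_sample_counts(area1, area2)
print(counts)
```

A species counts as shared here only if it has been observed in both samples; species present in both areas but seen in just one sample are exactly what the out-of-sample predictions target.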

What carries the argument

The distributional theory for in-sample and out-of-sample counts of distinct and shared species induced by Bayesian nonparametric priors on the two-area species abundance measures.

If this is right

Exact predictive distributions become available for the number of unseen distinct species in each area and the number of shared unseen species between areas, for any future sample sizes.
The theory extends one-step-ahead frequentist estimators to arbitrary numbers of future observations and supplies full probability distributions rather than point estimates.
Sample-size calculations are possible for any target number of distinct or shared species to be detected.
In-sample analysis of any finite observed sample yields the joint distribution of distinct and shared species counts without approximation.
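
To make the idea of arbitrary-horizon prediction and sample-size determination concrete, here is a sketch for the homogeneous single-population case only, using a Dirichlet process with a known concentration parameter theta. This is a swapped-in textbook model, not the paper's two-area construction, and it works with the expected number of new species rather than a full predictive distribution. Under this model, after n observations the next one is a new species with probability theta / (theta + n), whatever has been observed so far, so expectations over m further draws are a plain sum. The function names are hypothetical.

```python
def expected_new_species(theta, n, m):
    """Expected number of new species in m further draws after n observations,
    under a single-population Dirichlet process with concentration theta."""
    return sum(theta / (theta + n + i) for i in range(m))


def min_additional_samples(theta, n, target, max_m=1_000_000):
    """Smallest m whose expected number of new species reaches target.

    Returns None if the target is not reached within max_m draws.
    """
    expected = 0.0
    for m in range(1, max_m + 1):
        # Probability that draw number n + m is a previously unseen species.
        expected += theta / (theta + n + m - 1)
        if expected >= target:
            return m
    return None


# Hypothetical inputs: theta would normally be estimated or given a prior.
print(expected_new_species(theta=2.0, n=100, m=50))
print(min_additional_samples(theta=2.0, n=100, target=1.0))
```

The paper's results concern the joint law of distinct and shared species across two areas, so this one-area expectation should be read only as an illustration of what a multi-step prediction and a sample-size rule look like.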

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same construction could be applied to other two-population problems such as estimating unique terms shared between two text corpora.
If the priors can be extended while preserving the predictive structure, the method would apply to three or more areas.
The distributional results could be used to design stratified sampling schemes that account for known habitat differences between areas.

Load-bearing premise

The species abundances in the two areas are generated by Bayesian nonparametric priors that deliver the required within-area exchangeability and cross-area predictive structure for shared and distinct species.

What would settle it

Draw new samples from two areas with known heterogeneity, compute the observed numbers of new distinct and shared species, and check whether those numbers fall inside the probability intervals given by the model's predictive distributions.
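
The check described above can be sketched as follows, assuming that draws from the model's predictive distribution of a count (for example, the number of new shared species) are already available. The sketch does not implement the paper's predictive distribution; `draws` is a placeholder input, and the function names and the equal-tailed interval are illustrative choices.

```python
import math


def predictive_interval(draws, level=0.95):
    """Equal-tailed interval from a list of predictive draws of a count."""
    if not draws:
        raise ValueError("need at least one predictive draw")
    ordered = sorted(draws)
    tail = (1.0 - level) / 2.0
    last = len(ordered) - 1
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1.0 - tail) * last)]
    return lower, upper


def covers(observed, draws, level=0.95):
    """True if the observed count lies inside the predictive interval."""
    lower, upper = predictive_interval(draws, level)
    return lower <= observed <= upper


# Placeholder draws standing in for samples from a predictive distribution.
draws = list(range(101))
print(predictive_interval(draws), covers(50, draws), covers(100, draws))
```

Repeating this over many held-out samples and comparing the fraction of covered counts with the nominal level gives an empirical calibration check.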

Figures

Figures reproduced from arXiv: 2502.04122 by Alessandro Colombi, Federico Camerlenghi, Lucia Paci, Raffaele Argiento.

Original abstract

In ecology, the description of species composition and biodiversity calls for statistical methods that involve estimating features of interest in unobserved samples based on an observed one. In the last decade, the Bayesian nonparametrics literature has thoroughly investigated the case where data arise from a homogeneous population. In this work, we propose a novel framework to address heterogeneous populations, specifically dealing with scenarios where data arise from two areas. This setting significantly increases the mathematical complexity of the problem and, as a consequence, it has received limited attention in the literature. While early approaches leverage computational methods, we provide a distributional theory for the in-sample analysis of any observed sample and enable out-of-sample prediction for the number of unseen distinct and shared species in additional samples of arbitrary sizes. The latter also extends the frequentist estimators, which solely deal with one-step-ahead prediction. Furthermore, our results can be applied to address sample size determination in sampling problems aimed at detecting distinct and shared species. Our results are illustrated in a real-world dataset concerning a population of ants in the city of Trieste.
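
For contrast with the arbitrary-size predictions described in the abstract, a one-step-ahead frequentist estimate can be illustrated with the classical Good-Turing estimator of the discovery probability: the chance that the next sampled individual belongs to a new species is estimated by the proportion of the sample made up of species seen exactly once. Whether this is the specific estimator the authors have in mind is an assumption; the function name and data are hypothetical.

```python
from collections import Counter


def good_turing_discovery_probability(sample):
    """Good-Turing estimate of P(next observation is a new species)."""
    if not sample:
        raise ValueError("sample must contain at least one observation")
    frequencies = Counter(sample)
    # Species observed exactly once in the sample.
    singletons = sum(1 for count in frequencies.values() if count == 1)
    return singletons / len(sample)


# Hypothetical single-area sample of species labels.
sample = ["a", "a", "b", "c", "c", "c"]
estimate = good_turing_discovery_probability(sample)
print(estimate)
```

The estimate refers to the very next draw only, which is the limitation the abstract points to when it mentions one-step-ahead prediction.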

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

This paper derives exact distributional results and arbitrary-size predictions for distinct and shared species under a two-area heterogeneous BNP model.

The main takeaway is that the authors have extended Bayesian nonparametric species sampling to two heterogeneous areas and produced closed distributional results plus predictions for any future sample size from each area. That moves past the one-population case and the one-step frequentist estimators that dominate the literature so far. The Trieste ants example shows how the formulas can be used for sample-size planning to detect new or shared species. Those are the concrete advances visible from the abstract and stress-test note.

The work is new in the two-area setting and supplies the kind of exact expressions that let users avoid simulation for basic quantities. The derivations appear to be the central contribution rather than re-packaged earlier results. The modeling choice of BNP priors that deliver the required exchangeability is standard for this subfield and is not hidden.

The main limitation is that the abstract does not display the actual proofs or any verification steps, so the cleanliness of the math cannot be checked from the provided material alone. If the full paper contains the derivations and they hold, the gap they fill is real. No internal contradictions or unsupported extrapolations are flagged by the stress test.

This is aimed at researchers already working in Bayesian nonparametrics for ecology or biodiversity statistics. A reader who needs multi-site predictions or wants to extend these formulas would find usable results. It is narrow enough that not every statistician needs to read it, but the setting has received limited attention and the claims are specific enough to warrant referee time. I would send it for peer review.

Referee Report

0 major / 2 minor

Summary. The manuscript develops a Bayesian nonparametric framework for species sampling from two heterogeneous areas. It derives exact distributional results for in-sample analysis of observed samples and predictive formulas for the number of unseen distinct and shared species in out-of-sample draws of arbitrary size, extending one-step-ahead frequentist estimators; the results are also applied to sample-size determination and illustrated on an ant population dataset from Trieste.

Significance. If the derivations are correct, the work would be significant for filling a gap in the BNP species-sampling literature: it supplies closed-form distributional and predictive results for the heterogeneous two-area case (previously limited to homogeneous populations or purely computational methods), thereby enabling exact inference, arbitrary-horizon prediction, and design applications without simulation.

minor comments (2)

[Abstract] The specific BNP priors (e.g., Dirichlet process, Pitman-Yor, or normalized completely random measures) that induce the required exchangeability are left implicit; stating them explicitly would clarify the modeling assumptions that enable the claimed predictive structure.
[Abstract] The abstract states that 'distributional theory and predictions are provided,' yet the provided text contains no displayed equations, error bounds, or verification steps; ensure the full manuscript includes at least one representative derivation (e.g., the joint distribution of unseen species counts) with a clear statement of the exchangeability assumptions used.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the careful reading, positive summary, and recommendation of minor revision. No specific major comments were raised in the report, so we provide no point-by-point responses below. We will incorporate any minor editorial or typographical suggestions in the revised manuscript.

Circularity Check

0 steps flagged

No significant circularity

The paper derives exact distributional results and predictive formulas for in-sample and out-of-sample species counts under a two-area heterogeneous BNP model. These are presented as new mathematical extensions from the homogeneous case, not as quantities obtained by fitting parameters to data and then renaming them as predictions, nor by self-citation chains that reduce the central claim to prior unverified work by the same authors. The modeling assumptions (exchangeability and predictive structure induced by BNP priors) are fixed inputs, and the contribution consists of independent derivations from those assumptions, making the chain self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; ledger entries are inferred at the level of standard modeling assumptions rather than specific fitted quantities or new entities.

axioms (1)

Domain assumption: Species abundances in each area follow a Bayesian nonparametric prior permitting exchangeable sampling and predictive distributions for unseen species.
Standard background assumption in Bayesian nonparametrics for species sampling problems, invoked to enable the claimed distributional theory.

pith-pipeline@v0.9.0 · 5710 in / 1124 out tokens · 48399 ms · 2026-05-23T03:54:20.927490+00:00 · methodology

