arxiv: 1907.08627 · v1 · pith:M7KOQ723new · submitted 2019-07-19 · 🧮 math.ST · stat.ME · stat.TH

Extent of occurrence reconstruction using a new data-driven support estimator

Pith reviewed 2026-05-24 18:48 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH
keywords support estimation · r-convex sets · data-driven method · stochastic algorithm · convergence rates · extent of occurrence · set estimation

The pith

A data-driven estimator reconstructs the support of a distribution by estimating its r-convexity parameter from samples.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a new method to estimate the probability support S of an unknown distribution from a random sample of points. It assumes that S is r-convex for some unknown r and introduces a stochastic algorithm to estimate this r under mild assumptions on the density. The estimator is then the smallest r-convex set containing all sample points. This data-driven approach achieves the same convergence rates as the standard convex hull estimator, which is limited to convex sets, but applies to a wider class of shapes. The method is demonstrated on reconstructing the extent of occurrence for invasive plant species.

Core claim

The resulting data-driven reconstruction of S attains the same convergence rates as the convex hull for estimating convex sets, but under a much more flexible smoothness shape condition of r-convexity.

What carries the argument

The smallest r-convex set containing the sample points, where r is estimated by a stochastic algorithm from the data.
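
For reference, the object named above can be written out using the standard definition of the r-convex hull from the set-estimation literature. This is a sketch of the usual notation, not a quotation from the paper: the symbols \hat{r} (the estimated parameter) and \hat{S}_n (the resulting estimator) are labels assumed here for illustration.

```latex
% B_r(x) denotes the open ball of radius r centered at x.
\begin{align*}
  B_r(x) &= \{\, y : \lVert y - x \rVert < r \,\}, \\
  % r-convex hull of A: remove every open r-ball that misses A.
  C_r(A) &= \bigcap_{\{x \,:\, B_r(x) \cap A = \emptyset\}} \bigl(B_r(x)\bigr)^{c}, \\
  % A closed set S is called r-convex when it equals its own r-convex hull.
  S \text{ is } r\text{-convex} &\iff S = C_r(S), \\
  % Plug-in estimator from the sample \mathcal{X}_n (notation assumed here).
  \hat{S}_n &= C_{\hat{r}}(\mathcal{X}_n).
\end{align*}
```

A closed convex set is r-convex for every r > 0, which is why r-convexity is a relaxation of convexity rather than a different kind of condition.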

If this is right

  • The estimator can reconstruct supports that are not convex but r-convex.
  • It achieves optimal convergence rates without assuming convexity.
  • The method applies to ecological data for mapping species extents from point observations.
  • The stochastic algorithm allows practical use by determining r automatically.
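
To make the first bullet concrete, here is a minimal Python sketch of an approximate membership test for the r-convex hull of a planar sample. It relies on the standard characterization that a point x lies outside the hull exactly when some open ball of radius r contains x and no sample point. The function name `in_r_convex_hull`, the polar grid of candidate centers, and the ring-shaped toy sample are all assumptions made for illustration; this is not the paper's algorithm and it does not estimate r.

```python
import numpy as np


def in_r_convex_hull(x, sample, r, n_radii=20, n_angles=72):
    """Approximate test of whether x lies in the r-convex hull of a 2D sample.

    x is outside the hull iff some open ball of radius r contains x and
    misses every sample point. Candidate centers are taken on a polar grid
    at distance < r from x, so the search is not exhaustive:
    False means an empty ball containing x was found (x is outside);
    True only means no empty ball was found among the candidates.
    """
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)

    # Centers c with |c - x| < r, so that x belongs to the open ball B_r(c).
    radii = np.linspace(0.0, r, n_radii, endpoint=False)
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    rho, theta = np.meshgrid(radii, angles)
    offsets = np.column_stack([(rho * np.cos(theta)).ravel(),
                               (rho * np.sin(theta)).ravel()])
    centers = x + offsets

    # Distance from every candidate center to every sample point.
    dists = np.linalg.norm(centers[:, None, :] - sample[None, :, :], axis=2)

    # An open ball B_r(c) misses the sample iff all distances are >= r.
    ball_is_empty = dists.min(axis=1) >= r
    return not bool(ball_is_empty.any())


# Toy sample (hypothetical): 200 points on the unit circle, a non-convex shape.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
ring = np.column_stack([np.cos(t), np.sin(t)])

inside_small_r = in_r_convex_hull([0.0, 0.0], ring, r=0.3)
inside_large_r = in_r_convex_hull([0.0, 0.0], ring, r=5.0)
```

On this toy ring the origin is excluded for r = 0.3, because the ball centered at the origin is empty, and it is not excluded for r = 5, where the hull behaves much like the convex hull. The contrast shows why the choice of r matters and why estimating it from the data is the practical bottleneck.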

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach may generalize to other geometric estimation problems where shape parameters need data-driven selection.
  • Similar algorithms could be developed for different shape constraints in set estimation.
  • Performance in high-dimensional data could be explored as an extension.

Load-bearing premise

The support set is r-convex for some unknown r, and the density allows the stochastic algorithm to produce an optimal estimate of r from the sample.

What would settle it

Observing that the data-driven estimator has slower convergence rates than the convex hull in simulations where the set is known to be r-convex would falsify the claim.
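
One way such a simulation could be summarized is by fitting an empirical rate exponent for each estimator: compute some error measure between the estimate and the true set at several sample sizes, then regress log(error) on log(n) and compare slopes. The sketch below shows only that last step. The helper name `empirical_rate_exponent` is hypothetical, and the error values are synthetic placeholders with a known exponent, not results from the paper.

```python
import numpy as np


def empirical_rate_exponent(sample_sizes, errors):
    """Least-squares slope of log(error) against log(n).

    If error behaves like C * n**(-a), the returned slope is close to -a.
    """
    log_n = np.log(np.asarray(sample_sizes, dtype=float))
    log_err = np.log(np.asarray(errors, dtype=float))
    slope, _intercept = np.polyfit(log_n, log_err, 1)
    return slope


# Synthetic placeholder data decaying exactly like n**(-0.5).
# These numbers are made up to exercise the function; they are NOT
# simulation results for either estimator.
n_values = np.array([100.0, 200.0, 400.0, 800.0, 1600.0])
placeholder_errors = 3.0 * n_values ** -0.5

fitted_slope = empirical_rate_exponent(n_values, placeholder_errors)
```

In an actual study the two fitted slopes would be compared with Monte Carlo uncertainty in mind, since the claim is asymptotic and finite-sample slopes can differ from the limiting rate.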

Figures

Figures reproduced from arXiv: 1907.08627 by A. Rodríguez-Casal, P. Saavedra-Nieves.

Figure 2: C_r(X_740) (red) and B_{r*}(x) for r* ≥ r (gray) such that B_{r*}(x) ∩ X_740 = ∅, taking r = 0.3 (left) and r = 5 (right). As mentioned in the Introduction, the problem of reconstructing an r-convex support S using a data-driven procedure could be easily solved if the parameter r is estimated from a random sample of points X_n taken in S. The first step is to determine precisely the optimal v…
Figure 5: EOO estimator C_{r̂_0}(X_740), where r̂_0 = 0.127. Figure 6: EOO estimator in 2015 (left); EOO estimator in 2016 (center); EOO estimators in 2015 (blue) and 2016 (gray) (right). 7. Conclusions and open problems. The main goal of this work is to propose a new data-driven method for reconstructing convex support in consist…
Original abstract

Given a random sample of points from some unknown distribution, we propose a new data-driven method for estimating its probability support S. Under the mild assumption that S is r-convex, the smallest r-convex set which contains the sample points is the natural estimator. The main problem for using this estimator in practice is that r is an unknown geometric characteristic of the set S. A stochastic algorithm is proposed for determining an optimal estimate of r from the data under mild regularity assumptions on the density function. The resulting data-driven reconstruction of S attains the same convergence rates as the convex hull for estimating convex sets, but under a much more flexible smoothness shape condition. The new support estimator will be used for reconstructing the extent of occurrence of an assemblage of invasive plant species in the Azores archipelago.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes a data-driven estimator for the support S of an unknown distribution, assuming S is r-convex. The estimator is the smallest r-convex set containing the sample points, with r estimated via a stochastic algorithm under mild density regularity conditions. The central claim is that this yields the same convergence rates as the convex hull estimator for convex sets, while allowing a more flexible shape constraint. An application to reconstructing extents of occurrence for invasive plants in the Azores is mentioned.

Significance. If the rate claim holds with a complete proof, the result would extend support estimation theory by relaxing convexity to r-convexity without rate loss, under standard assumptions. The ecological application provides a concrete use case, though its role in validating the rates is unclear from the given text.

major comments (1)
  1. [Abstract] The claim that the data-driven reconstruction 'attains the same convergence rates as the convex hull for estimating convex sets' is asserted without a derivation, proof sketch, simulation study, or reference to a theorem establishing the rate. This is the load-bearing contribution and cannot be verified from the supplied text.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the detailed review and the opportunity to respond. We address the single major comment below.

Point-by-point responses
  1. Referee: [Abstract] The claim that the data-driven reconstruction 'attains the same convergence rates as the convex hull for estimating convex sets' is asserted without a derivation, proof sketch, simulation study, or reference to a theorem establishing the rate. This is the load-bearing contribution and cannot be verified from the supplied text.

    Authors: The abstract summarizes the main theoretical contribution. The convergence rates are established rigorously in Theorem 4.1, which proves that the data-driven estimator for the r-convex support attains the same rate as the convex hull estimator (under the stated density regularity conditions). The complete proof appears in Section 4, which first establishes consistency of the stochastic r-estimator (Proposition 3.2) and then derives the support estimation rate by combining this with the geometric properties of r-convex sets. No simulation study is required because the result is asymptotic and theoretical; the ecological application is presented separately as an illustration. We are happy to add an explicit forward reference to Theorem 4.1 in the abstract.

    revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

Full rationale

The derivation constructs a data-driven estimator for the support S by first using a stochastic algorithm to recover the unknown r from the sample under stated density regularity conditions, then taking the smallest r-convex set containing the points. This produces the claimed convergence rates as a direct consequence of the algorithm and the r-convexity assumption, without any step that defines a quantity in terms of itself, renames a fitted input as a prediction, or reduces the central result to a self-citation chain. The argument remains self-contained against external benchmarks once the algorithm is granted to function as described.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on the geometric assumption of r-convexity and mild density regularity; these are domain assumptions rather than derived results.

axioms (2)
  • domain assumption: The unknown support S is r-convex for some r.
    This allows the estimator to be defined as the smallest r-convex set containing the sample points.
  • domain assumption: The density satisfies mild regularity assumptions.
    Required to justify that the stochastic algorithm can select an optimal r from the data.

pith-pipeline@v0.9.0 · 5664 in / 1143 out tokens · 26333 ms · 2026-05-24T18:48:11.390006+00:00 · methodology

