Active Measurement of Two-Point Correlations

Daniel Sheldon; Max Hamilton; Subhransu Maji

arxiv: 2604.05227 · v1 · submitted 2026-04-06 · 💻 cs.CV

Pith reviewed 2026-05-10 18:54 UTC · model grok-4.3

keywords: two-point correlation function · active sampling · human-in-the-loop · unbiased estimation · astronomy data · pair counts · adaptive annotation

The pith

A pre-trained classifier guides human annotations to estimate two-point correlations with lower variance than random sampling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an efficient way to measure how selected points cluster in space by estimating their two-point correlation function. It does so with a human-in-the-loop process that uses a pre-trained classifier to choose which points are worth labeling next. After each label, the system updates unbiased counts of pairs falling into different distance bins at once. This yields estimates with substantially less variance than simple random sampling while using far fewer annotations overall.

Core claim

By leveraging a pre-trained classifier to guide sampling, our approach adaptively selects the most informative points for human annotation. After each annotation, it produces unbiased estimates of pair counts across multiple distance bins simultaneously. Compared to simple Monte Carlo approaches, our method achieves substantially lower variance while significantly reducing annotation effort.

What carries the argument

Adaptive sampling strategy guided by a pre-trained classifier together with a novel unbiased estimator for multi-bin pair counts.
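
As a rough illustration of how an inverse-probability-weighted pair count can stay unbiased across several bins at once, the sketch below uses a Horvitz-Thompson style estimator. It is a simplified stand-in, not the paper's estimator: it assumes each point is sent for annotation independently with a known probability (Poisson sampling), so the inclusion probability of a pair is the product of the two point probabilities. The function name `ht_pair_counts` and all parameter names are hypothetical.

```python
import numpy as np

def ht_pair_counts(xy, labels, incl_prob, sampled, bin_edges):
    """Horvitz-Thompson estimate of target-target pair counts per distance bin.

    xy        : (n, 2) positions of all candidate points
    labels    : (n,) 0/1 human labels; only read where `sampled` is True
    incl_prob : (n,) known inclusion probabilities (assumed independent)
    sampled   : (n,) boolean mask of points that were annotated
    bin_edges : (B+1,) increasing distance bin edges
    """
    idx = np.flatnonzero(sampled)
    idx = idx[labels[idx] == 1]              # annotated points confirmed as targets
    n_bins = len(bin_edges) - 1
    if idx.size < 2:
        return np.zeros(n_bins)

    pts = xy[idx]
    w = 1.0 / incl_prob[idx]                 # per-point inverse-probability weights
    i, j = np.triu_indices(idx.size, k=1)    # each unordered pair exactly once
    dist = np.linalg.norm(pts[i] - pts[j], axis=1)
    pair_w = w[i] * w[j]                     # pair weight = 1 / (pi_i * pi_j)

    # One weighted histogram updates every distance bin at the same time.
    counts, _ = np.histogram(dist, bins=bin_edges, weights=pair_w)
    return counts

# Tiny example: three targets and one non-target, all annotated with probability 1.
xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
labels = np.array([1, 1, 1, 0])
edges = np.array([0.0, 1.2, 2.0])
exact = ht_pair_counts(xy, labels, np.ones(4), np.ones(4, dtype=bool), edges)
```

When every probability is 1 and every point is annotated, the function reduces to the exact pair count. A sequential, classifier-guided scheme such as the one the paper describes would need pair probabilities that account for the adaptive selection, which this sketch does not attempt.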

If this is right

Large astronomy catalogs become feasible to analyze for clustering properties with limited human labeling.
Unbiased estimates and statistically grounded confidence intervals can be produced for multiple distance scales from the same annotations.
Annotation budgets can be allocated more efficiently by focusing labels on high-information points.
The framework supports simultaneous estimation across bins rather than requiring separate sampling runs.
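
One common way to attach a confidence interval to an importance-weighted estimate is a normal approximation over the individual weighted draws. The sketch below does this for a generic total, with a fixed proposal and sampling with replacement. It is an illustration under those assumptions, not the paper's interval construction, which is not described on this page. The names `is_total_with_ci`, `f_values`, and `proposal` are hypothetical.

```python
import numpy as np

def is_total_with_ci(f_values, proposal, n_draws, rng, z=1.96):
    """Importance-sampling estimate of sum(f_values) with a normal-approximation CI.

    f_values : (n,) per-item contributions; in practice only the drawn entries
               would be revealed by annotation (here the array is simulated)
    proposal : (n,) sampling probabilities, positive wherever f_values != 0
    n_draws  : number of draws with replacement (the annotation budget)
    z        : normal quantile; 1.96 corresponds to a nominal 95% interval
    """
    idx = rng.choice(len(f_values), size=n_draws, replace=True, p=proposal)
    draws = f_values[idx] / proposal[idx]          # each draw is unbiased for the total
    est = draws.mean()
    half_width = z * draws.std(ddof=1) / np.sqrt(n_draws)
    return est, (est - half_width, est + half_width)

rng = np.random.default_rng(0)
f = np.array([0.0, 1.0, 0.0, 3.0])
q = np.array([0.1, 0.3, 0.1, 0.5])                 # proposal concentrated on large f
estimate, interval = is_total_with_ci(f, q, 200, rng)
```

With an adaptive proposal the draws are no longer independent, so the plain standard-error formula above would need to be replaced by an argument suited to sequential sampling.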

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same active-sampling idea could reduce labeling costs for other spatial statistics tasks where only a subset of objects carries the property of interest.
If classifier quality varies across datasets, the method remains unbiased but the reduction in variance would shrink, suggesting a need for classifier calibration checks.
Extending the approach to online settings where new points arrive continuously would test its utility beyond static catalogs.

Load-bearing premise

The pre-trained classifier provides sufficiently accurate guidance on which points are informative without introducing bias into the final pair-count estimates.

What would settle it

Applying the guided sampler to a dataset and observing that the variance of the resulting pair-count estimates is not lower than that of uniform Monte Carlo sampling at the same annotation budget would falsify the central performance claim.
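
A check of this kind can be prototyped on synthetic data before touching a real catalog. The simulation below is a hedged sketch, not the paper's experiment: it assumes independent (Poisson) sampling, a single distance bin, an invented score model, and a Horvitz-Thompson pair-count estimator. The guided design is given an expected labeling budget no larger than the uniform one, and all names and constants are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_pos, budget, reps = 400, 40, 80, 2000

# Synthetic catalog: positions, rare target labels, and noisy classifier scores.
xy = rng.random((n, 2))
y = np.zeros(n)
y[rng.choice(n, n_pos, replace=False)] = 1.0
scores = 0.1 + 0.8 * y + rng.uniform(-0.05, 0.05, n)   # assumed score model

# A[i, j] = 1 when i and j are both targets and closer than 0.2 (one bin).
d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
A = np.outer(y, y) * (d < 0.2)
np.fill_diagonal(A, 0.0)
true_count = A.sum() / 2.0

def simulate(probs):
    """Return `reps` Horvitz-Thompson pair-count estimates under Poisson sampling."""
    S = rng.random((reps, n)) < probs        # which points get annotated
    Z = S / probs                            # inverse-probability weights (0 if unsampled)
    return 0.5 * ((Z @ A) * Z).sum(axis=1)   # sum over pairs of z_i * z_j * A_ij

uniform_p = np.full(n, budget / n)
# Clipping at 1 can only lower the expected number of labels below `budget`.
guided_p = np.clip(scores * budget / scores.sum(), 1e-3, 1.0)

est_uniform = simulate(uniform_p)
est_guided = simulate(guided_p)
var_uniform = est_uniform.var(ddof=1)
var_guided = est_guided.var(ddof=1)
variance_ratio = var_uniform / var_guided    # > 1 favours the guided design
```

If `variance_ratio` were at or below 1 for a realistic score model, that would be the kind of evidence against the performance claim described above. Both designs remain unbiased, so their sample means should sit near `true_count`.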

Figures

Figures reproduced from arXiv: 2604.05227 by Daniel Sheldon, Max Hamilton, Subhransu Maji.

Figure 2: Relative error of the estimated counts for the two galaxies as a function of labeled vertices. The ...

Figure 3: Errors across bins with 20% labeled vertices.

Figure 4: Evaluation of estimated 95% confidence in ...

Original abstract

Two-point correlation functions (2PCF) are widely used to characterize how points cluster in space. In this work, we study the problem of measuring the 2PCF over a large set of points, restricted to a subset satisfying a property of interest. An example comes from astronomy, where scientists measure the 2PCF of star clusters, which make up only a tiny subset of possible sources within a galaxy. This task typically requires careful labeling of sources to construct catalogs, which is time-consuming. We present a human-in-the-loop framework for efficient estimation of 2PCF of target sources. By leveraging a pre-trained classifier to guide sampling, our approach adaptively selects the most informative points for human annotation. After each annotation, it produces unbiased estimates of pair counts across multiple distance bins simultaneously. Compared to simple Monte Carlo approaches, our method achieves substantially lower variance while significantly reducing annotation effort. We introduce a novel unbiased estimator, sampling strategy, and confidence interval construction that together enable scalable and statistically grounded measurement of two-point correlations in astronomy datasets.
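
For orientation, pair counts and the 2PCF are commonly related through the standard estimators below. The notation is an assumption introduced here, since the abstract does not fix symbols, and the abstract does not say which estimator the paper adopts.

```latex
% D: the target points (e.g., confirmed star clusters), N_D of them
% R: a random comparison catalog over the same footprint, N_R points
% d_{ij}: distance between points i and j; bin b covers [r_b, r_{b+1})

% Normalized pair counts in bin b:
\[
DD(b) = \frac{\#\{\, i<j \in D : r_b \le d_{ij} < r_{b+1} \,\}}{N_D (N_D - 1)/2},
\qquad
RR(b) = \frac{\#\{\, i<j \in R : r_b \le d_{ij} < r_{b+1} \,\}}{N_R (N_R - 1)/2},
\]
\[
DR(b) = \frac{\#\{\, i \in D,\ j \in R : r_b \le d_{ij} < r_{b+1} \,\}}{N_D N_R}.
\]

% Natural estimator:
\[
\hat{\xi}_{\mathrm{nat}}(b) = \frac{DD(b)}{RR(b)} - 1
\]

% Landy-Szalay estimator:
\[
\hat{\xi}_{\mathrm{LS}}(b) = \frac{DD(b) - 2\,DR(b) + RR(b)}{RR(b)}
\]
```

In this notation, the terms that depend on human labels are the ones involving D. The unnormalized count in the numerator of DD(b) is the kind of per-bin pair count the abstract says is estimated without bias.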

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper's core move is an active sampling scheme that uses a fixed pre-trained classifier to pick points for labeling while claiming an unbiased multi-bin 2PCF estimator with lower variance than plain Monte Carlo.

The colleague should know two things up front. First, the work targets a practical bottleneck in astronomy: estimating two-point correlations only among rare sources that need human labeling to confirm. Second, it tries to do this with adaptive sampling that still produces unbiased pair-count estimates across several distance bins at once, plus confidence intervals, while cutting both variance and total labels compared to uniform random sampling.

Referee Report

2 major / 1 minor

Summary. The paper presents a human-in-the-loop framework for estimating the two-point correlation function (2PCF) of rare target sources within large point sets. It uses a pre-trained classifier to adaptively select informative points for human annotation and introduces a novel estimator that produces unbiased pair-count estimates across multiple distance bins simultaneously, claiming substantially lower variance and reduced annotation effort relative to standard Monte Carlo sampling.

Significance. If the unbiasedness claim holds, the method offers a statistically rigorous way to reduce expensive human labeling while maintaining valid 2PCF estimates, which could be valuable in astronomy and similar domains where clustering statistics of sparse subpopulations must be measured from large catalogs. The simultaneous multi-bin estimation and confidence-interval construction are potentially useful extensions beyond single-bin Monte Carlo approaches.

major comments (2)

[Abstract] Abstract: the central claim that the estimator remains unbiased after each adaptive annotation step requires an explicit derivation showing that the importance weights exactly invert the classifier-induced selection probabilities (which are correlated with the target labels). Without this inversion being closed-form and accounting for sequential updates, the expectation of the estimator will generally deviate from the true 2PCF in one or more bins.
[Methods] The sampling strategy and estimator (described after the abstract) must demonstrate that the adaptive rule does not introduce dependence between the classifier scores and the final weighted pair counts that cannot be corrected; any assumption of independence between classifier outputs and underlying labels would invalidate unbiasedness for the target property.

minor comments (1)

[Abstract] Abstract: the statement 'substantially lower variance' should be supported by a specific quantitative comparison (e.g., variance ratio or effective sample size) or reference to a table/figure once the full experimental section is available.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback on the unbiasedness of our estimator. We agree that an explicit derivation strengthens the presentation and have revised the manuscript to include a detailed proof of the inverse-probability weighting under adaptive sampling. We address each major comment below.

Referee: [Abstract] Abstract: the central claim that the estimator remains unbiased after each adaptive annotation step requires an explicit derivation showing that the importance weights exactly invert the classifier-induced selection probabilities (which are correlated with the target labels). Without this inversion being closed-form and accounting for sequential updates, the expectation of the estimator will generally deviate from the true 2PCF in one or more bins.

Authors: We have added a self-contained derivation in the revised Section 3.2. The importance weight for each sampled point is the reciprocal of its selection probability, computed in closed form from the classifier score at the moment of selection. Because the score is observed before the label is revealed, the weight depends only on the (fixed) classifier output and exactly inverts the sampling distribution. We prove by induction over annotation steps that the conditional expectation of the weighted pair count equals the true count given all prior selections; taking the outer expectation yields unconditional unbiasedness for every bin simultaneously. The correlation between scores and labels is exploited for efficiency but is fully corrected by the weights, with no residual bias from sequential dependence.
Revision: yes

Referee: [Methods] The sampling strategy and estimator (described after the abstract) must demonstrate that the adaptive rule does not introduce dependence between the classifier scores and the final weighted pair counts that cannot be corrected; any assumption of independence between classifier outputs and underlying labels would invalidate unbiasedness for the target property.

Authors: The classifier is pre-trained and held fixed during sampling, so its outputs are determined before any label is revealed and cannot be altered by the annotations. The outputs may still be correlated with the underlying labels, and that correlation is what makes the guidance useful. The adaptive selection rule depends solely on these outputs, and the estimator applies inverse-probability weights that are functions of the same outputs. We have inserted a formal lemma in the Methods section showing that the weighted estimator is unbiased for the target 2PCF without requiring independence between scores and labels. The proof relies only on the law of total expectation and the fact that each weight is the exact inverse of the known selection probability conditional on the observed score. This corrects any dependence induced by the adaptive rule.
Revision: yes
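
The law-of-total-expectation argument invoked in these responses can be written out for a generic sequential importance-weighted estimator. This is a sketch under stated assumptions, not the proof from the revised manuscript, which is not reproduced on this page. It assumes sampling with replacement, a proposal at each step that is fully determined by the history and known to the analyst, and positive proposal probability on every index with a nonzero contribution. All symbols are introduced here for illustration.

```latex
% U: finite index set (points, or pairs of points within a distance bin)
% f(u): contribution of u to the target total, revealed by annotation
% H_{t-1}: everything observed before step t (scores, earlier draws and labels)
\[
F = \sum_{u \in U} f(u), \qquad
u_t \sim q_t(\cdot \mid H_{t-1}), \qquad
\hat{F}_t = \frac{f(u_t)}{q_t(u_t \mid H_{t-1})}
\]

% Conditional unbiasedness at each step:
\[
\mathbb{E}\big[\hat{F}_t \mid H_{t-1}\big]
= \sum_{u \in U} q_t(u \mid H_{t-1}) \, \frac{f(u)}{q_t(u \mid H_{t-1})}
= \sum_{u \in U} f(u) = F
\]

% Tower property, then averaging over steps:
\[
\mathbb{E}\big[\hat{F}_t\big]
= \mathbb{E}\big[\, \mathbb{E}[\hat{F}_t \mid H_{t-1}] \,\big] = F,
\qquad
\mathbb{E}\Big[\tfrac{1}{T} \sum_{t=1}^{T} \hat{F}_t\Big] = F
\]
```

Nothing in these steps requires the proposal to be independent of f. The proposal may follow classifier scores that are strongly correlated with the labels, as long as its probabilities are known and positive where needed.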

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The provided abstract and context introduce a novel unbiased estimator, adaptive sampling strategy, and confidence interval construction for two-point correlation functions under classifier-guided human annotation. No equations, self-citations, or derivation steps are visible that reduce the claimed unbiasedness or predictions to fitted inputs, self-definitions, or prior author results by construction. The method is presented as statistically independent of the target data via importance weighting or equivalent inversion of selection probabilities. This qualifies as a self-contained contribution with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard statistical assumptions for unbiased estimation under adaptive sampling and on the domain assumption that a pre-trained classifier can be used without corrupting the target distribution. No free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption: Adaptive sampling guided by a classifier yields unbiased pair-count estimates across distance bins.
Abstract states the estimator is unbiased but provides no derivation; this must be assumed to hold after each annotation update.
domain assumption: The pre-trained classifier's predictions correlate with the property of interest without systematic bias in the selected sample.
Central to the efficiency claim; if violated, variance reduction may not materialize or estimates may become biased.

