
arxiv: 2606.31125 · v1 · pith:MER7RZEAnew · submitted 2026-06-30 · 💻 cs.CV

WildProp: Visual Estimation of Wildlife Body Proportions at Scale

Pith reviewed 2026-07-01 05:56 UTC · model grok-4.3

classification 💻 cs.CV
keywords: wildlife morphometrics · body proportions · image retrieval · foundation models · ecological measurements · pose-aware matching · training-free estimation · population distributions

The pith

A single user-annotated image plus web-scale retrieval produces population-level body proportion distributions for arbitrary species with 10-20% median error.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that morphometric estimation can be reframed as a retrieval-driven correspondence task rather than a supervised keypoint detection problem. Given one canonical image with user-marked part endpoints, foundation model features retrieve pose-similar images, dense patch matching transfers the endpoints, geometric consistency filters outliers, and aggregation yields ratio distributions. This approach requires no per-species training data or controlled imaging, enabling scalable measurements from unconstrained web repositories for ecological and evolutionary analyses across taxa.
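The retrieval step in this reading can be sketched as a nearest-neighbour search over image embeddings. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `retrieve_pose_similar`, the choice of cosine similarity, and the random vectors standing in for foundation model features are all hypothetical.

```python
import numpy as np

def retrieve_pose_similar(query_feat, corpus_feats, k=3):
    """Return indices and cosine similarities of the k corpus vectors closest to the query."""
    q = query_feat / np.linalg.norm(query_feat)
    c = corpus_feats / np.linalg.norm(corpus_feats, axis=1, keepdims=True)
    sims = c @ q                      # cosine similarity of every corpus vector to the query
    top = np.argsort(-sims)[:k]       # indices sorted by descending similarity
    return top, sims[top]

# Random vectors stand in for image embeddings (assumption: any encoder could supply them).
rng = np.random.default_rng(0)
query = rng.normal(size=16)
corpus = rng.normal(size=(100, 16))
corpus[7] = query + 0.01 * rng.normal(size=16)   # plant one near-identical "pose"

top_idx, top_sims = retrieve_pose_similar(query, corpus, k=3)
```

In a real run the corpus vectors would come from a pretrained encoder applied to the image repository; which encoder the paper uses is not stated in this summary.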

Core claim

WildProp shows that pose-aware retrieval with foundation model features, followed by dense patch-level matching and geometric filtering, transfers user-defined part endpoints from a single canonical image to many retrieved images, allowing aggregation into stable population-level ratio distributions without species-specific models.
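One way to picture the endpoint transfer is a nearest-neighbour lookup on a grid of patch features: the patch under an annotated endpoint in the canonical image is matched to its most similar patch in a retrieved image. The sketch below assumes cosine similarity and an (H, W, D) patch grid; `transfer_endpoint` and `part_length` are hypothetical names, and the synthetic grids replace real foundation model features.

```python
import numpy as np

def transfer_endpoint(ref_patches, tgt_patches, ref_rc):
    """Map a reference patch (row, col) to the most similar target patch by cosine similarity."""
    h, w, d = tgt_patches.shape
    query = ref_patches[ref_rc]
    query = query / np.linalg.norm(query)
    flat = tgt_patches.reshape(h * w, d)
    flat = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    sims = flat @ query
    best = int(np.argmax(sims))
    return divmod(best, w), float(sims[best])   # ((row, col), similarity)

def part_length(p, q):
    """Euclidean distance between two (row, col) endpoints, in patch units."""
    return float(np.hypot(p[0] - q[0], p[1] - q[1]))

# Synthetic check: the target grid is the reference grid shifted by (1, 2) patches.
rng = np.random.default_rng(1)
ref = rng.normal(size=(8, 8, 32))
tgt = np.roll(ref, shift=(1, 2), axis=(0, 1))

end_a, score_a = transfer_endpoint(ref, tgt, (2, 3))
end_b, score_b = transfer_endpoint(ref, tgt, (6, 3))
length = part_length(end_a, end_b)
```

Because only ratios of part lengths are aggregated, the unknown image scale cancels as long as both parts are measured in the same image.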

What carries the argument

retrieval-driven correspondence: pose-aware retrieval using foundation model features, dense patch-level matching to transfer endpoints, and geometric consistency filtering before aggregation
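The summary does not say how geometric consistency is tested, so the following is only a stand-in: it assumes matched keypoints should agree with a single 2D similarity transform (scale, rotation, translation) and keeps the largest agreeing set. The function name `consistent_matches`, the two-point hypothesis search, and the 2-pixel tolerance are all assumptions.

```python
import numpy as np
from itertools import combinations

def consistent_matches(src, dst, tol=2.0):
    """Boolean mask of matches that agree with the best-supported 2D similarity transform.

    src, dst: (N, 2) arrays of matched (x, y) points. Points are treated as complex
    numbers so that a similarity transform is simply dst = a * src + b.
    """
    s = src[:, 0] + 1j * src[:, 1]
    d = dst[:, 0] + 1j * dst[:, 1]
    best = np.zeros(len(s), dtype=bool)
    for i, j in combinations(range(len(s)), 2):
        if s[i] == s[j]:
            continue                          # degenerate pair, no transform defined
        a = (d[i] - d[j]) / (s[i] - s[j])     # complex scale and rotation
        b = d[i] - a * s[i]                   # translation
        inliers = np.abs(a * s + b - d) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best

# Synthetic matches: eight points under a known transform, two of them corrupted.
rng = np.random.default_rng(2)
src = rng.uniform(0, 100, size=(8, 2))
theta, scale = 0.3, 1.5
R = scale * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
dst = src @ R.T + np.array([10.0, -5.0])
dst[[2, 5]] += np.array([[40.0, 0.0], [0.0, -60.0]])   # simulated bad matches

mask = consistent_matches(src, dst)
retained_fraction = float(mask.mean())
```

Trying every pair is quadratic in the number of matches, which is acceptable for a handful of keypoints per image; a sampling scheme would be the usual substitute for larger sets.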

If this is right

  • Population morphometrics become feasible from existing image repositories without physical specimen handling or per-species annotation campaigns.
  • Comparative analyses across geography, seasonality, or subgroups can use the same canonical image for multiple taxa.
  • The robust aggregation step reduces sensitivity to individual keypoint or pose errors in retrieved images.
  • The method extends directly to user-defined parts and new taxa by swapping only the canonical image.
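The robustness point in the list above can be illustrated with order statistics. The sketch assumes aggregation by median and quartiles, which matches how Figure 4 reports results but is not confirmed as the paper's exact estimator; the ratio values are invented for illustration.

```python
import numpy as np

def summarize_ratios(ratios):
    """Median and quartiles of per-image ratios (order statistics resist outliers)."""
    q1, med, q3 = np.percentile(np.asarray(ratios, dtype=float), [25, 50, 75])
    return {"q1": float(q1), "median": float(med), "q3": float(q3)}

# Hypothetical part-length ratios from seven retrieved images.
clean = np.array([0.30, 0.31, 0.29, 0.32, 0.30, 0.28, 0.31])
# The same sample with two simulated failed endpoint transfers appended.
noisy = np.append(clean, [1.50, 0.02])

clean_summary = summarize_ratios(clean)
noisy_summary = summarize_ratios(noisy)
noisy_mean = float(noisy.mean())
```

With these numbers the median stays at 0.30 after the two bad values are added, while the mean moves to about 0.40.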

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same retrieval pipeline could support longitudinal tracking of proportion changes in monitored populations if image timestamps are available.
  • Combining multiple canonical images per species might further reduce variance in the aggregated distributions.
  • If retrieval quality improves with future foundation models, error rates could drop below the current 10-20% range without changing the rest of the pipeline.

Load-bearing premise

Foundation model features produce sufficiently accurate pose matches and endpoint transfers across diverse species and unconstrained images without additional supervision or training.

What would settle it

Measure relative error on a held-out dataset of web images from a new taxon with varied poses; if median error consistently exceeds 25% after aggregation, the retrieval-and-transfer pipeline fails to deliver usable distributions.
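The test above reduces to a threshold check on median relative error. The sketch below assumes relative error is |estimate − ground truth| / ground truth, the form quoted in the simulated rebuttal further down; the function name `pipeline_holds` and the sample values are hypothetical. It evaluates one dataset, so "consistently" would mean repeating it across several held-out sets.

```python
import numpy as np

def pipeline_holds(estimated, ground_truth, threshold=0.25):
    """Return (passes, median_relative_error) for paired ratio estimates."""
    est = np.asarray(estimated, dtype=float)
    gt = np.asarray(ground_truth, dtype=float)
    rel_err = np.abs(est - gt) / gt
    med = float(np.median(rel_err))
    return med <= threshold, med

# Invented values: one set close to ground truth, one far from it.
ok_good, med_good = pipeline_holds([0.33, 0.27, 0.36, 0.30], [0.30, 0.30, 0.30, 0.30])
ok_bad, med_bad = pipeline_holds([0.45, 0.15, 0.42], [0.30, 0.30, 0.30])
```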

Figures

Figures reproduced from arXiv: 2606.31125 by Aaron Sun, Mustafa Chasmai, Subhransu Maji.

Figure 1
Figure 1: Estimating wildlife body proportions. Given a single annotated reference image and a set of specified body parts, WildProp estimates the population-level distribution of their length ratios from a large, uncurated image collection. We show exemplar species and body parts used in our experiments. All images are sourced from iNaturalist [29]; observer IDs are listed below each image. Abstract. Population-lev…
Figure 2
Figure 2: Morphometry paradigms. The choice between the measurement method is driven by a tradeoff between measurement costs and estimation errors. illustrating broad applicability across taxa and subgroup analyses across geography and seasonality. We conclude with a discussion of limitations and directions for future work. Code available at github:cvl-umass/wildprop. 2 Related Work Morphometrics. Measuring morph…
Figure 3
Figure 3: Method overview. The proposed approach consists of three key components: (1) pose-aware retrieval to identify images from a large collection (e.g., iNaturalist) where the desired measurements can be reliably estimated; (2) keypoint matching and geometric alignment to transfer part annotations; and (3) iterative refinement of keypoint appearance and correspondences to estimate population-level body proport…
Figure 4
Figure 4: Tarsus:Wing-length ratio medians and quartiles across AVONET species. Sample sizes for both WildProp estimations and physical measurements are indicated at the base of each bar. Shorebirds. Tab. 2 shows results on five species from the Alaska Science Center adult shorebird dataset [62]. Using a single reference image per species, we measure four parts: exposed culmen (an alternative measure of bill leng…
Figure 5
Figure 5: Pose availability. Relative error vs mean retrieval similarity for each species and pose. Pose availability & model performance. The availability of images in suitable poses is an important factor influencing WildProp performance and likely explains several trends observed in our experiments (e.g., Pacific Chorus Frog in the front-view setting, and Warbler and Blue Jay in the bottom-view setting). To in…
Figure 6
Figure 6: Prediction Visualizations. Sample queries and model predictions. We demonstrate the pose-similarity of retrieved images, and matched keypoints on them. from the query image alone results in a 12.2% AVONET relative error. However, WildProp also provides an estimate of the full distribution, capturing intra-species variation, and supporting cross-species queries (e.g. birds of prey in…
Figure 7
Figure 7: Ablations. We present errors for the Great Egret. (a-b): 10 runs mean ± std. the bill-to-wing-length proportion is more stable (as images with these parts clearly visible are abundant) and only begins improving near 100K images. Next, we investigate the influence of query keypoint noise (Fig. 7b). Small errors in the query keypoints are natural for user-provided annotations and may arise from ambiguity in pa…
Figure 8
Figure 8: Case studies. WildProp exhibits broad applicability across taxa and parts. Each panel here presents the estimated distribution of particular body proportions across a few species. Exemplar images and parts are shown for each case study. ture enthusiasts. Each case study focuses on a small set of species within the same taxonomic group and estimates body proportions that distinguish them. Additional case st…
Figure 9
Figure 9: Geographic and Temporal Stratification. WildProp allows measurement of specific sub-populations based on criteria such as geographic location or time-of-year. WildProp can identify geographical and temporal variation. The search corpus in WildProp can be modified to obtain various stratified distributions, and…
Original abstract

Population-level morphometric measurements underpin ecological and evolutionary studies but traditionally require controlled imaging or physical specimen handling, limiting scalability. We present WildProp, a training-free framework that estimates wildlife body proportion distributions directly from large-scale, unconstrained image repositories. We cast morphometric estimation as a retrieval-driven correspondence problem: given a single user-annotated canonical image, WildProp performs pose-aware retrieval using foundation model features, transfers part endpoints via dense patch-level matching, filters predictions using geometric consistency, and aggregates measurements across retrieved images to estimate population-level ratio distributions. Unlike supervised keypoint pipelines, our approach adapts to arbitrary species and user-defined parts without per-species training. Evaluations on three large morphometric datasets spanning birds and amphibians show median relative errors of 10-20%. We further highlight the broad applicability of our approach through a number of case studies measuring various proportions across diverse taxa, including birds, frogs, insects, and flowers. Ablations demonstrate that pose-aware retrieval is critical for stable estimation, while robust aggregation mitigates keypoint and pose noise. Our results indicate that carefully curated 2D correspondences over web-scale imagery can provide scalable morphometric proxies for comparative and subgroup analyses across taxa, geography, and seasonality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces WildProp, a training-free framework for estimating population-level wildlife body proportion distributions from unconstrained web images. It frames the problem as a retrieval-driven correspondence task where, given one user-annotated canonical image, it uses foundation model features for pose-aware retrieval, transfers part endpoints via dense patch matching, applies geometric consistency filtering, and aggregates to obtain ratio distributions. Evaluations on three morphometric datasets for birds and amphibians report median relative errors of 10-20%, with additional case studies on diverse taxa and ablations showing the importance of pose-aware retrieval and robust aggregation.

Significance. If the correspondence transfer via foundation model features proves reliable across taxa and unconstrained imagery, the method has substantial significance for ecology and evolutionary biology by enabling scalable morphometric analyses from existing image repositories without per-species training or controlled imaging. The training-free adaptability to arbitrary species and user-defined parts, combined with the reported error rates, positions it as a practical proxy for comparative studies across geography and seasonality.

major comments (3)
  1. [§4 (Experiments)] The reported median relative errors of 10-20% on three datasets are presented without details on dataset construction, error calculation methodology, baseline comparisons, or explicit handling of pose variation and image quality, making it impossible to verify whether the errors reflect genuine endpoint transfer or artifacts of filtering and aggregation.
  2. [§3 (Method)] The central claim hinges on dense patch-level matching successfully transferring the two endpoints of each user-defined part; however, no per-image matching success rates, endpoint localization error distributions, or quantitative assessment of correspondence accuracy are provided, leaving the assumption that foundation model features suffice for fine-grained morphometric correspondence across extreme viewpoint/occlusion variation untested.
  3. [§4.3 (Ablations)] While ablations confirm that pose-aware retrieval is critical for stable estimation, the geometric consistency filter's effect on the fraction of retained images and any resulting selection bias in the aggregated ratio distributions is not quantified, which directly affects interpretation of the 10-20% error figures.
minor comments (2)
  1. [Abstract] The abstract refers to 'three large morphometric datasets' without naming them or citing their sources.
  2. [Case studies] Case study figures would benefit from explicit visualization of transferred keypoints alongside the canonical annotations to illustrate matching quality.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important aspects of clarity and validation in our work. We address each major comment below and indicate revisions to the manuscript where we can strengthen the presentation without altering the core claims.

Point-by-point responses
  1. Referee: [§4 (Experiments)] The reported median relative errors of 10-20% on three datasets are presented without details on dataset construction, error calculation methodology, baseline comparisons, or explicit handling of pose variation and image quality, making it impossible to verify whether the errors reflect genuine endpoint transfer or artifacts of filtering and aggregation.

    Authors: We agree that expanded details in §4 would aid verification. In the revision we will add: (i) explicit descriptions of how the three morphometric datasets were assembled from existing sources, (ii) the precise relative-error formula used (median of |estimated ratio − ground-truth ratio| / ground-truth ratio), (iii) a short comparison against a non-pose-aware retrieval baseline, and (iv) a paragraph describing how the geometric-consistency filter and image-quality heuristics mitigate pose and quality variation. These additions will be placed in the main text and supplementary material. revision: yes

  2. Referee: [§3 (Method)] The central claim hinges on dense patch-level matching successfully transferring the two endpoints of each user-defined part; however, no per-image matching success rates, endpoint localization error distributions, or quantitative assessment of correspondence accuracy are provided, leaving the assumption that foundation model features suffice for fine-grained morphometric correspondence across extreme viewpoint/occlusion variation untested.

    Authors: We acknowledge the value of intermediate correspondence metrics. However, computing per-image success rates or endpoint localization errors at scale would require dense ground-truth keypoints on thousands of unconstrained web images—an annotation burden that contradicts the training-free premise of the method. We will add a limitations paragraph in §3 explaining this practical constraint and clarifying that population-level ratio error on the three annotated morphometric datasets serves as the primary quantitative validation. We will also report the fraction of images that survive the geometric filter as a proxy for overall matching reliability. revision: partial

  3. Referee: [§4.3 (Ablations)] While ablations confirm that pose-aware retrieval is critical for stable estimation, the geometric consistency filter's effect on the fraction of retained images and any resulting selection bias in the aggregated ratio distributions is not quantified, which directly affects interpretation of the 10-20% error figures.

    Authors: We agree that quantifying retention and potential bias is necessary for proper interpretation. In the revised §4.3 we will report: (i) the percentage of images retained after the geometric consistency filter on each dataset, (ii) a comparison of ratio distributions before and after filtering, and (iii) a brief analysis of whether retained images differ systematically in pose or quality from discarded ones. These statistics will be added to the ablation table and accompanying text. revision: yes
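The retention and before/after statistics promised in the third response could be computed along these lines. This is a sketch under assumptions: `filter_report` is a hypothetical helper, the keep mask would come from whatever geometric consistency filter the pipeline uses, and the ratio values are invented.

```python
import numpy as np

def filter_report(ratios, keep_mask):
    """Summarize how a filter changes the sample size and the ratio distribution."""
    ratios = np.asarray(ratios, dtype=float)
    keep = np.asarray(keep_mask, dtype=bool)
    kept = ratios[keep]

    def iqr(x):
        q75, q25 = np.percentile(x, [75, 25])
        return float(q75 - q25)

    return {
        "retained_fraction": float(keep.mean()),
        "median_before": float(np.median(ratios)),
        "median_after": float(np.median(kept)),
        "iqr_before": iqr(ratios),
        "iqr_after": iqr(kept),
    }

# Invented per-image ratios; the mask drops the two implausible values.
ratios = [0.30, 0.31, 0.29, 0.90, 0.32, 0.05, 0.30, 0.28]
keep = [True, True, True, False, True, False, True, True]
report = filter_report(ratios, keep)
```

A median that shifts noticeably after filtering would be a first hint of selection bias; comparing summary statistics alone does not rule it out, which is why the response also proposes checking whether retained and discarded images differ in pose or quality.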

Circularity Check

0 steps flagged

No circularity: retrieval pipeline with independent empirical validation

full rationale

The paper describes a training-free algorithmic pipeline (pose-aware retrieval via foundation features, dense patch matching, geometric filtering, robust aggregation) whose outputs are population-level ratio distributions measured on three separate morphometric datasets. Reported median relative errors (10-20%) are direct empirical comparisons against ground-truth annotations on those datasets, not quantities algebraically defined by the method's own equations or fitted parameters. No self-citations are invoked to justify core premises, no uniqueness theorems are imported, and no ansatz or renaming of known results occurs. The derivation chain consists of standard computer-vision steps whose correctness is tested externally rather than reduced to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are stated. Standard computer vision assumptions about feature transferability are implicit but not enumerated.

pith-pipeline@v0.9.1-grok · 5743 in / 1286 out tokens · 22296 ms · 2026-07-01T05:56:47.454769+00:00 · methodology


Reference graph

Works this paper leans on

82 extracted references · 10 canonical work pages · 5 internal anchors

  1. [1]

    Remote Sensing in Ecology and Conservation 11(4), 438–453 (2025)

    Bagchi, C., Medina, J., Irschick, D.J., Maji, S., Christiansen, F.: Automated extraction of right whale morphometric data from drone aerial photographs. Remote Sensing in Ecology and Conservation 11(4), 438–453 (2025)

  2. [2]

    Birds of the World (2020)

    Baker, A., Gonzalez, P., Morrison, R.I.G., Harrington, B.A.: Red knot systematics. Birds of the World (2020)

  3. [3]

    Functional ecology 30(12), 1894–1903 (2016)

    Bartomeus, I., Gravel, D., Tylianakis, J.M., Aizen, M.A., Dickie, I.A., Bernard-Verdier, M.: A common framework for identifying linkage rules across different types of interactions. Functional ecology 30(12), 1894–1903 (2016)

  4. [4]

    In: Proceedings of the IEEE/CVF international conference on computer vision

    Cao, J., Tang, H., Fang, H.S., Shen, X., Lu, C., Tai, Y.W.: Cross-domain adaptation for animal pose estimation. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 9498–9507 (2019)

  5. [5]

    In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

    Chen, X., Chu, F.J., Gleize, P., Liang, K.J., Sax, A., Tang, H., Wang, W., Guo, M., Hardin, T., Li, X., et al.: Sam 3d: 3dfy anything in images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 7220–7232 (2026)

  6. [6]

    In: European Conference on Computer Vision

    Cole, E., Wilber, K., Van Horn, G., Yang, X., Fornoni, M., Perona, P., Belongie, S., Howard, A., Aodha, O.M.: On label granularity and object localization. In: European Conference on Computer Vision. pp. 604–620. Springer (2022)

  7. [7]

    Wader Study 126(3), 228–235 (2019)

    Conklin, J.R., Verkuil, Y.I., Riegen, A.C., Battley, P.F.: How wry is a wrybill. Wader Study 126(3), 228–235 (2019)

  8. [8]

    Global Ecology and Conservation 22, e00959 (2020)

    Cooke, R., Rendall, A.R., Weston, M.A., Porch, N., Bradsworth, N., White, J.G.: Photography can determine the sex of a predator with limited sexual dimorphism: A case study of the powerful owl. Global Ecology and Conservation 22, e00959 (2020)

  9. [9]

    Nature Ecology & Evolution 6(5), 622–629 (2022)

    Cooney, C.R., He, Y., Varley, Z.K., Nouri, L.O., Moody, C.J., Jardine, M.D., Liker, A., Székely, T., Thomas, G.H.: Latitudinal gradients in avian colourfulness. Nature Ecology & Evolution 6(5), 622–629 (2022)

  10. [10]

    John Murray, London (1859)

    Darwin, C.: On the Origin of Species by Means of Natural Selection. John Murray, London (1859)

  11. [11]

    Journal of Field Ornithology 97(1) (2026)

    Delfino, H.C., Carlos, C.J.: Photogrammetry as a tool for sex identification and sex ratio estimation in chilean flamingos. Journal of Field Ornithology 97(1) (2026)

  12. [12]

    Birds of the World (1998)

    Elphic, C.S., Tibbitts, T.L.: Greater yellowlegs. Birds of the World (1998)

  13. [13]

    In: Forty-first international conference on machine learning (2024)

    Esser, P., Kulal, S., Blattmann, A., Entezari, R., Müller, J., Saini, H., Levi, Y., Lorenz, D., Sauer, A., Boesel, F., et al.: Scaling rectified flow transformers for high-resolution image synthesis. In: Forty-first international conference on machine learning (2024)

  14. [14]

    Sensors 24(24), 8122 (2024)

    Fergus, P., Chalmers, C., Matthews, N., Nixon, S., Burger, A., Hartley, O., Sutherland, C., Lambin, X., Longmore, S., Wich, S.: Towards context-rich automated biodiversity assessments: deriving ai-powered insights from camera trap data. Sensors 24(24), 8122 (2024)

  15. [15]

    Communications of the ACM 24(6), 381–395 (1981)

    Fischler, M.A., Bolles, R.C.: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM 24(6), 381–395 (1981)

  16. [16]

    Systematic biology 59(6), 619–633 (2010)

    FitzJohn, R.G.: Quantitative traits and diversification. Systematic biology 59(6), 619–633 (2010)

  17. [17]

    Biological Reviews 92(2), 1156–1173 (2017)

    Funk, J.L., Larson, J.E., Ames, G.M., Butterfield, B.J., Cavender-Bares, J., Firn, J., Laughlin, D.C., Sutton-Grier, A.E., Williams, L., Wright, J.: Revisiting the holy grail: using plant functional traits to understand ecological processes. Biological Reviews 92(2), 1156–1173 (2017)

  18. [18]

    Scientific Data 11(1), 694 (2024)

    Gao, S., Yu, W., Tian, T., Lu, Z., Zhang, X., Li, Q., Chen, Y.: A morphological traits dataset of heteroptera sampled in biodiversity priority areas of southwest china. Scientific Data 11(1), 694 (2024)

  19. [19]

    Evolutionary Applications 12(7), 1385–1401 (2019)

    Geladi, I., De León, L.F., Torchin, M.E., Hendry, A.P., González, R., Sharpe, D.M.: 100-year time series reveal little morphological change following impoundment and predator invasion in two neotropical characids. Evolutionary Applications 12(7), 1385–1401 (2019)

  20. [20]

    Evolutionary Ecology 39(5), 455–473 (2025)

    Gomez, M.L., Benham, P.M., Bowie, R.C.: Evidence for temporal morphological change in northern but not southern breeding crowned sparrows (zonotrichia spp.). Evolutionary Ecology 39(5), 455–473 (2025)

  21. [21]

    elife 8, e47994 (2019)

    Graving, J.M., Chae, D., Naik, H., Li, L., Koger, B., Costelloe, B.R., Couzin, I.D.: Deepposekit, a software toolkit for fast and robust animal pose estimation using deep learning. elife 8, e47994 (2019)

  22. [22]

    Iscience 25(8) (2022)

    Hantak, M.M., Guralnick, R.P., Zare, A., Stucky, B.J.: Computer vision for assessing species color pattern variation from web-based community science images. Iscience 25(8) (2022)

  23. [23]

    PLoS biology 12(4), e1001841 (2014)

    Harfoot, M.B., Newbold, T., Tittensor, D.P., Emmott, S., Hutton, J., Lyutsarev, V., Smith, M.J., Scharlemann, J.P., Purves, D.W.: Emergent global patterns of ecosystem structure and function from a mechanistic general ecosystem model. PLoS biology 12(4), e1001841 (2014)

  24. [24]

    PLOS Computational Biology 19(2), e1010933 (2023)

    He, Y., Cooney, C.R., Maddock, S., Thomas, G.H.: Using pose estimation to identify regions and points on natural history specimens. PLOS Computational Biology 19(2), e1010933 (2023)

  25. [25]

    Journal of Evolutionary Biology 38(8), 1152–1162 (2025)

    He, Y., Cooney, C.R., Maddock, S., Thomas, G.H.: Phenolearn: a user-friendly toolkit for image annotation and deep learning-based phenotyping for biological datasets. Journal of Evolutionary Biology 38(8), 1152–1162 (2025)

  26. [26]

    Biological Conservation 242, 108402 (2020)

    Hodgson, J.C., Holman, D., Terauds, A., Koh, L.P., Goldsworthy, S.D.: Rapid condition monitoring of an endangered marine vertebrate using precise, non-invasive morphometrics. Biological Conservation 242, 108402 (2020)

  27. [27]

    Honsey, A.E., Tingley III, R.W., Anweiler, K.V., Brant, C.O., Farha, S.A., Fedorowicz, P.W., Leonhardt, B.S., Eshenroder, R.L.: Demographic, morphometric, and meristic data describing cisco (coregonus artedi) captured in the spanish river, ontario, canada, 15–16 november 2022. U.S. Geological Survey data release (2024). https://doi.org/10.5066/P1N2WFEJ,...

  28. [28]

    SAM 3D Animal: Promptable Animal 3D Reconstruction from Images in the Wild

    Hu, X., Lyu, J., Liu, J., Liu, Y., Zuffi, S., An, L., Goetz, S.: Sam 3d animal: Promptable animal 3d reconstruction from images in the wild. arXiv preprint arXiv:2605.07604 (2026)

  29. [29]

    iNaturalist. https://www.inaturalist.org (2026), accessed on Jan 1, 2026

  30. [30]

    Johnson, N.A., Smith, C.H., Pfeiffer, J.P., Randklev, C.R., Williams, J.D., Austin, J.D.: Molecular and morphological data on two species complexes in the freshwater mussel genus cyclonaias. U.S. Geological Survey data release (2018). https://doi.org/10.5066/P9SRSHV2

  31. [31]

    Ecology 90(9), 2648–2648 (2009)

    Jones, K.E., Bielby, J., Cardillo, M., Fritz, S.A., O’Dell, J., Orme, C.D.L., Safi, K., Sechrest, W., Boakes, E.H., Carbone, C., et al.: Pantheria: a species-level database of life history, ecology, and geography of extant and recently extinct mammals: Ecological archives e090-184. Ecology 90(9), 2648–2648 (2009)

  32. [32]

    Global change biology 26(1), 119–188 (2020)

    Kattge, J., Bönisch, G., Díaz, S., Lavorel, S., Prentice, I.C., Leadley, P., Tautenhahn, S., Werner, G.D., Aakala, T., Abedi, M., et al.: Try plant trait database – enhanced coverage and open access. Global change biology 26(1), 119–188 (2020)

  33. [33]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Ke, B., Obukhov, A., Huang, S., Metzger, N., Daudt, R.C., Schindler, K.: Repurposing diffusion-based image generators for monocular depth estimation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 9492–9502 (2024)

  34. [34]

    Ecology and Evolution 7(10), 3494–3506 (2017)

    Kershaw, J.L., Sherrill, M., Davison, N.J., Brownlow, A., Hall, A.J.: Evaluating morphometric and metabolic markers of body condition in a small cetacean, the harbor porpoise (phocoena phocoena). Ecology and Evolution 7(10), 3494–3506 (2017)

  35. [35]

    Journal of Anatomy 243(5), 842–859 (2023)

    Kierdorf, U., Gomez, S., Stock, S.R., Antipova, O., Kierdorf, H.: Bone resorption and formation in the pedicles of european roe deer (capreolus capreolus) in relation to the antler cycle: a morphological and microanalytical study. Journal of Anatomy 243(5), 842–859 (2023)

  36. [36]

    Conservation Physiology 13(1), coaf073 (2025)

    Kristiansen, H.H., Metz, M., Silva-Garay, L., Jutfelt, F., Leeuwis, R.H.: Husmorph: a simple machine learning app for automated morphometric landmarking. Conservation Physiology 13(1), coaf073 (2025)

  37. [37]

    Available: https://labelbox.com (2025), online

    Labelbox. Available: https://labelbox.com (2025), online

  38. [38]

    University of Michigan - Deep Blue Data (2022). https://doi.org/10.7302/abtx-c461

    Larson, J.G., Weiner, A.: Linear morphological measurements of frogs. University of Michigan - Deep Blue Data (2022). https://doi.org/10.7302/abtx-c461

  39. [39]

    Journal of ecology 102(1), 186–193 (2014)

    Laughlin, D.C.: The intrinsic dimensionality of plant traits and its relevance to community assembly. Journal of ecology 102(1), 186–193 (2014)

  40. [40]

    Western Birds 56(1), 21–44 (2025)

    Lee, C.T., Birch, A.: Identification of willow and alder flycatchers by primary-tip spacing: The p6: 7 ratio. Western Birds 56(1), 21–44 (2025)

  41. [41]

    In: Proceedings of the 28th ACM International Conference on Multimedia

    Li, S., Li, J., Tang, H., Qian, R., Lin, W.: Atrw: A benchmark for amur tiger re-identification in the wild. In: Proceedings of the 28th ACM International Conference on Multimedia. pp. 2590–2598 (2020)

  42. [42]

    Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

    Liu, S., Zeng, Z., Ren, T., Li, F., Zhang, H., Yang, J., Li, C., Yang, J., Su, H., Zhu, J., et al.: Grounding dino: Marrying dino with grounded pre-training for open-set object detection. arXiv preprint arXiv:2303.05499 (2023)

  43. [43]

    In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition

    Lu, C., Koniusz, P.: Few-shot keypoint detection with uncertainty learning for unseen species. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 19416–19426 (2022)

  44. [44]

    Agriculture 14(2), 306 (2024)

    Ma, W., Qi, X., Sun, Y., Gao, R., Ding, L., Wang, R., Peng, C., Zhang, J., Wu, J., Xu, Z., et al.: Computer vision-based measurement techniques for livestock body dimension and weight: A review. Agriculture 14(2), 306 (2024)

  45. [45]

    Nature neuroscience 21(9), 1281–1289 (2018)

    Mathis, A., Mamidanna, P., Cury, K.M., Abe, T., Murthy, V.N., Mathis, M.W., Bethge, M.: Deeplabcut: markerless pose estimation of user-defined body parts with deep learning. Nature neuroscience 21(9), 1281–1289 (2018)

  46. [46]

    Ecology letters 13(9), 1085–1093 (2010)

    Mayfield, M.M., Levine, J.M.: Opposing effects of competitive exclusion on the phylogenetic structure of communities. Ecology letters 13(9), 1085–1093 (2010)

  47. [47]

    PloS one 9(4), e93802 (2014)

    Morfeld, K.A., Lehnhardt, J., Alligood, C., Bolling, J., Brown, J.L.: Development of a body condition scoring index for female african elephants validated by ultrasound measurements of subcutaneous fat. PloS one 9(4), e93802 (2014)

  48. [48]

    Ecology 96(11), 3109–3109 (2015)

    Myhrvold, N.P., Baldridge, E., Chan, B., Sivam, D., Freeman, D.L., Ernest, S.M.: An amniote life-history database to perform comparative analyses with birds, mammals, and reptiles: Ecological archives e096-269. Ecology 96(11), 3109–3109 (2015)

  49. [49]

    In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06)

    Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06). vol. 2, pp. 2161–2168. IEEE (2006)

  50. [50]

    DINOv2: Learning Robust Visual Features without Supervision

    Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec, M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F., El-Nouby, A., et al.: Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193 (2023)

  51. [51]

    PloS One 19(3), e0300253 (2024)

    Panda, A.K., Verma, V., Srivastav, A., Badola, R., Hussain, S.A.: Digital image processing: A new tool for morphological measurements of freshwater turtles under rehabilitation. PloS One 19(3), e0300253 (2024)

  52. [52]

    BMC zoology 7(1), 35 (2022)

    Pazvant, G., İnce, N.G., Özkan, E., Gündemir, O., Avanus, K., Szara, T.: Sex determination based on morphometric measurements in yellow-legged gulls (larus michahellis) around istanbul. BMC zoology 7(1), 35 (2022)

  53. [53]

    Animals 14(17), 2453 (2024)

    Peng, C., Cao, S., Li, S., Bai, T., Zhao, Z., Sun, W.: Automated measurement of cattle dimensions using improved keypoint detection combined with unilateral depth imaging. Animals 14(17), 2453 (2024)

  54. [54]

    Evolutionary Biology 51(2), 257–268 (2024)

    Phillips, J.G., Hagey, T.J., Hagemann, M., Gering, E.: Analysis of morphological change during a co-invading assemblage of lizards in the hawaiian islands. Evolutionary Biology 51(2), 257–268 (2024)

  55. [55]

    Piccinelli, L., Sakaridis, C., Yang, Y.H., Segu, M., Li, S., Abbeloos, W., Van Gool, L.: Unidepthv2: Universal monocular metric depth estimation made simpler. IEEE Transactions on Pattern Analysis and Machine Intelligence (2025)

  56. [56]

    Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al.: Learning transferable visual models from natural language supervision. In: International conference on machine learning. pp. 8748–8763. PMLR (2021)

  57. [57]

    Rao, J., Zhao, B.N., Wang, Y.: Probabilistic prompt distribution learning for animal pose estimation. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 29438–29447 (2025)

  58. [58]

    Ravi, N., Gabeur, V., Hu, Y.T., Hu, R., Ryali, C., Ma, T., Khedr, H., Rädle, R., Rolland, C., Gustafson, L., et al.: Sam 2: Segment anything in images and videos. arXiv preprint arXiv:2408.00714 (2024)

  59. [59]

    Ricklefs, R.E., Travis, J.: A morphological approach to the study of avian community organization. The Auk 97(2), 321–338 (1980)

  60. [60]

    Roberts, P.M.: Books and other resources on hawk identification, hawkwatching, and migration. Bird Observer 29(4), 311–317 (2001)

  61. [61]

    Russell, G., Christiansen, F., Colefax, A., Sprogis, K.R., Cagnazzi, D.: Comparisons of morphometrics and body condition between two breeding populations of australian humpback whales. Wildlife Research 51(1), WR23026 (2023)

  62. [62]

    Ruthrauff, D.R., Tibbitts, T.L., Gill, R.E., Jr., Handel, C.M.: Adult shorebird morphological measurement data (ver. 3.0, november 2024). U.S. Geological Survey data release (2023). https://doi.org/10.5066/P9KNRWXB

  63. [63]

    Schleuning, M., Fründ, J., García, D.: Predicting ecosystem functions from biodiversity and mutualistic networks: an extension of trait-based concepts to plant–animal interactions. Ecography 38(4), 380–392 (2015)

  64. [64]

    Silverman, B.W.: Density Estimation for Statistics and Data Analysis. CRC Press (1986)

  65. [65]

    Siméoni, O., Vo, H.V., Seitzer, M., Baldassarre, F., Oquab, M., Jose, C., Khalidov, V., Szafraniec, M., Yi, S., Ramamonjisoa, M., et al.: Dinov3. arXiv preprint arXiv:2508.10104 (2025)

  66. [66]

    Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos. In: Proceedings ninth IEEE international conference on computer vision. pp. 1470–1477. IEEE (2003)

  67. [67]

    Stevens, S., Wu, J., Thompson, M.J., Campolongo, E.G., Song, C.H., Carlyn, D.E., Dong, L., Dahdul, W.M., Stewart, C., Berger-Wolf, T., et al.: Bioclip: A vision foundation model for the tree of life. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 19412–19424 (2024)

  68. [68]

    Tobias, J.A., Sheard, C., Pigot, A.L., Devenish, A.J., Yang, J., Sayol, F., Neate-Clegg, M.H., Alioravainen, N., Weeks, T.L., Barber, R.A., et al.: Avonet: morphological, ecological and geographical data for all birds. Ecology letters 25(3), 581–597 (2022)

  69. [69]

    Tuia, D., Beery, S., Costelloe, B.R., Oliver, R.Y., Lecomte, N.: Towards ‘digital ecology’: Advances in integrating artificial intelligence from data generation to ecological insight (2026)

  70. [70]

    Van Horn, G., Mac Aodha, O., Song, Y., Cui, Y., Sun, C., Shepard, A., Adam, H., Perona, P., Belongie, S.: The inaturalist species classification and detection dataset. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 8769–8778 (2018)

  71. [71]

    Vendrow, E., Pantazis, O., Shepard, A., Brostow, G., Jones, K.E., Mac Aodha, O., Beery, S., Van Horn, G.: Inquire: A natural world text-to-image retrieval benchmark. Advances in Neural Information Processing Systems 37, 126500–126514 (2024)

  72. [72]

    Warnock, N., Gill, R.E.: Dunlin systematics. Birds of the World (2023)

  73. [73]

    Weeks, B.C., Zhou, Z., O’Brien, B.K., Darling, R., Dean, M., Dias, T., Hassena, G., Zhang, M., Fouhey, D.F.: A deep neural network for high-throughput measurement of functional traits on museum skeletal specimens. Methods in Ecology and Evolution 14(2), 347–359 (2023)

  74. [74]

    Williams, H.M., Wilcox, S.B., Patterson, A.J.: Photography as a tool for avian morphometric measurements. Journal of Ornithology 161(1), 333–339 (2020)

  75. [75]

    Wilman, H., Belmaker, J., Simpson, J., De La Rosa, C., Rivadeneira, M.M., Jetz, W.: Eltontraits 1.0: Species-level foraging attributes of the world’s birds and mammals: Ecological archives e095-178. Ecology 95(7), 2027–2027 (2014)

  76. [76]

    Xiang, J., Chen, X., Xu, S., Wang, R., Lv, Z., Deng, Y., Zhu, H., Dong, Y., Zhao, H., Yuan, N.J., et al.: Native and compact structured latents for 3d generation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 14419–14429 (2026)

  77. [77]

    Xiang, J., Lv, Z., Xu, S., Deng, Y., Wang, R., Zhang, B., Chen, D., Tong, X., Yang, J.: Structured 3d latents for scalable and versatile 3d generation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 21469–21480 (2025)

  78. [78]

    Xiong, T., Tan, D., Tian, W.: Diffpose-animal: A language-conditioned diffusion framework for animal pose estimation. arXiv preprint arXiv:2508.08783 (2025)

  79. [79]

    Yang, L., Kang, B., Huang, Z., Xu, X., Feng, J., Zhao, H.: Depth anything: Unleashing the power of large-scale unlabeled data. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. pp. 10371–10381 (2024)

  80. [80]

    Youngflesh, C., Saracco, J.F., Siegel, R.B., Tingley, M.W.: Abiotic conditions shape spatial and temporal morphological variation in north american birds. Nature Ecology & Evolution 6(12), 1860–1870 (2022)
