CAFOSat: A Strongly Annotated Dataset for Infrastructure-Aware CAFO Mapping Using High-Resolution Imagery
Pith reviewed 2026-06-28 18:47 UTC · model grok-4.3
The pith
Refined annotations from a human-in-the-loop pipeline improve CAFO classification accuracy and generalization across distribution shifts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CAFOSat integrates high-resolution NAIP imagery with multi-source inventories and applies a human-in-the-loop pipeline of AI-assisted annotation, GradCAM-based localization, geometric clustering, and manual verification to convert weak geolocation records into refined, infrastructure-level annotations. The dataset supplies over 45,000 patches across 20 states and four CAFO categories together with curated negative samples obtained through land-cover-guided sampling. Benchmarking across convolutional, transformer, and vision-language models establishes that the refined annotations and negative samples raise classification accuracy and improve generalization under distribution shifts, while a
What carries the argument
The human-in-the-loop pipeline that refines weak geolocation records into infrastructure-level annotations through AI-assisted annotation, GradCAM localization, geometric clustering, and manual verification.
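To make the localization step concrete, here is a minimal sketch of how a class-activation heatmap could be reduced to a candidate bounding box before manual verification. It is not the authors' implementation: the function name `cam_to_bbox`, the 0.5 threshold, and the toy heatmap are all assumptions for illustration, and the geometric clustering step is not shown.

```python
def cam_to_bbox(cam, threshold=0.5):
    """Bounding box (row_min, col_min, row_max, col_max) of the cells whose
    min-max-normalized activation is >= threshold; None for a flat map.
    `threshold` is an illustrative assumption, not a value from the paper."""
    values = [v for row in cam for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return None  # no contrast, so nothing to localize
    hot = [
        (r, c)
        for r, row in enumerate(cam)
        for c, v in enumerate(row)
        if (v - lo) / (hi - lo) >= threshold
    ]
    rows = [r for r, _ in hot]
    cols = [c for _, c in hot]
    return min(rows), min(cols), max(rows), max(cols)


# Toy 8x8 heatmap (hypothetical): one strong blob plus weak background noise.
heatmap = [[0.0] * 8 for _ in range(8)]
for r in range(2, 5):
    for c in range(3, 7):
        heatmap[r][c] = 1.0
heatmap[0][0] = 0.2  # below threshold, so it is ignored

bbox = cam_to_bbox(heatmap)
```

A box produced this way would only be a proposal; in the pipeline described above, a human reviewer would still accept, adjust, or reject it.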
If this is right
- Models achieve higher accuracy when trained on the refined infrastructure-aware labels rather than raw location data.
- Performance remains stronger when test data come from states or imaging conditions different from the training set.
- Infrastructure-level labels allow mapping of specific CAFO components such as barns and manure ponds in addition to overall site detection.
- Synthetic augmentation increases training diversity and supports robustness under distribution shifts.
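The cross-state implication can be probed with a held-out-state split. The sketch below assumes each patch record carries a `state` field; the record layout, the state codes, and the helper names are hypothetical and are not taken from the dataset's release format.

```python
def split_by_state(records, held_out_states):
    """Put every patch from the held-out states in the test set so that
    test-time geography is unseen during training."""
    held = set(held_out_states)
    train = [r for r in records if r["state"] not in held]
    test = [r for r in records if r["state"] in held]
    return train, test


def generalization_gap(in_distribution_acc, held_out_acc):
    """Positive values mean accuracy drops on unseen states."""
    return in_distribution_acc - held_out_acc


# Hypothetical patch records for illustration only.
records = [
    {"patch_id": 1, "state": "IA", "label": "swine"},
    {"patch_id": 2, "state": "IA", "label": "negative"},
    {"patch_id": 3, "state": "MN", "label": "dairy"},
    {"patch_id": 4, "state": "IN", "label": "poultry"},
]
train, test = split_by_state(records, held_out_states=["MN"])
```

Splitting by state rather than by random patch avoids leaking near-duplicate scenes from the same region into the test set.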
Where Pith is reading between the lines
- The refinement pipeline could be tested on mapping other variable-layout facilities such as warehouses or solar farms.
- The precise feature annotations may support direct linkage of CAFO infrastructure to environmental or public-health records.
- Applying the same curation steps to multi-temporal imagery could reveal changes in CAFO layout over time.
Load-bearing premise
The multi-source inventories and human-in-the-loop refinement process produce sufficiently accurate and consistent infrastructure-level labels without introducing systematic errors or biases.
What would settle it
If models trained on the refined annotations show no measurable gain in accuracy or generalization compared with models trained on the original weak location records, the claimed benefit of the pipeline would not hold.
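One way to make "no measurable gain" testable is a paired comparison on a shared test set. The sketch below uses an exact McNemar test, which is a swapped-in standard technique and not something the paper reports; the variable names and toy predictions are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_refined, pred_weak):
    """Exact two-sided McNemar test on paired predictions.
    b = cases only the refined-label model gets right,
    c = cases only the weak-label model gets right."""
    triples = list(zip(y_true, pred_refined, pred_weak))
    b = sum(1 for t, r, w in triples if r == t and w != t)
    c = sum(1 for t, r, w in triples if r != t and w == t)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the models never disagree on correctness
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy example (hypothetical): the refined-label model is right on all 10
# patches, the weak-label model on only 2 of them.
y_true = [1] * 10
pred_refined = [1] * 10
pred_weak = [0] * 8 + [1] * 2
b, c, p_value = mcnemar_exact(y_true, pred_refined, pred_weak)
```

A large p-value on a realistically sized test set would be the kind of evidence that undercuts the claimed benefit; a small one supports it only if the test split is independent of the refinement process.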
Original abstract
Concentrated Animal Feeding Operations (CAFOs) play an important role in agricultural production but are also associated with environmental, public health, and disease surveillance concerns. Large-scale mapping of CAFOs from remote sensing imagery remains challenging due to heterogeneous infrastructure layouts, noisy location records, inconsistent annotations, and incomplete inventories. We introduce CAFOSat, a strongly annotated, infrastructure-aware dataset for CAFO mapping across the United States. CAFOSat integrates high-resolution National Agriculture Imagery Program (NAIP) imagery with multi-source CAFO inventories collected across multiple states and transforms weak geolocation records into refined annotations through a human-in-the-loop pipeline combining AI-assisted annotation, GradCAM-based localization, and geometric clustering. To improve dataset quality, we curate challenging negative samples using land-cover-guided sampling with spatial exclusion constraints and provide infrastructure-level annotations, including barns, manure ponds, and grazing-related features, through manual verification. The resulting dataset contains more than 45,000 image patches spanning 20 states and four major CAFO categories. We benchmark a diverse set of convolutional, transformer-based, and vision-language models, demonstrating the value of refined annotations and curated negative samples for CAFO classification and generalization. In addition, we introduce a synthetic augmentation pipeline that generates infrastructure-aware variations to increase training diversity and improve robustness under distribution shifts. CAFOSat provides a large-scale benchmark for advancing infrastructure-aware agricultural monitoring and CAFO mapping from high-resolution remote sensing imagery.
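The abstract's "land-cover-guided sampling with spatial exclusion constraints" can be pictured as a two-part filter on candidate locations. The following is a sketch under stated assumptions: the 1 km exclusion radius, the land-cover class names, and the record layout are hypothetical, and the paper's actual criteria may differ.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def select_negatives(candidates, known_cafos, allowed_classes, min_km=1.0):
    """Keep candidates whose land-cover class is allowed and that lie at
    least `min_km` from every known CAFO location (assumed radius)."""
    return [
        c
        for c in candidates
        if c["land_cover"] in allowed_classes
        and all(
            haversine_km(c["lat"], c["lon"], lat, lon) >= min_km
            for lat, lon in known_cafos
        )
    ]


# Hypothetical inputs for illustration only.
known_cafos = [(42.0, -93.0)]
candidates = [
    {"lat": 42.001, "lon": -93.0, "land_cover": "pasture"},   # too close
    {"lat": 42.5, "lon": -93.0, "land_cover": "pasture"},     # kept
    {"lat": 43.0, "lon": -93.0, "land_cover": "open_water"},  # wrong class
]
negatives = select_negatives(candidates, known_cafos, {"pasture", "cropland"})
```

Restricting negatives to land-cover classes that resemble CAFO surroundings is what makes them challenging; the distance check guards against sampling an unlisted part of a known facility.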
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces CAFOSat, a strongly annotated dataset of more than 45,000 NAIP image patches spanning 20 US states and four CAFO categories. It describes a human-in-the-loop pipeline that refines weak multi-source geolocation records into infrastructure-level annotations (barns, manure ponds, grazing features) via AI-assisted annotation, GradCAM localization, geometric clustering, and manual verification; curates challenging negative samples via land-cover-guided sampling; benchmarks convolutional, transformer, and vision-language models; and introduces a synthetic augmentation pipeline to improve robustness under distribution shifts. The central claim is that the refined annotations and curated negatives demonstrably improve CAFO classification accuracy and generalization.
Significance. If the annotation quality is validated and the benchmarks show clear, reproducible gains from the refinement steps and negatives, the dataset would constitute a useful large-scale benchmark for infrastructure-aware CAFO mapping in remote sensing, directly addressing challenges of heterogeneous layouts, noisy inventories, and distribution shifts.
major comments (2)
- [Abstract / §4] Abstract and §4 (Benchmarking): the claim that 'refined annotations and curated negative samples' improve accuracy and generalization is presented without any reported quantitative metrics, ablation tables, error bars, or statistical comparisons. No accuracy/F1 scores, generalization gaps, or baseline contrasts appear in the abstract, making it impossible to evaluate the central empirical claim.
- [§3] §3 (Human-in-the-loop pipeline): no quantitative validation of label quality is provided (e.g., inter-annotator agreement, precision against independent ground-truth inventories, or error rates introduced by GradCAM/geometric clustering steps). This is load-bearing for the assumption that the pipeline yields sufficiently accurate and unbiased infrastructure-level labels.
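As an illustration of the label-quality check the second major comment asks for, the sketch below scores pipeline boxes against an independent reference set using intersection-over-union matching. It assumes both sources provide axis-aligned boxes, which may not hold for point-based inventories; the 0.5 IoU cutoff and all names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of boxes given as (x_min, y_min, x_max, y_max)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_precision(pipeline_boxes, reference_boxes, min_iou=0.5):
    """Fraction of pipeline boxes that greedily match a distinct reference
    box with IoU >= min_iou (an assumed cutoff)."""
    unmatched = list(reference_boxes)
    hits = 0
    for box in pipeline_boxes:
        best = max(unmatched, key=lambda ref: iou(box, ref), default=None)
        if best is not None and iou(box, best) >= min_iou:
            hits += 1
            unmatched.remove(best)  # each reference box can match only once
    return hits / len(pipeline_boxes) if pipeline_boxes else 0.0


# Toy boxes (hypothetical): one good match, one box with no counterpart.
pipeline_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
reference_boxes = [(0, 0, 10, 8), (50, 50, 60, 60)]
precision = label_precision(pipeline_boxes, reference_boxes)
```

Greedy matching is the simplest choice here; an optimal assignment would be a reasonable alternative when boxes are densely packed.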
minor comments (2)
- [Abstract] Abstract: the exact total number of patches and per-category/state breakdown should be stated rather than 'more than 45,000'.
- [§2] §2: clarify the precise definitions and visual criteria used for the four major CAFO categories.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and commit to revisions that directly strengthen the empirical support for our claims.
- Referee: [Abstract / §4] Abstract and §4 (Benchmarking): the claim that 'refined annotations and curated negative samples' improve accuracy and generalization is presented without any reported quantitative metrics, ablation tables, error bars, or statistical comparisons. No accuracy/F1 scores, generalization gaps, or baseline contrasts appear in the abstract, making it impossible to evaluate the central empirical claim.
  Authors: We agree that the abstract contains no quantitative results and that §4 lacks explicit ablation studies isolating the contribution of the refinement pipeline and negative curation (with error bars or statistical tests). While model benchmarks are reported, they do not directly contrast performance before and after these steps. We will add the requested ablation tables, report F1 scores, generalization gaps, and statistical comparisons, and revise the abstract to include the key metrics.
  Revision: yes
- Referee: [§3] §3 (Human-in-the-loop pipeline): no quantitative validation of label quality is provided (e.g., inter-annotator agreement, precision against independent ground-truth inventories, or error rates introduced by GradCAM/geometric clustering steps). This is load-bearing for the assumption that the pipeline yields sufficiently accurate and unbiased infrastructure-level labels.
  Authors: We agree that quantitative validation of annotation quality is necessary to support the pipeline's reliability. The manuscript describes the human-in-the-loop steps and manual verification but reports no inter-annotator agreement, precision metrics, or error analysis for the automated components. We will add these validations, including agreement statistics on a held-out subset and comparisons against available external inventories where feasible.
  Revision: yes
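The agreement statistics mentioned in the rebuttal could take the form of Cohen's kappa between two annotators on a held-out subset. This is a minimal sketch of that standard statistic; the choice of kappa, the label names, and the toy annotations are assumptions, since the rebuttal does not say which measure will be used.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two annotators corrected for the
    agreement expected by chance from their label frequencies."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four infrastructure features.
annotator_1 = ["barn", "barn", "manure_pond", "manure_pond"]
annotator_2 = ["barn", "barn", "manure_pond", "barn"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Raw percent agreement (0.75 in this toy case) overstates reliability when one class dominates, which is why a chance-corrected measure is the more informative number to report.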
Circularity Check
No significant circularity; empirical dataset construction with independent benchmarking
The paper describes dataset creation via multi-source inventories, a human-in-the-loop annotation pipeline (AI-assisted annotation, GradCAM localization, geometric clustering, manual verification), negative sample curation, and standard benchmarking of convolutional, transformer, and vision-language models on the resulting data. No equations, fitted parameters, predictions derived from prior fits, uniqueness theorems, or self-citation chains appear in the provided abstract or description. The central claim rests on empirical performance improvements from refined labels, which are externally falsifiable via the released dataset and benchmarks and do not reduce to self-definition or to prior author results by construction. This is a standard data/benchmark paper with no load-bearing circular steps.
Supplementary Material
A. CAFOSat Overview. An overview of the data processing pipeline is provided in Figure 2. The...