CAFOSat: A Strongly Annotated Dataset for Infrastructure-Aware CAFO Mapping Using High-Resolution Imagery

Abhijin Adiga; Madhav Marathe; Mandy L Wilson; Nibir Chandra Mandal; Oishee Bintey Hoque; Samarth Swarup

arXiv: 2606.00548 · v1 · pith:RQCIPCZXnew · submitted 2026-05-30 · 💻 cs.CV · cs.AI · cs.LG

Pith reviewed 2026-06-28 18:47 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.LG

keywords CAFO mapping · remote sensing dataset · infrastructure annotation · human-in-the-loop · satellite imagery · deep learning benchmarks · agricultural monitoring · negative sample curation

The pith

Refined annotations from a human-in-the-loop pipeline improve CAFO classification accuracy and generalization across distribution shifts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CAFOSat, a dataset of more than 45,000 image patches spanning 20 states that turns noisy CAFO location records into detailed infrastructure annotations. A human-in-the-loop process combines AI-assisted labeling, GradCAM localization, geometric clustering, and manual verification to produce labels for barns, manure ponds, and grazing features while also curating challenging negative samples. Benchmark tests on convolutional, transformer, and vision-language models show that these refinements raise classification performance and help models handle shifts between regions. The work also adds a synthetic augmentation step to generate infrastructure variations that increase training diversity.

Core claim

CAFOSat integrates high-resolution NAIP imagery with multi-source inventories and applies a human-in-the-loop pipeline of AI-assisted annotation, GradCAM-based localization, geometric clustering, and manual verification to convert weak geolocation records into refined, infrastructure-level annotations. The dataset supplies over 45,000 patches across 20 states and four CAFO categories together with curated negative samples obtained through land-cover-guided sampling. Benchmarking across convolutional, transformer, and vision-language models establishes that the refined annotations and negative samples raise classification accuracy and improve generalization under distribution shifts, while a …

What carries the argument

The human-in-the-loop pipeline that refines weak geolocation records into infrastructure-level annotations through AI-assisted annotation, GradCAM localization, geometric clustering, and manual verification.

If this is right

Models achieve higher accuracy when trained on the refined infrastructure-aware labels rather than raw location data.
Performance remains stronger when test data come from states or imaging conditions different from the training set.
Infrastructure-level labels allow mapping of specific CAFO components such as barns and manure ponds in addition to overall site detection.
Synthetic augmentation increases training diversity and supports robustness under distribution shifts.
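
The second consequence above concerns performance on states outside the training set. One common way to quantify this, sketched here under assumptions that do not come from the paper, is to report the difference between F1 on training-distribution states and F1 on held-out states. The record layout, the state codes, and the function names are hypothetical, and the toy predictions are made up for illustration.

```python
from collections import namedtuple

# Hypothetical record: state code, true label (1 = CAFO), predicted label.
Pred = namedtuple("Pred", ["state", "y_true", "y_pred"])

def f1_score(records):
    """Binary F1 for the positive (CAFO) class; 0.0 when undefined."""
    tp = sum(1 for r in records if r.y_true == 1 and r.y_pred == 1)
    fp = sum(1 for r in records if r.y_true == 0 and r.y_pred == 1)
    fn = sum(1 for r in records if r.y_true == 1 and r.y_pred == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def generalization_gap(records, train_states):
    """F1 on states seen in training minus F1 on held-out states."""
    seen = [r for r in records if r.state in train_states]
    unseen = [r for r in records if r.state not in train_states]
    return f1_score(seen) - f1_score(unseen)

# Toy predictions (illustrative only, not results from the paper).
toy = [
    Pred("IA", 1, 1), Pred("IA", 0, 0), Pred("IA", 1, 1), Pred("IA", 0, 1),
    Pred("NM", 1, 0), Pred("NM", 1, 1), Pred("NM", 0, 0), Pred("NM", 0, 1),
]
gap = generalization_gap(toy, train_states={"IA"})
```

A smaller gap for a model trained on refined labels than for one trained on raw location records would be the kind of evidence this point predicts.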

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The refinement pipeline could be tested on mapping other variable-layout facilities such as warehouses or solar farms.
The precise feature annotations may support direct linkage of CAFO infrastructure to environmental or public-health records.
Applying the same curation steps to multi-temporal imagery could reveal changes in CAFO layout over time.

Load-bearing premise

The multi-source inventories and human-in-the-loop refinement process produce sufficiently accurate and consistent infrastructure-level labels without introducing systematic errors or biases.

What would settle it

If models trained on the refined annotations show no measurable gain in accuracy or generalization compared with models trained on the original weak location records, the claimed benefit of the pipeline would not hold.
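
Whether a gain is "measurable" depends on how it is tested. A minimal sketch of one option, a paired bootstrap over per-patch correctness, follows. This is an assumed evaluation procedure, not one the paper is known to use; the function name, the number of resamples, and the toy inputs are all hypothetical.

```python
import random

def paired_bootstrap_gain(correct_refined, correct_weak, n_boot=2000, seed=0):
    """Return (mean accuracy gain, fraction of resamples with gain <= 0).

    Each input is a list of 0/1 flags, one per test patch, marking whether
    the model trained on that label set classified the patch correctly.
    Both lists must refer to the same patches in the same order.
    """
    assert len(correct_refined) == len(correct_weak) and correct_refined
    n = len(correct_refined)
    diffs = [a - b for a, b in zip(correct_refined, correct_weak)]
    observed = sum(diffs) / n
    rng = random.Random(seed)
    non_positive = 0
    for _ in range(n_boot):
        # Resample patches with replacement, keeping the pairing intact.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        if sum(sample) <= 0:
            non_positive += 1
    return observed, non_positive / n_boot

# Toy flags (illustrative only): 100 patches, 20 fixed by the refined labels.
refined_flags = [1] * 90 + [0] * 10
weak_flags = [1] * 70 + [0] * 30
gain, frac_no_gain = paired_bootstrap_gain(refined_flags, weak_flags)
```

If the fraction of resamples showing no gain stays large on real held-out data, the pipeline's claimed benefit would be in doubt.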

Figures

Figures reproduced from arXiv: 2606.00548 by Abhijin Adiga, Madhav Marathe, Mandy L Wilson, Nibir Chandra Mandal, Oishee Bintey Hoque, Samarth Swarup.

**Figure 1.** Overview of CAFOSat. (Top left) Raw survey coordinates (red pins) serve as initial patch centers; patches with no visible CAFO infrastructure are excluded (×). (Top center) The refinement pipeline produces strongly annotated coordinates (green pins) centered on confirmed CAFO infrastructure. (Top right) State-wise CAFO location counts by livestock type and patch counts across Total, Verified, and Verified …

**Figure 2.** Data processing pipeline for CAFOSat. (Top) An AI-Annotator trained on verified patches is used to pre-filter CAFO candidates for manual annotation. (Middle) Weakly annotated locations are refined via five overlapping patch extractions, GradCAM activation mapping, contour extraction, and single-linkage clustering to produce spatially accurate coordinates; 4,513 samples are then manually verified with inf…
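
The coordinate-refinement step named in the Figure 2 caption can be illustrated with a small sketch. It assumes an activation map is already available as a 2D grid of values in [0, 1], uses 4-neighbour connected components as a stand-in for contour extraction, and picks the cluster with the largest total area. The threshold, the distance cutoff, the selection rule, and all function names are assumptions for illustration, not the authors' implementation.

```python
from collections import deque

def components(act, thresh):
    """Regions of 4-connected cells with activation >= thresh.

    Stand-in for contour extraction. Returns (row, col, area) per region,
    where (row, col) is the region centroid.
    """
    h, w = len(act), len(act[0])
    seen = [[False] * w for _ in range(h)]
    out = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or act[r][c] < thresh:
                continue
            queue, cells = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and act[ny][nx] >= thresh):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            n = len(cells)
            out.append((sum(y for y, _ in cells) / n,
                        sum(x for _, x in cells) / n, n))
    return out

def single_linkage(points, cutoff):
    """Single-linkage clusters cut at `cutoff`, built with union-find."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dy = points[i][0] - points[j][0]
            dx = points[i][1] - points[j][1]
            if (dy * dy + dx * dx) ** 0.5 <= cutoff:
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(points[i])
    return list(groups.values())

def refine_center(act, thresh=0.5, cutoff=3.0):
    """Area-weighted centroid of the largest cluster, or None if empty."""
    regions = components(act, thresh)
    if not regions:
        return None
    best = max(single_linkage(regions, cutoff),
               key=lambda g: sum(a for _, _, a in g))
    total = sum(a for _, _, a in best)
    return (sum(y * a for y, _, a in best) / total,
            sum(x * a for _, x, a in best) / total)

# Toy map: two nearby "barn" blobs and one distant stray activation.
toy = [[0.0] * 8 for _ in range(8)]
for r in (1, 2):
    for c in (1, 2, 4, 5):
        toy[r][c] = 0.9
toy[7][7] = 0.8
center = refine_center(toy)
```

In this toy case the two nearby blobs merge into one cluster and the stray activation is ignored, so the refined center falls between the two blobs rather than at the original patch center.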

Abstract

Concentrated Animal Feeding Operations (CAFOs) play an important role in agricultural production but are also associated with environmental, public health, and disease surveillance concerns. Large-scale mapping of CAFOs from remote sensing imagery remains challenging due to heterogeneous infrastructure layouts, noisy location records, inconsistent annotations, and incomplete inventories. We introduce CAFOSat, a strongly annotated, infrastructure-aware dataset for CAFO mapping across the United States. CAFOSat integrates high-resolution National Agriculture Imagery Program (NAIP) imagery with multi-source CAFO inventories collected across multiple states and transforms weak geolocation records into refined annotations through a human-in-the-loop pipeline combining AI-assisted annotation, GradCAM-based localization, and geometric clustering. To improve dataset quality, we curate challenging negative samples using land-cover-guided sampling with spatial exclusion constraints and provide infrastructure-level annotations, including barns, manure ponds, and grazing-related features, through manual verification. The resulting dataset contains more than 45,000 image patches spanning 20 states and four major CAFO categories. We benchmark a diverse set of convolutional, transformer-based, and vision-language models, demonstrating the value of refined annotations and curated negative samples for CAFO classification and generalization. In addition, we introduce a synthetic augmentation pipeline that generates infrastructure-aware variations to increase training diversity and improve robustness under distribution shifts. CAFOSat provides a large-scale benchmark for advancing infrastructure-aware agricultural monitoring and CAFO mapping from high-resolution remote sensing imagery.
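
The abstract's "land-cover-guided sampling with spatial exclusion constraints" can be pictured with the following sketch. It keeps candidate points whose land-cover class is in an allowed set and which lie farther than an exclusion radius from every known CAFO location. The class names, the 1 km radius, the coordinates, and the function names are hypothetical; the paper's actual classes and distances are not given in the text here.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def sample_negatives(candidates, cafo_sites, allowed_classes,
                     exclusion_km, k, seed=0):
    """Pick up to k negatives from candidates of (lat, lon, land_cover).

    A candidate is eligible if its land-cover class is allowed and it is
    farther than exclusion_km from every known CAFO site.
    """
    eligible = [
        c for c in candidates
        if c[2] in allowed_classes
        and all(haversine_km(c[:2], s) > exclusion_km for s in cafo_sites)
    ]
    rng = random.Random(seed)
    rng.shuffle(eligible)
    return eligible[:k]

# Toy inputs (illustrative only).
cafo_sites = [(42.0, -93.5)]
candidates = [
    (42.001, -93.5, "pasture"),          # about 0.1 km from a CAFO: excluded
    (42.5, -93.5, "pasture"),            # far enough, allowed class
    (42.5, -94.0, "open_water"),         # class not allowed
    (43.0, -93.0, "cultivated_crops"),   # far enough, allowed class
]
negatives = sample_negatives(
    candidates, cafo_sites,
    allowed_classes={"pasture", "cultivated_crops"},
    exclusion_km=1.0, k=5,
)
```

Restricting negatives to land-cover classes that resemble CAFO surroundings is what makes them "challenging"; the exclusion radius guards against labeling an unlisted part of a real facility as a negative.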

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

CAFOSat is a new dataset with infrastructure-level CAFO annotations and a human-in-the-loop pipeline, but the abstract gives no numbers on whether the refinements actually improve model performance.

The main takeaway is that this paper ships CAFOSat, a dataset of more than 45,000 NAIP patches across 20 states with labels for barns, manure ponds, and related features, built by turning weak inventory points into refined annotations via AI-assisted steps, GradCAM, clustering, and manual checks, plus land-cover-guided negatives and a synthetic augmentation step.

What is actually new is the infrastructure-aware annotation scheme and the specific negative sampling strategy that avoids easy backgrounds. The multi-source inventory fusion and the augmentation pipeline for distribution shifts are also concrete additions to the remote-sensing dataset literature. Benchmarking convolutional networks, transformers, and vision-language models on the data is the expected next step and shows that the authors engaged with current model families.

The soft spot is that the abstract claims the refined labels and negatives improve accuracy and generalization but supplies zero metrics, ablations, or agreement scores against independent ground truth. Without those numbers it is impossible to tell whether the human-in-the-loop step removes noise or introduces its own biases. The full paper presumably contains the tables; if the gains are small or the validation is only internal, the practical value drops.

This work is aimed at people building large-scale agricultural monitoring tools who need better training data for one land-use class. A reader already working on CAFO or similar facility mapping will find the curation details useful even if they end up re-annotating parts of it.

It deserves peer review because the dataset artifact is new and the pipeline is described at a level that can be reproduced or extended, but any referee should insist on seeing the quantitative results before acceptance.

Referee Report

2 major / 2 minor

Summary. The paper introduces CAFOSat, a strongly annotated dataset of more than 45,000 NAIP image patches spanning 20 US states and four CAFO categories. It describes a human-in-the-loop pipeline that refines weak multi-source geolocation records into infrastructure-level annotations (barns, manure ponds, grazing features) via AI-assisted annotation, GradCAM localization, geometric clustering, and manual verification; curates challenging negative samples via land-cover-guided sampling; benchmarks convolutional, transformer, and vision-language models; and introduces a synthetic augmentation pipeline to improve robustness under distribution shifts. The central claim is that the refined annotations and curated negatives demonstrably improve CAFO classification accuracy and generalization.

Significance. If the annotation quality is validated and the benchmarks show clear, reproducible gains from the refinement steps and negatives, the dataset would constitute a useful large-scale benchmark for infrastructure-aware CAFO mapping in remote sensing, directly addressing challenges of heterogeneous layouts, noisy inventories, and distribution shifts.

major comments (2)

[Abstract / §4] Abstract and §4 (Benchmarking): the claim that 'refined annotations and curated negative samples' improve accuracy and generalization is presented without any reported quantitative metrics, ablation tables, error bars, or statistical comparisons. No accuracy/F1 scores, generalization gaps, or baseline contrasts appear in the abstract, making it impossible to evaluate the central empirical claim.
[§3] §3 (Human-in-the-loop pipeline): no quantitative validation of label quality is provided (e.g., inter-annotator agreement, precision against independent ground-truth inventories, or error rates introduced by GradCAM/geometric clustering steps). This is load-bearing for the assumption that the pipeline yields sufficiently accurate and unbiased infrastructure-level labels.
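
As an illustration of the inter-annotator agreement check requested in the second comment, the sketch below computes Cohen's kappa for two annotators labeling the same patches. The label names and the toy annotations are hypothetical, and the paper is not known to report this statistic.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    # Observed agreement: share of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)

# Toy annotations for ten patches (illustrative only).
annotator_a = ["barn"] * 4 + ["pond"] * 3 + ["none"] * 3
annotator_b = ["barn"] * 3 + ["pond"] * 4 + ["none"] * 2 + ["barn"]
kappa = cohens_kappa(annotator_a, annotator_b)
```

Here the annotators agree on 8 of 10 patches, chance agreement is 0.34, and kappa is therefore (0.8 - 0.34) / (1 - 0.34), roughly 0.70.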

minor comments (2)

[Abstract] Abstract: the exact total number of patches and per-category/state breakdown should be stated rather than 'more than 45,000'.
[§2] §2: clarify the precise definitions and visual criteria used for the four major CAFO categories.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and commit to revisions that directly strengthen the empirical support for our claims.

Referee: [Abstract / §4] Abstract and §4 (Benchmarking): the claim that 'refined annotations and curated negative samples' improve accuracy and generalization is presented without any reported quantitative metrics, ablation tables, error bars, or statistical comparisons. No accuracy/F1 scores, generalization gaps, or baseline contrasts appear in the abstract, making it impossible to evaluate the central empirical claim.

Authors: We agree that the abstract contains no quantitative results and that §4 lacks explicit ablation studies isolating the contribution of the refinement pipeline and negative curation (with error bars or statistical tests). While model benchmarks are reported, they do not directly contrast performance before and after these steps. We will add the requested ablation tables, report F1 scores, generalization gaps, and statistical comparisons, and revise the abstract to include the key metrics.
Revision: yes

Referee: [§3] §3 (Human-in-the-loop pipeline): no quantitative validation of label quality is provided (e.g., inter-annotator agreement, precision against independent ground-truth inventories, or error rates introduced by GradCAM/geometric clustering steps). This is load-bearing for the assumption that the pipeline yields sufficiently accurate and unbiased infrastructure-level labels.

Authors: We agree that quantitative validation of annotation quality is necessary to support the pipeline's reliability. The manuscript describes the human-in-the-loop steps and manual verification but reports no inter-annotator agreement, precision metrics, or error analysis for the automated components. We will add these validations, including agreement statistics on a held-out subset and comparisons against available external inventories where feasible.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical dataset construction with independent benchmarking

The paper describes dataset creation via multi-source inventories, a human-in-the-loop annotation pipeline (AI-assisted annotation, GradCAM, clustering, manual verification), negative sample curation, and standard benchmarking of convolutional, transformer, and vision-language models on the resulting data. No equations, fitted parameters, predictions derived from prior fits, uniqueness theorems, or self-citation chains appear in the provided abstract or description. The central claim rests on empirical performance improvements from refined labels, which are externally falsifiable via the released dataset and benchmarks and do not reduce to self-definition or to prior author results by construction. This is a standard data and benchmark paper with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The paper introduces no mathematical derivations, fitted parameters, or new physical entities. It relies on standard remote-sensing imagery sources and established computer-vision techniques for annotation and benchmarking.

pith-pipeline@v0.9.1-grok · 5824 in / 1238 out tokens · 23304 ms · 2026-06-28T18:47:50.344070+00:00 · methodology

Reference graph

Works this paper leans on

45 extracted references · 9 canonical work pages · 2 internal anchors

[1]

Industrial animal agriculture in the united states: Concentrated animal feeding operations (cafos),

A. Moses and P. Tomaselli, “Industrial animal agriculture in the united states: Concentrated animal feeding operations (cafos),”International farm animal, wildlife and food safety law, pp. 185–214, 2017. 1

2017
[2]

Mapping industrial poultry operations at scale with deep learning and aerial imagery,

C. Robinson, B. Chugg, B. Anderson, J. M. L. Ferres, and D. E. Ho, “Mapping industrial poultry operations at scale with deep learning and aerial imagery,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 7458–7471, 2022. 1, 3

2022
[3]

Using an adaptive modeling framework to identify avian in- fluenza spillover risk at the wild-domestic interface,

D. J. Prosser, C. M. Kent, J. D. Sullivan, K. A. Patyk, M.- J. McCool, M. K. Torchetti, K. Lantz, and J. M. Mullinax, “Using an adaptive modeling framework to identify avian in- fluenza spillover risk at the wild-domestic interface,”Scien- tific Reports, vol. 14, no. 1, p. 14199, 2024. 1

2024
[4]

Water- fowl occurrence and residence time as indicators of h5 and h7 avian influenza in north american poultry,

J. M. Humphreys, A. M. Ramey, D. C. Douglas, J. M. Mul- linax, C. Soos, P. Link, P. Walther, and D. J. Prosser, “Water- fowl occurrence and residence time as indicators of h5 and h7 avian influenza in north american poultry,”Scientific Re- ports, vol. 10, no. 1, p. 2592, 2020

2020
[5]

A high- resolution, us-scale digital similar of interacting livestock, wild birds, and human ecosystems with applications to multi- host epidemic spread,

A. Adiga, A. Chopra, M. L. Wilson, S. Ravi, D. Xie, S. Swarup, B. Lewis, R. Raskar, and M. V . Marathe, “A high- resolution, us-scale digital similar of interacting livestock, wild birds, and human ecosystems with applications to multi- host epidemic spread,”arXiv preprint arXiv:2411.01386, 2024

work page arXiv 2024
[6]

Emergence and interstate spread of highly pathogenic avian influenza a (h5n1) in dairy cattle in the united states,

T.-Q. Nguyen, C. R. Hutter, A. Markin, M. Thomas, K. Lantz, M. L. Killian, G. M. Janzen, S. Vijendran, S. Wa- gle, B. Inderskiet al., “Emergence and interstate spread of highly pathogenic avian influenza a (h5n1) in dairy cattle in the united states,”Science, vol. 388, no. 6745, p. eadq0900,
[7]

Gurian-Sherman,CAFOs Uncovered: The Untold Costs of Confined Animal Feeding Operations

D. Gurian-Sherman,CAFOs Uncovered: The Untold Costs of Confined Animal Feeding Operations. Union of Con- cerned Scientists, 2008. 1

2008
[8]

Understanding Concentrated Animal Feeding Operations and Their Impact on Communities,

C. Hribar, “Understanding Concentrated Animal Feeding Operations and Their Impact on Communities,” National As- sociation of Local Boards of Health, Tech. Rep., 2010

2010
[9]

Deep Learning with Satellite Imagery to Enhance Environmental Enforce- ment,

C. Handan-Nader, D. E. Ho, and L. Y . Liu, “Deep Learning with Satellite Imagery to Enhance Environmental Enforce- ment,” inData Science Applied to Sustainability Analysis. Elsevier, 2021, pp. 205–228. 1

2021
[10]

Deep learning to map con- centrated animal feeding operations,

C. Handan-Nader and D. E. Ho, “Deep learning to map con- centrated animal feeding operations,”Nature Sustainability, vol. 2, no. 4, pp. 298–306, 2019. 1, 3, 4, 5, 7, 8

2019
[11]

Meter-ml: a multi-sensor earth observation benchmark for automated methane source mapping,

B. Zhu, N. Lui, J. Irvin, J. Le, S. Tadwalkar, C. Wang, Z. Ouyang, F. Y . Liu, A. Y . Ng, and R. B. Jackson, “Meter-ml: a multi-sensor earth observation benchmark for automated methane source mapping,”arXiv preprint arXiv:2207.11166, 2022. 3, 4, 7, 8

work page arXiv 2022
[12]

Enhancing environmental enforcement with near real-time monitoring: Likelihood-based detection of structural expansion of intensive livestock farms,

B. Chugg, B. Anderson, S. Eicher, S. Lee, and D. E. Ho, “Enhancing environmental enforcement with near real-time monitoring: Likelihood-based detection of structural expansion of intensive livestock farms,”In- ternational Journal of Applied Earth Observation and Geoinformation, vol. 103, p. 102463, 2021. [Online]. Available: https://www.sciencedirect.com/...

2021
[13]

Using machine learning to map concentrated animal feeding operations in new mexico,

V . Ehrenpreis, M. Worsham, N. Clarke, and A. Buch- holz, “Using machine learning to map concentrated animal feeding operations in new mexico,” Mapping for En- vironmental Justice, Tech. Rep., January 2021, report prepared for the McGovern Foundation. [Online]. Avail- able: https://mappingforej.studentorg.berkeley.edu/wp- content/uploads/2022/03/NM-CAFO-R...

2021
[14]

Machine learning-based identification of animal feeding operations in the united states on a parcel-scale,

A. Saha, B. Rashid, T. Liu, L. Miralha, and R. L. Muenich, “Machine learning-based identification of animal feeding operations in the united states on a parcel-scale,”Science of The Total Environment, vol. 960, p. 178312, 2025. 3

2025
[15]

Grad-cam: Visual explanations from deep networks via gradient-based localization,

R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-cam: Visual explanations from deep networks via gradient-based localization,” inPro- ceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626. 3

2017
[16]

Optical remote sensing image understanding with weak supervision: A com- prehensive review,

X. Li, Z. Tang, G.-S. Xia, and L. Zhang, “Optical remote sensing image understanding with weak supervision: A com- prehensive review,”arXiv preprint arXiv:2204.09120, 2022. 3

work page arXiv 2022
[17]

Deep learning in remote sensing: A com- prehensive review and list of resources,

X. Zhu, D. Tuia, L. Mou, G.-S. Xia, L. Zhang, F. Xu, and F. Fraundorfer, “Deep learning in remote sensing: A com- prehensive review and list of resources,”IEEE Geoscience and Remote Sensing Magazine, vol. 5, no. 4, pp. 8–36, 2017. 3, 6

2017
[18]

arXiv preprint arXiv:2302.07685 , year=

T. Chen, X. Li, K. Sohn, Z. Wang, L. Yuan, and H. Zhang, “Diffusion-aug: Unlocking the power of diffusion models for visual recognition,”arXiv preprint arXiv:2302.07685, 2023. 3, 6

work page arXiv 2023
[19]

Promptdiffusion: Generating large-scale datasets with subject-prompt alignment,

Y . Wang, H. Tan, and H. Yu, “Promptdiffusion: Generating large-scale datasets with subject-prompt alignment,”arXiv preprint arXiv:2308.07974, 2023. 3, 6

work page arXiv 2023
[20]

Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection

S. Liu, F. Li, H. Wuet al., “Grounding dino: Marrying dino with grounded pre-training for open-set object detection,” arXiv preprint arXiv:2303.05499, 2023. 3, 6

work page internal anchor Pith review Pith/arXiv arXiv 2023
[21]

High-resolution image synthesis with latent diffusion models,

R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Om- mer, “High-resolution image synthesis with latent diffusion models,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10 684– 10 695. 3

2022
[22]

Diffusionsat: A generative foun- dation model for satellite imagery. arxiv 2023,

S. Khanna, P. Liu, L. Zhou, C. Meng, R. Rombach, M. Burke, D. Lobell, and S. Ermon, “Diffusionsat: A gener- ative foundation model for satellite imagery,”arXiv preprint arXiv:2312.03606, 2023. 3, 6

work page arXiv 2023
[23]

Geosynth: Contextually-aware high-resolution satellite image synthe- sis,

S. Sastry, S. Khanal, A. Dhakal, and N. Jacobs, “Geosynth: Contextually-aware high-resolution satellite image synthe- sis,” inProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2024, pp. 460–470. 3, 6

2024
[24]

Data augmentation in earth observation: A diffusion model ap- proach,

T. A. DE JESUS SOUSA, B. RIES, and N. GUELFI, “Data augmentation in earth observation: A diffusion model ap- proach,” 2024. 3

2024
[25]

Generating synthetic satellite imagery for rare objects: An empirical comparison of models and met- rics,

T. V . Nguyen, J. Hoster, A. Glaser, K. Hildebrand, and F. Biessmann, “Generating synthetic satellite imagery for rare objects: An empirical comparison of models and met- rics,”arXiv preprint arXiv:2409.01138, 2024. 3

work page arXiv 2024
[26]

Cafomaps: Concentrated animal feed- ing operations in the united states,

Department of Geographical and Sustainability Sciences, University of Iowa, “Cafomaps: Concentrated animal feed- ing operations in the united states,” https://www.cafomaps. org/, 2025, accessed: 2025-05-16. 5, 2

2025
[27]

TorchCAM: Class activation explorer,

F.-G. Fernandez, “TorchCAM: Class activation explorer,” https://github.com/frgfm/torch-cam, March 2020. 5

2020
[28]

High-resolution image synthesis with latent diffu- sion models,

R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Om- mer, “High-resolution image synthesis with latent diffu- sion models,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 10 684–10 695. 6

2022
[29]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,”CVPR, 2016. 7

2016
[30]

Efficientnet: Rethinking model scaling for convolutional neural networks,

M. Tan and Q. Le, “Efficientnet: Rethinking model scaling for convolutional neural networks,”ICML, 2019. 7

2019
[31]

A convnet for the 2020s,

Z. Liu, H. Mao, C.-Y . Wu, C. Feichtenhofer, T. Darrell, and S. Xie, “A convnet for the 2020s,”CVPR, 2022. 7

2022
[32]

An im- age is worth 16x16 words: Transformers for image recogni- tion at scale,

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An im- age is worth 16x16 words: Transformers for image recogni- tion at scale,”ICLR, 2021. 7

2021
[33]

Swin transformer: Hierarchical vision transformer using shifted windows,

Z. Liu, Y . Lin, Y . Cao, H. Hu, Y . Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,”ICCV, 2021. 7

2021
[34]

DINOv2: Learning Robust Visual Features without Supervision

M. Oquab, T. Darcet, T. Moutakanni, A. Ram ´e, D. Haziza, J. Su´arez, M. Szafraniec, Y . Kalantidis, Y . Elkabetz, M. Cord et al., “Dinov2: Learning robust visual features without su- pervision,”arXiv preprint arXiv:2304.07193, 2023. 7

work page internal anchor Pith review Pith/arXiv arXiv 2023
[35]

Learning transferable visual models from natural language supervision,

A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clarket al., “Learning transferable visual models from natural language supervision,”ICML, 2021. 7

2021
[36]

Learning from noisy remote sensing data with vision- language models,

A. Mallya, G. Chen, W. Xie, X. Zhu, D. Lobell, and S. Er- mon, “Learning from noisy remote sensing data with vision- language models,” inNeurIPS, 2022. 7

2022
[37]

National Agricul- ture Imagery Program (NAIP),

U.S. Department of Agriculture, “National Agricul- ture Imagery Program (NAIP),” https : / / www. fsa . usda . gov/programs- and- services/aerial- photography/imagery- programs / naip - imagery/, 2023, https : / / www. fsa . usda . gov/programs- and- services/aerial- photography/imagery- programs/naip-imagery/. 2

2023
[38]

Confined feeding operations - indianamap,

IndianaMap, “Confined feeding operations - indianamap,” https://www.indianamap.org/datasets/INMap::confined- feeding-operations/about, 2024, accessed: 2024-05-15. 2

2024
[39]

Iowa animal feeding operations gis data,

I. G. Portal, “Iowa animal feeding operations gis data,” https : / / geodata . iowa . gov / documents / abfbd972640d4e87b6c48dc669775767 / about, 2024, ac- cessed: 2024-05-15

2024
[40]

Animal feeding operations map,

M. D. of the Environment, “Animal feeding operations map,” https : / / catalog . data . gov / dataset / maryland - department - of- the- environment- lma- resource- management- program- animal-feeding-oper-5fb42, 2024, accessed: 2024-05-15

2024
[41]

Cafo locations - michi- gan department of environment, great lakes, and energy,

Michigan EGLE GIS Hub, “Cafo locations - michi- gan department of environment, great lakes, and energy,” https : / / gis - egle . hub . arcgis . com / datasets / f0843875e5874d04b06396de8200cf75 / explore ? location = 43 . 005451 % 2C - 83 . 963570 % 2C6 . 10, 2024, accessed: 2024-05-15

2024
[42]

Minnesota cafo locations (layer 0),

M. G. Commons, “Minnesota cafo locations (layer 0),” https : / / www . arcgis . com / home / item . html ? id = 6d119156229d4e908e22f027bdaee6be&sublayer=0, 2024, accessed: 2024-05-15

2024
[43]

New york cafo dataset,

New York Department of State, “New york cafo dataset,” https : / / opdgig . dos . ny . gov / datasets / a9a8eaed80864ab98680899ecdbc1c50 / explore ? location = 42 . 664852 % 2C - 76 . 586900 % 2C7 . 22, 2024, accessed: 2024-05-15

2024
[44]

Delaware cafo map viewer,

D. D. of Natural Resources, “Delaware cafo map viewer,” https : / / experience . arcgis . com / experience / c6749f8b31d143cbb38a26fc1b89a2be / page / Delaware, 2024, accessed: 2024-05-15. 2

2024
[45]

National land cover database (nlcd),

M.-R. L. C. C. (MRLC), “National land cover database (nlcd),” 2021, accessed: 2024-03-01. [Online]. Available: https://www.mrlc.gov/ 2 CAFOSat: A Strongly Annotated Dataset for Infrastructure-Aware CAFO Mapping Using High-Resolution Imagery Supplementary Material A. CAFOSat Overview. An overview of the data processing pipeline is provided in Figure 2. The...

2021

[1] [1]

Industrial animal agriculture in the united states: Concentrated animal feeding operations (cafos),

A. Moses and P. Tomaselli, “Industrial animal agriculture in the united states: Concentrated animal feeding operations (cafos),”International farm animal, wildlife and food safety law, pp. 185–214, 2017. 1

2017

[2] [2]

Mapping industrial poultry operations at scale with deep learning and aerial imagery,

C. Robinson, B. Chugg, B. Anderson, J. M. L. Ferres, and D. E. Ho, “Mapping industrial poultry operations at scale with deep learning and aerial imagery,”IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 15, pp. 7458–7471, 2022. 1, 3

2022

[3] [3]

Using an adaptive modeling framework to identify avian in- fluenza spillover risk at the wild-domestic interface,

D. J. Prosser, C. M. Kent, J. D. Sullivan, K. A. Patyk, M.- J. McCool, M. K. Torchetti, K. Lantz, and J. M. Mullinax, “Using an adaptive modeling framework to identify avian in- fluenza spillover risk at the wild-domestic interface,”Scien- tific Reports, vol. 14, no. 1, p. 14199, 2024. 1

2024

[4] [4]

Water- fowl occurrence and residence time as indicators of h5 and h7 avian influenza in north american poultry,

J. M. Humphreys, A. M. Ramey, D. C. Douglas, J. M. Mul- linax, C. Soos, P. Link, P. Walther, and D. J. Prosser, “Water- fowl occurrence and residence time as indicators of h5 and h7 avian influenza in north american poultry,”Scientific Re- ports, vol. 10, no. 1, p. 2592, 2020

2020

[5] [5]

A high- resolution, us-scale digital similar of interacting livestock, wild birds, and human ecosystems with applications to multi- host epidemic spread,

A. Adiga, A. Chopra, M. L. Wilson, S. Ravi, D. Xie, S. Swarup, B. Lewis, R. Raskar, and M. V . Marathe, “A high- resolution, us-scale digital similar of interacting livestock, wild birds, and human ecosystems with applications to multi- host epidemic spread,”arXiv preprint arXiv:2411.01386, 2024

work page arXiv 2024

[6] [6]

Emergence and interstate spread of highly pathogenic avian influenza a (h5n1) in dairy cattle in the united states,

T.-Q. Nguyen, C. R. Hutter, A. Markin, M. Thomas, K. Lantz, M. L. Killian, G. M. Janzen, S. Vijendran, S. Wa- gle, B. Inderskiet al., “Emergence and interstate spread of highly pathogenic avian influenza a (h5n1) in dairy cattle in the united states,”Science, vol. 388, no. 6745, p. eadq0900,

[7] [7]

Gurian-Sherman,CAFOs Uncovered: The Untold Costs of Confined Animal Feeding Operations

D. Gurian-Sherman,CAFOs Uncovered: The Untold Costs of Confined Animal Feeding Operations. Union of Con- cerned Scientists, 2008. 1

2008

[8] [8]

Understanding Concentrated Animal Feeding Operations and Their Impact on Communities,

C. Hribar, “Understanding Concentrated Animal Feeding Operations and Their Impact on Communities,” National As- sociation of Local Boards of Health, Tech. Rep., 2010

2010

[9] C. Handan-Nader, D. E. Ho, and L. Y. Liu, “Deep Learning with Satellite Imagery to Enhance Environmental Enforcement,” in Data Science Applied to Sustainability Analysis. Elsevier, 2021, pp. 205–228.

[10] C. Handan-Nader and D. E. Ho, “Deep learning to map concentrated animal feeding operations,” Nature Sustainability, vol. 2, no. 4, pp. 298–306, 2019.

[11] B. Zhu, N. Lui, J. Irvin, J. Le, S. Tadwalkar, C. Wang, Z. Ouyang, F. Y. Liu, A. Y. Ng, and R. B. Jackson, “Meter-ml: a multi-sensor earth observation benchmark for automated methane source mapping,” arXiv preprint arXiv:2207.11166, 2022.

[12] B. Chugg, B. Anderson, S. Eicher, S. Lee, and D. E. Ho, “Enhancing environmental enforcement with near real-time monitoring: Likelihood-based detection of structural expansion of intensive livestock farms,” International Journal of Applied Earth Observation and Geoinformation, vol. 103, p. 102463, 2021. [Online]. Available: https://www.sciencedirect.com/...

[13] V. Ehrenpreis, M. Worsham, N. Clarke, and A. Buchholz, “Using machine learning to map concentrated animal feeding operations in New Mexico,” Mapping for Environmental Justice, Tech. Rep., January 2021, report prepared for the McGovern Foundation. [Online]. Available: https://mappingforej.studentorg.berkeley.edu/wp-content/uploads/2022/03/NM-CAFO-R...

[14] A. Saha, B. Rashid, T. Liu, L. Miralha, and R. L. Muenich, “Machine learning-based identification of animal feeding operations in the United States on a parcel-scale,” Science of The Total Environment, vol. 960, p. 178312, 2025.

[15] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-cam: Visual explanations from deep networks via gradient-based localization,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017, pp. 618–626.

[16] X. Li, Z. Tang, G.-S. Xia, and L. Zhang, “Optical remote sensing image understanding with weak supervision: A comprehensive review,” arXiv preprint arXiv:2204.09120, 2022.

[17] X. Zhu, D. Tuia, L. Mou, G.-S. Xia, L. Zhang, F. Xu, and F. Fraundorfer, “Deep learning in remote sensing: A comprehensive review and list of resources,” IEEE Geoscience and Remote Sensing Magazine, vol. 5, no. 4, pp. 8–36, 2017.

[18] T. Chen, X. Li, K. Sohn, Z. Wang, L. Yuan, and H. Zhang, “Diffusion-aug: Unlocking the power of diffusion models for visual recognition,” arXiv preprint arXiv:2302.07685, 2023.

[19] Y. Wang, H. Tan, and H. Yu, “Promptdiffusion: Generating large-scale datasets with subject-prompt alignment,” arXiv preprint arXiv:2308.07974, 2023.

[20] S. Liu, F. Li, H. Wu et al., “Grounding DINO: Marrying DINO with grounded pre-training for open-set object detection,” arXiv preprint arXiv:2303.05499, 2023.

[21] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 10 684–10 695.

[22] S. Khanna, P. Liu, L. Zhou, C. Meng, R. Rombach, M. Burke, D. Lobell, and S. Ermon, “Diffusionsat: A generative foundation model for satellite imagery,” arXiv preprint arXiv:2312.03606, 2023.

[23] S. Sastry, S. Khanal, A. Dhakal, and N. Jacobs, “Geosynth: Contextually-aware high-resolution satellite image synthesis,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 460–470.

[24] T. A. de Jesus Sousa, B. Ries, and N. Guelfi, “Data augmentation in earth observation: A diffusion model approach,” 2024.

[25] T. V. Nguyen, J. Hoster, A. Glaser, K. Hildebrand, and F. Biessmann, “Generating synthetic satellite imagery for rare objects: An empirical comparison of models and metrics,” arXiv preprint arXiv:2409.01138, 2024.

[26] Department of Geographical and Sustainability Sciences, University of Iowa, “Cafomaps: Concentrated animal feeding operations in the United States,” https://www.cafomaps.org/, 2025, accessed: 2025-05-16.

[27] F.-G. Fernandez, “TorchCAM: Class activation explorer,” https://github.com/frgfm/torch-cam, March 2020.

[28] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-resolution image synthesis with latent diffusion models,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 10 684–10 695.

[29] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” CVPR, 2016.

[30] M. Tan and Q. Le, “Efficientnet: Rethinking model scaling for convolutional neural networks,” ICML, 2019.

[31] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie, “A convnet for the 2020s,” CVPR, 2022.

[32] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” ICLR, 2021.

[33] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” ICCV, 2021.

[34] M. Oquab, T. Darcet, T. Moutakanni, A. Ramé, D. Haziza, J. Suárez, M. Szafraniec, Y. Kalantidis, Y. Elkabetz, M. Cord et al., “DINOv2: Learning robust visual features without supervision,” arXiv preprint arXiv:2304.07193, 2023.

[35] A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” ICML, 2021.

[36] A. Mallya, G. Chen, W. Xie, X. Zhu, D. Lobell, and S. Ermon, “Learning from noisy remote sensing data with vision-language models,” in NeurIPS, 2022.

[37] U.S. Department of Agriculture, “National Agriculture Imagery Program (NAIP),” https://www.fsa.usda.gov/programs-and-services/aerial-photography/imagery-programs/naip-imagery/, 2023.

[38] IndianaMap, “Confined feeding operations - IndianaMap,” https://www.indianamap.org/datasets/INMap::confined-feeding-operations/about, 2024, accessed: 2024-05-15.

[39] I. G. Portal, “Iowa animal feeding operations GIS data,” https://geodata.iowa.gov/documents/abfbd972640d4e87b6c48dc669775767/about, 2024, accessed: 2024-05-15.

[40] M. D. of the Environment, “Animal feeding operations map,” https://catalog.data.gov/dataset/maryland-department-of-the-environment-lma-resource-management-program-animal-feeding-oper-5fb42, 2024, accessed: 2024-05-15.

[41] Michigan EGLE GIS Hub, “CAFO locations - Michigan Department of Environment, Great Lakes, and Energy,” https://gis-egle.hub.arcgis.com/datasets/f0843875e5874d04b06396de8200cf75/explore?location=43.005451%2C-83.963570%2C6.10, 2024, accessed: 2024-05-15.

[42] M. G. Commons, “Minnesota CAFO locations (layer 0),” https://www.arcgis.com/home/item.html?id=6d119156229d4e908e22f027bdaee6be&sublayer=0, 2024, accessed: 2024-05-15.

[43] New York Department of State, “New York CAFO dataset,” https://opdgig.dos.ny.gov/datasets/a9a8eaed80864ab98680899ecdbc1c50/explore?location=42.664852%2C-76.586900%2C7.22, 2024, accessed: 2024-05-15.

[44] D. D. of Natural Resources, “Delaware CAFO map viewer,” https://experience.arcgis.com/experience/c6749f8b31d143cbb38a26fc1b89a2be/page/Delaware, 2024, accessed: 2024-05-15.

[45] M.-R. L. C. C. (MRLC), “National land cover database (NLCD),” 2021, accessed: 2024-03-01. [Online]. Available: https://www.mrlc.gov/

CAFOSat: A Strongly Annotated Dataset for Infrastructure-Aware CAFO Mapping Using High-Resolution Imagery

Supplementary Material

A. CAFOSat Overview.

An overview of the data processing pipeline is provided in Figure 2. The...