On the Generalizability of Foundation Models for Crop Type Mapping

Adam J. Stewart; Arindam Banerjee; Favyen Bastani; George R. Huber; Jingtong Wang; Piper Wolters; Shreya Kannan; Yi-Chia Chang

arXiv: 2409.09451 · v5 · submitted 2024-09-14 · cs.CV · cs.LG

Yi-Chia Chang, Adam J. Stewart, Favyen Bastani, Piper Wolters, Shreya Kannan, George R. Huber, Jingtong Wang, Arindam Banerjee

Pith reviewed 2026-05-23 20:56 UTC · model grok-4.3

keywords: foundation models · crop type mapping · Sentinel-2 · transfer learning · generalizability · geospatial bias · self-supervised learning · Earth observation

The pith

Foundation models pre-trained on Sentinel-2 imagery transfer better to crop classification across continents than ImageNet weights.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether Earth observation foundation models can generalize across geographies for the task of mapping crop types. It compares three pre-trained models on five separate crop classification datasets drawn from five continents. The evaluation shows that models whose pre-training used Sentinel-2 multispectral data achieve higher accuracy than a general-purpose ImageNet model. The work also measures how many labeled examples are needed to reach strong performance and to correct for class imbalance.

Core claim

Pre-trained weights designed explicitly for Sentinel-2, such as SSL4EO-S12, outperform general pre-trained weights like ImageNet when evaluated on five crop classification datasets across five continents. While only 100 labeled images are sufficient for achieving high overall accuracy, 900 images are required to mitigate class imbalance and improve average accuracy.
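The gap between the 100-image and 900-image thresholds is easier to see with the two metrics side by side. The sketch below is a toy illustration, not the paper's evaluation code. It assumes that "overall accuracy" means the fraction of all samples predicted correctly and that "average accuracy" means the unweighted mean of per-class accuracy; the crop names and counts are hypothetical.

```python
from collections import defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples whose prediction matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def average_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracy (assumed definition)."""
    total = defaultdict(int)
    hit = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        hit[t] += int(t == p)
    return sum(hit[c] / total[c] for c in total) / len(total)


# Hypothetical imbalanced test set: 90 maize samples, 10 rice samples.
y_true = ["maize"] * 90 + ["rice"] * 10
# A degenerate classifier that always predicts the majority class.
y_pred = ["maize"] * 100

print(overall_accuracy(y_true, y_pred))  # 0.9
print(average_accuracy(y_true, y_pred))  # 0.5
```

A model can therefore score well on overall accuracy while missing a rare class entirely, which is consistent with the claim that more labels are needed before average accuracy improves.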

What carries the argument

Transfer performance comparison of Sentinel-2-specific self-supervised pre-training against ImageNet pre-training when fine-tuned on multispectral crop datasets from multiple continents.

If this is right

Domain-specific pre-training on satellite imagery produces weights that transfer more reliably to new geographic areas than general image pre-training.
Overall accuracy on crop mapping reaches high levels with as few as 100 labeled examples per new region.
Correcting for class imbalance in crop type predictions requires roughly nine times as many labeled examples (900 versus 100) as reaching high overall accuracy alone.
Using Sentinel-2 tailored models can reduce the impact of geospatial bias when applying foundation models to agriculture in data-scarce regions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Global crop monitoring systems could rely on a single Sentinel-2 pre-trained backbone with limited local fine-tuning.
Similar pre-training strategies might improve performance on other multispectral tasks such as yield estimation or land-cover change detection.
Practitioners could prioritize collection of balanced labeled samples rather than simply maximizing total volume when adapting these models.
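The last extension, preferring balanced labels over sheer volume, can be sketched as a sampling routine. This is an illustrative assumption, not a procedure from the paper: the function name `balanced_subset`, the round-robin strategy, and the toy label counts are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_subset(labels, budget, seed=0):
    """Pick up to `budget` indices, spread as evenly as possible across classes."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        rng.shuffle(indices)

    chosen = []
    # Round-robin over classes so rare classes are represented
    # before the common ones use up the labeling budget.
    while len(chosen) < budget and any(by_class.values()):
        for label in sorted(by_class):
            if by_class[label] and len(chosen) < budget:
                chosen.append(by_class[label].pop())
    return chosen


# Hypothetical pool: 90 maize candidates and 10 rice candidates.
labels = ["maize"] * 90 + ["rice"] * 10
picked = balanced_subset(labels, budget=20)
print(sorted(Counter(labels[i] for i in picked).items()))
# [('maize', 10), ('rice', 10)]
```

Once a rare class is exhausted, the remaining budget falls back to the other classes, so the routine never returns fewer samples than are available.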

Load-bearing premise

The five crop classification datasets drawn from five continents are representative of global agricultural conditions and free of selection or quality biases.

What would settle it

A new crop classification dataset from an additional continent or region where the ImageNet model matches or exceeds the accuracy of the Sentinel-2 pre-trained models would falsify the claimed advantage.

Figures

Figures reproduced from arXiv: 2409.09451 by Adam J. Stewart, Arindam Banerjee, Favyen Bastani, George R. Huber, Jingtong Wang, Piper Wolters, Shreya Kannan, Yi-Chia Chang.

**Figure 2.** Visualization of example input Sentinel-2 images,

Original abstract

Foundation models pre-trained using self-supervised learning have shown powerful transfer learning capabilities on various downstream tasks, including language understanding, text generation, and image recognition. The Earth observation (EO) field has produced several foundation models pre-trained directly on multispectral satellite imagery for applications like precision agriculture, wildfire and drought monitoring, and natural disaster response. However, few studies have investigated the ability of these models to generalize to new geographic locations, and potential concerns of geospatial bias -- models trained on data-rich developed nations not transferring well to data-scarce developing nations -- remain. We evaluate three popular EO foundation models, SSL4EO-S12, SatlasPretrain, and ImageNet, on five crop classification datasets across five continents. Results show that pre-trained weights designed explicitly for Sentinel-2, such as SSL4EO-S12, outperform general pre-trained weights like ImageNet. While only 100 labeled images are sufficient for achieving high overall accuracy, 900 images are required to mitigate class imbalance and improve average accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper shows Sentinel-2-specific pretraining beats ImageNet on crop mapping across five continents and gives concrete label thresholds, but the generalizability claim is undercut by missing details on dataset selection and statistical reporting.

The main takeaway is that pretraining on Sentinel-2 data gives better transfer than ImageNet weights for crop type classification, and 100 labeled images can deliver high overall accuracy while 900 are needed to lift average accuracy across classes. They test this on five datasets spanning five continents, which directly tackles the geospatial bias question that most transfer papers ignore. The concrete numbers on label efficiency are the practical part that stands out.

This is mostly an empirical extension of existing models rather than a new technique, but the multi-continent setup and the 100-vs-900 split are results not already in the literature. The work is straightforward and addresses a real deployment concern for data-scarce regions.

The soft spots are clear from the abstract alone. There are no error bars, no statistical tests, and no description of dataset sizes, fine-tuning details, or how the five sites were chosen. Without that, it is hard to judge whether the datasets actually cover diverse climates, field sizes, or cloud regimes or whether they were selected mainly for label availability. The stress-test point on selection bias holds up here and weakens the generalizability conclusion.

This paper is for researchers working on remote-sensing foundation models or precision agriculture applications. A reader focused on label efficiency or geographic transfer would find the numbers useful even if the methods need more scrutiny. It deserves peer review because the question is timely and the experiments use public data, though revisions will have to add statistical rigor and dataset characterization.

Referee Report

2 major / 1 minor

Summary. The manuscript evaluates three foundation models (SSL4EO-S12, SatlasPretrain, ImageNet) on five crop classification datasets spanning five continents. It claims that Sentinel-2-specific pre-trained weights outperform general ImageNet weights, that 100 labeled images suffice for high overall accuracy, and that 900 images are required to mitigate class imbalance and improve average accuracy.

Significance. If the results hold, the work supplies empirical evidence on the value of domain-specific pretraining for Earth-observation tasks and concrete guidance on label budgets for crop mapping, directly addressing concerns about geospatial bias in data-scarce regions.

major comments (2)

[Abstract] The abstract reports comparative accuracies and label thresholds but supplies no error bars, statistical tests, dataset sizes, fine-tuning protocols, or exclusion criteria; without the full methods section it is impossible to verify whether central comparisons are free of post-hoc choices.
[Dataset description] (Implicit in the five-continent claim.) The central claim of geographic generalizability (and thus that SSL4EO-S12 outperforms ImageNet) depends on the five datasets being representative and free of selection or quality bias; no explicit coverage of Köppen climates, crop calendars, field-size distributions, or cloud-cover regimes typical of data-scarce regions is supplied, so the reported 100-vs-900 image thresholds and model ranking could be conditional on untested conditions.

minor comments (1)

The manuscript would benefit from reporting the exact number of images per dataset and per class to allow readers to assess the class-imbalance mitigation claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed review and constructive feedback on our manuscript. We address each major comment below. Where revisions are warranted, we have updated the manuscript accordingly.

Referee: [Abstract] The abstract reports comparative accuracies and label thresholds but supplies no error bars, statistical tests, dataset sizes, fine-tuning protocols, or exclusion criteria; without the full methods section it is impossible to verify whether central comparisons are free of post-hoc choices.

Authors: We agree that the abstract would benefit from additional context on the experimental setup. In the revised manuscript we have expanded the abstract to include the number of datasets and images used, a brief description of the fine-tuning protocol (linear probing with 100/900 labeled examples), and a note that all results include standard error across multiple runs. Full statistical details, exclusion criteria, and protocols remain in the Methods section due to length constraints. We have also added error bars to all result figures.

Revision: partial

Referee: [Dataset description] (Implicit in the five-continent claim.) The central claim of geographic generalizability (and thus that SSL4EO-S12 outperforms ImageNet) depends on the five datasets being representative and free of selection or quality bias; no explicit coverage of Köppen climates, crop calendars, field-size distributions, or cloud-cover regimes typical of data-scarce regions is supplied, so the reported 100-vs-900 image thresholds and model ranking could be conditional on untested conditions.

Authors: The five datasets were selected because they are the primary public benchmarks for multi-continent crop mapping; together they cover North America, Europe, Africa, Asia, and South America. We acknowledge that the original manuscript did not explicitly tabulate Köppen climates, crop calendars, or cloud regimes. In the revised version we have added a new subsection and supplementary table that summarizes these characteristics for each dataset, drawing on the original dataset papers and auxiliary climate data. This addition allows readers to assess the diversity of conditions tested while preserving the original experimental design.

Revision: yes
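The rebuttal refers to reporting standard error across multiple runs. A minimal sketch of that calculation follows, assuming each run yields one accuracy value and that the sample standard deviation is used; the run values are hypothetical placeholders, not results from the paper.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a list of run results."""
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 denominator) divided by sqrt(n).
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


# Hypothetical accuracies from three fine-tuning runs with different seeds.
runs = [0.80, 0.82, 0.84]
mean, sem = mean_and_standard_error(runs)
print(f"{mean:.3f} ± {sem:.3f}")  # 0.820 ± 0.012
```

With only a handful of runs the standard error is itself a rough estimate, so it is best read as an indication of run-to-run spread rather than a formal significance test.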

Circularity Check

0 steps flagged

No circularity; purely empirical evaluation on external datasets

The paper reports direct experimental results from fine-tuning and evaluating three pre-trained models (SSL4EO-S12, SatlasPretrain, ImageNet) on five public crop classification datasets spanning five continents. No equations, parameter fits, predictions derived from the paper's own data, or self-referential derivations appear in the reported claims. All performance numbers (overall accuracy with 100 images, average accuracy requiring 900 images, model ranking) are measured outcomes on held-out test sets, not constructed from the inputs by definition. The geographic-generalizability claim rests on the external datasets themselves rather than any internal reduction or self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical evaluation paper; contains no mathematical derivations, free parameters, or postulated entities. Relies on standard supervised fine-tuning assumptions common to the field.

pith-pipeline@v0.9.0 · 5732 in / 1163 out tokens · 59701 ms · 2026-05-23T20:56:47.042345+00:00 · methodology

[1] [1]

Assessing and addressing the global state of food production data scarcity,

E. A. Kebede, H. Abou Ali, T. Clavelle, H. E. Froehlich, J. A. Gephart, S. Hartman, M. Herrero, H. Kerner, P. Mehta, C. Nakalembe et al., “Assessing and addressing the global state of food production data scarcity,” Nature Reviews Earth & Environment, vol. 5, no. 4, pp. 295–311, 2024

work page 2024

[2] [2]

Crop yield prediction using deep neural networks,

S. Khaki and L. Wang, “Crop yield prediction using deep neural networks,” Frontiers in plant science, vol. 10, p. 452963, 2019

work page 2019

[3] [3]

Crop yield assessment from remote sensing,

P. C. Doraiswamy, S. Moulin, P. W. Cook, and A. Stern, “Crop yield assessment from remote sensing,”Photogrammetric Engineering & Remote Sensing , vol. 69, no. 6, pp. 665–674, 2003

work page 2003

[4] [4]

Crop yield prediction using machine learning: A systematic literature re- view,

T. Van Klompenburg, A. Kassahun, and C. Catal, “Crop yield prediction using machine learning: A systematic literature re- view,” Computers and Electronics in Agriculture , vol. 177, p. 105709, 2020

work page 2020

[5] [5]

A GNN-RNN approach for harnessing geospatial and temporal information: Application to crop yield prediction,

J. Fan, J. Bai, Z. Li, A. Ortiz-Bobea, and C. P. Gomes, “A GNN-RNN approach for harnessing geospatial and temporal information: Application to crop yield prediction,” Proceedings of the AAAI Conference on Artificial Intelligence , vol. 36, no. 11, pp. 11 873–11 881, Jun. 2022. [Online]. Available: https://ojs.aaai.org/index.php/AAAI/article/view/21444

work page 2022

[6] [6]

Seasonality, crop type and crop phenology influence crop damage by wildlife herbivores in africa and asia,

E. M. Gross, B. P. Lahkar, N. Subedi, V . R. Nyirenda, L. L. Lichtenfeld, and O. Jakoby, “Seasonality, crop type and crop phenology influence crop damage by wildlife herbivores in africa and asia,” Biodiversity and Conservation , vol. 27, pp. 2029–2050, 2018

work page 2029

[7] [7]

Wildlife- friendly farming increases crop yield: evidence for ecological intensification,

R. F. Pywell, M. S. Heard, B. A. Woodcock, S. Hinsley, L. Ridding, M. Nowakowski, and J. M. Bullock, “Wildlife- friendly farming increases crop yield: evidence for ecological intensification,” Proceedings of the Royal Society B: Biological Sciences, vol. 282, no. 1816, p. 20151740, 2015

work page 2015

[8] [8]

Assessment of crop damage using space remote sensing and gis,

N. Silleos, K. Perakis, and G. Petsanis, “Assessment of crop damage using space remote sensing and gis,” International Journal of Remote Sensing , vol. 23, no. 3, pp. 417–427, 2002

work page 2002

[9] [9]

A systematic review on case studies of remote-sensing-based flood crop loss assessment,

M. S. Rahman and L. Di, “A systematic review on case studies of remote-sensing-based flood crop loss assessment,” Agriculture, vol. 10, no. 4, p. 131, 2020

work page 2020

[10] [10]

Accuracy assessment of the first Eu-wide crop type map with lucas data,

A. Verhegghen, R. d’Andrimont, F. Waldner, and M. Van der Velde, “Accuracy assessment of the first Eu-wide crop type map with lucas data,” in 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS , 2021, pp. 1990–1993

work page 2021

[11] [11]

Assessment and potential of the 2007 usda-nass cropland data layer for statewide annual land cover applications,

D. Luman and T. Tweddale, “Assessment and potential of the 2007 usda-nass cropland data layer for statewide annual land cover applications,” Technical Report INHS 2008 (49) , 2008

work page 2007

[12] [12]

Crop type mapping using lidar, sentinel-2 and aerial imagery with machine learning algorithms,

A. J. Prins and A. Van Niekerk, “Crop type mapping using lidar, sentinel-2 and aerial imagery with machine learning algorithms,” Geo-Spatial Information Science , vol. 24, no. 2, pp. 215–227, 2021

work page 2021

[13] [13]

Sentinel sar-optical fusion for crop type mapping using deep learning and google earth engine,

J. Adrian, V . Sagan, and M. Maimaitijiang, “Sentinel sar-optical fusion for crop type mapping using deep learning and google earth engine,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 175, pp. 215–235, 2021

work page 2021

[14] [14]

Deep learning with multi-scale temporal hybrid structure for robust crop mapping,

P. Tang, J. Chanussot, S. Guo, W. Zhang, L. Qie, P. Zhang, H. Fang, and P. Du, “Deep learning with multi-scale temporal hybrid structure for robust crop mapping,” ISPRS Journal of Photogrammetry and Remote Sensing , vol. 209, pp. 117–132, 2024

work page 2024

[15] [15]

Bridging optical and sar satellite image time series via contrastive feature extraction for crop classification,

Y . Yuan, L. Lin, Z.-G. Zhou, H. Jiang, and Q. Liu, “Bridging optical and sar satellite image time series via contrastive feature extraction for crop classification,”ISPRS Journal of Photogram- metry and Remote Sensing , vol. 195, pp. 222–232, 2023

work page 2023

[16] [16]

Transfer learning in environmental remote sensing,

Y . Ma, S. Chen, S. Ermon, and D. B. Lobell, “Transfer learning in environmental remote sensing,” Remote Sensing of Environment , vol. 301, p. 113924, 2024. [Online]. Available: https://www.sciencedirect.com/science/ article/pii/S0034425723004765

work page 2024

[17] [17]

Generalized few-shot semantic segmentation in remote sensing: Challenge and benchmark,

C. Broni-Bediako, J. Xia, J. Song, H. Chen, M. Siam, and N. Yokoya, “Generalized few-shot semantic segmentation in remote sensing: Challenge and benchmark,” IEEE Geoscience and Remote Sensing Letters, vol. 21, pp. 1–5, 2024

work page 2024

[18] [18]

Towards global crop maps with transfer learning,

A. Koukos, H.-W. Jo, V. Sitokonstantinou, I. Tsoumas, C. Kontoes, and W.-K. Lee, “Towards global crop maps with transfer learning,” in IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, 2024, pp. 1540–1545

work page 2024

[19] [19]

Fields of The World: A machine learning benchmark dataset for global agricultural field boundary segmentation,

H. Kerner, S. Chaudhari, A. Ghosh, C. Robinson, A. Ahmad, E. Choi, N. Jacobs, C. Holmes, M. Mohr, R. Dodhia, J. M. L. Ferres, and J. Marcus, “Fields of The World: A machine learning benchmark dataset for global agricultural field boundary segmentation,” 2024. [Online]. Available: https://arxiv.org/abs/2409.16252

work page arXiv 2024

[20] [20]

A distribution shift benchmark for smallholder agroforestry: Do foundation models improve geographic generalization?

S. Sachdeva, I. Lopez, C. Biradar, and D. Lobell, “A distribution shift benchmark for smallholder agroforestry: Do foundation models improve geographic generalization?” The Twelfth International Conference on Learning Representations 2024 Machine Learning for Remote Sensing (ML4RS) Workshop, 2024

work page 2024

[21] [21]

Lightweight, pre-trained transformers for remote sensing timeseries,

G. Tseng, R. Cartuyvels, I. Zvonkov, M. Purohit, D. Rolnick, and H. Kerner, “Lightweight, pre-trained transformers for remote sensing timeseries,” 2024. [Online]. Available: https://arxiv.org/abs/2304.14065

work page arXiv 2024

[22] [22]

CropHarvest: A global dataset for crop-type classification,

G. Tseng, I. Zvonkov, C. Nakalembe, and H. R. Kerner, “CropHarvest: A global dataset for crop-type classification,” in NeurIPS Datasets and Benchmarks, 2021. [Online]. Available: https://api.semanticscholar.org/CorpusID:248529758

work page 2021

[23] [23]

Fewshot learning on global multimodal embeddings for Earth observation tasks,

M. Allen, F. Dorr, J. A. Gallego-Mejia, L. Martínez-Ferrer, A. Jungbluth, F. Kalaitzis, and R. Ramos-Pollán, “Fewshot learning on global multimodal embeddings for Earth observation tasks,” 2023. [Online]. Available: https://arxiv.org/abs/2310.00119

work page arXiv 2023

[24] [24]

SSL4EO-S12: A large-scale multimodal, multitemporal dataset for self-supervised learning in Earth observation [software and data sets],

Y. Wang, N. A. A. Braham, Z. Xiong, C. Liu, C. M. Albrecht, and X. Zhu, “SSL4EO-S12: A large-scale multimodal, multitemporal dataset for self-supervised learning in Earth observation [software and data sets],” IEEE Geoscience and Remote Sensing Magazine, vol. 11, pp. 98–106, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:262975520

work page 2023

[25] [25]

SatlasPretrain: A large-scale dataset for remote sensing image understanding,

F. Bastani, P. Wolters, R. Gupta, J. Ferdinando, and A. Kembhavi, “SatlasPretrain: A large-scale dataset for remote sensing image understanding,” 2023 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 16726–16736, 2023. [Online]. Available: https://api.semanticscholar.org/CorpusID:258947021

work page 2023

[26] [26]

Cropland Data Layer,

USDA NASS, “Cropland Data Layer,” USDA NASS Marketing and Information Services Office, Washington, D.C., 2024. [Online]. Available: https://croplandcros.scinet.usda.gov/

work page 2024

[27] [27]

EuroCrops: The largest harmonized open crop dataset across the European Union,

M. Schneider, T. Schelte, F. Schmitz, and M. Körner, “EuroCrops: The largest harmonized open crop dataset across the European Union,” Scientific Data, vol. 10, no. 1, p. 612, Sep. 2023. [Online]. Available: https://doi.org/10.1038/s41597-023-02517-0

work page doi:10.1038/s41597-023-02517-0 2023

[28] [28]

The 10-m crop type maps in Northeast China during 2017–2019,

N. You, J. Dong, J. Huang, G. Du, G. Zhang, Y. He, T. Yang, Y. Di, and X. Xiao, “The 10-m crop type maps in Northeast China during 2017–2019,” Scientific Data, vol. 8, no. 1, p. 41, 2021. [Online]. Available: https://doi.org/10.1038/s41597-021-00827-9

work page doi:10.1038/s41597-021-00827-9 2017

[29] [29]

Crop type classification dataset for Western Cape, South Africa,

Western Cape Department of Agriculture and Radiant Earth Foundation, “Crop type classification dataset for Western Cape, South Africa,” Radiant MLHub, 2021, version 1.0. [Online]. Available: https://doi.org/10.34911/rdnt.j0co8q

work page doi:10.34911/rdnt.j0co8q 2021

[30] [30]

Massive soybean expansion in South America since 2000 and implications for conservation,

X.-P. Song, M. C. Hansen, P. Potapov, B. Adusei, J. Pickering, M. Adami, A. Lima, V. Zalles, S. V. Stehman, C. M. Di Bella et al., “Massive soybean expansion in South America since 2000 and implications for conservation,” Nature Sustainability, vol. 4, no. 9, pp. 784–792, 2021

work page 2000

[31] [31]

TorchGeo: Deep learning with geospatial data,

A. J. Stewart, C. Robinson, I. A. Corley, A. Ortiz, J. M. Lavista Ferres, and A. Banerjee, “TorchGeo: Deep learning with geospatial data,” in Proceedings of the 30th International Conference on Advances in Geographic Information Systems, ser. SIGSPATIAL ’22. Seattle, Washington: Association for Computing Machinery, Nov. 2022, pp. 1–12. [Online]. Availabl...

work page doi:10.1145/3557915.3560953 2022

[32] [32]

Regional and global shifts in crop diversity through the Anthropocene,

A. R. Martin, M. W. Cadotte, M. E. Isaac, R. Milla, D. Vile, and C. Violle, “Regional and global shifts in crop diversity through the Anthropocene,” PLoS One, vol. 14, no. 2, p. e0209788, 2019

work page 2019

[33] [33]

Deep residual learning for image recognition,

K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016

work page 2016

[34] [34]

U-Net: Convolutional networks for biomedical image segmentation,

O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi, Eds. Cham: Springer International Publishing, 2015, pp. 234–241

work page 2015

[35] [35]

An image is worth 16x16 words: Transformers for image recognition at scale,

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,”

work page

[36] [36]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

[Online]. Available: https://arxiv.org/abs/2010.11929

work page internal anchor Pith review Pith/arXiv arXiv 2010

[37] [37]

Improved Baselines with Momentum Contrastive Learning

X. Chen, H. Fan, R. Girshick, and K. He, “Improved baselines with momentum contrastive learning,” arXiv preprint arXiv:2003.04297, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2003

[38] [38]

Emerging properties in self-supervised vision transformers,

M. Caron, H. Touvron, I. Misra, H. Jégou, J. Mairal, P. Bojanowski, and A. Joulin, “Emerging properties in self-supervised vision transformers,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021, pp. 9650–9660

work page 2021

[39] [39]

Swin Transformer: Hierarchical vision transformer using shifted windows,

Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical vision transformer using shifted windows,” in 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 9992–10002

work page 2021

[40] [40]

ImageNet: A large-scale hierarchical image database,

J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255

work page 2009

[41] [41]

PyTorch Image Models,

R. Wightman, “PyTorch Image Models,” https://github.com/rwightman/pytorch-image-models, 2019

work page 2019

[42] [42]

SSL4EO-L: Datasets and foundation models for Landsat imagery,

A. Stewart, N. Lehmann, I. Corley, Y. Wang, Y.-C. Chang, N. Ait Ali Braham, S. Sehgal, C. Robinson, and A. Banerjee, “SSL4EO-L: Datasets and foundation models for Landsat imagery,” Advances in Neural Information Processing Systems, vol. 36, 2024

work page 2024