BELDE: Building a Large-scale Earth-observation Land-cover Dataset for Europe

Alptekin Temizel; Ümit Mert Çağlar

arXiv: 2606.20909 · v1 · pith:PEOZTVVSnew · submitted 2026-06-18 · cs.CV · cs.AI · eess.IV

Pith reviewed 2026-06-26 17:47 UTC · model grok-4.3

classification cs.CV · cs.AI · eess.IV

keywords land-cover segmentation · Sentinel-2 · Earth observation · semantic segmentation · remote sensing dataset · Europe · RGB imagery · WorldCover

The pith

BELDE supplies 1,088,385 curated Sentinel-2 RGB and land-cover map pairs across Europe at 10 m resolution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs and releases BELDE to address the shortage of large-scale, publicly accessible RGB datasets for semantic segmentation in Earth observation. It pairs Sentinel-2 true-color imagery with ESA WorldCover labels for seven land-cover classes over the full European continent. Baseline models trained on the data reach 83 percent F1 on a held-out European test set but fall to 66.4 percent and 58.3 percent on out-of-distribution sets from the United States and Korea, respectively. The work also supplies smaller companion sets for Korea and California-Nevada to support studies of geographic domain shift.

Core claim

BELDE contains 1,088,385 curated image-segmentation map pairs spanning Europe with 7 land-cover classes at 10 m spatial resolution, constructed from Sentinel-2 true-color images and ESA WorldCover data annotations; the authors additionally release BELDE-K (16,607 pairs) for Korea and BELDE-CA-NV (88,155 pairs) for California and Nevada, and report baseline F1 scores of 83.0 percent in-domain versus 66.4 percent and 58.3 percent on the cross-domain sets.

What carries the argument

The BELDE dataset of paired Sentinel-2 RGB images and seven-class land-cover segmentation maps derived from ESA WorldCover annotations, which supplies the scale and geographic breadth needed for training and evaluating remote-sensing segmentation models.

If this is right

Models trained on BELDE reach 83.0 percent F1 on the European test set.
The same models drop to 66.4 percent F1 on BELDE-CA-NV and 58.3 percent on BELDE-K, quantifying geographic domain shift.
The dataset and its cross-region companions enable controlled studies of model generalization beyond a single continent.
Public release of the full collection lowers the barrier to developing transferable Earth-observation segmentation systems.
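
The size of the domain gap follows directly from the three reported F1 scores. The short sketch below works through that arithmetic; the helper name `domain_gap` is illustrative and not part of the paper, and only the three percentages come from the source.

```python
# F1 scores in percent, as reported in the abstract.
IN_DOMAIN_F1 = 83.0
CROSS_DOMAIN_F1 = {"BELDE-CA-NV": 66.4, "BELDE-K": 58.3}


def domain_gap(in_domain: float, out_domain: float) -> tuple[float, float]:
    """Return (absolute drop in F1 points, relative drop in percent)."""
    absolute = in_domain - out_domain
    relative = 100.0 * absolute / in_domain
    return round(absolute, 1), round(relative, 1)


gaps = {name: domain_gap(IN_DOMAIN_F1, f1) for name, f1 in CROSS_DOMAIN_F1.items()}
for name, (abs_drop, rel_drop) in gaps.items():
    print(f"{name}: -{abs_drop} points ({rel_drop}% relative)")
```

Read this way, the California-Nevada set loses 16.6 F1 points (about a fifth of the in-domain score) and the Korea set loses 24.7 points (roughly 30 percent).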

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the WorldCover labels prove consistent enough, future work could combine BELDE with similar continental-scale sets to train models that require less region-specific retraining.
The measured cross-domain drops suggest that explicit domain-adaptation layers or style-transfer preprocessing may be needed before models trained on BELDE can be deployed reliably outside Europe.
The dataset size makes it feasible to test whether scaling laws observed in natural-image segmentation also hold for satellite RGB data.

Load-bearing premise

ESA WorldCover annotations supply sufficiently accurate and spatially consistent labels to serve as reliable ground truth across the full diversity of European landscapes.

What would settle it

An independent high-resolution ground-truth survey in multiple European regions that reveals systematic label errors in one or more of the seven WorldCover classes at a scale large enough to change reported F1 scores by more than a few points.
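
One way such a survey could be scored is to tabulate a confusion matrix between the surveyed reference classes and the WorldCover-derived labels, then read off per-class F1. The sketch below is a minimal illustration under that assumption; the function names and the toy arrays are hypothetical, and the paper does not describe an audit of this kind.

```python
import numpy as np


def confusion_matrix(reference, labels, num_classes):
    """Rows index the reference class, columns the label being audited."""
    reference = np.asarray(reference).ravel()
    labels = np.asarray(labels).ravel()
    flat = reference * num_classes + labels
    counts = np.bincount(flat, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def per_class_f1(cm):
    """F1 per class from a square confusion matrix; 0 where a class is absent."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    denom = 2 * tp + fp + fn
    return np.divide(2 * tp, denom, out=np.zeros_like(tp), where=denom > 0)


# Toy example with three classes (hypothetical values, not survey data).
reference = [0, 0, 1, 1, 2, 2]
audited = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(reference, audited, num_classes=3)
f1 = per_class_f1(cm)
print(cm)
print(f1)
```

A class whose F1 against the reference is low would be a candidate for the systematic label errors described above.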

Figures

Figures reproduced from arXiv: 2606.20909 by Alptekin Temizel, Ümit Mert Çağlar.

**Figure 1.** Workflow of the BELDE dataset generation. Automated spatial querying downloads co-registered Sentinel-2 true-color tiles and ESA WorldCover 2021 maps. The pipeline applies rule-based taxonomic remapping, strict no-data filtering, and spatial patch slicing to construct over one million curated 256 × 256 pixel image-mask pairs.

**Figure 2.** The data acquisition areas of BELDE, BELDE-K, and BELDE-CA-NV, corresponding to Europe, the Republic of Korea, and California-Nevada, respectively.

**Figure 3.** Class composition violin plots for BELDE (solid), BELDE-CA-NV (forward slash hatched) and BELDE-K (backward slash hatched) in order and 33°N-60°N latitude (…

**Figure 4.** The experimental setup that tests segmentation models trained with BELDE on its extensions, BELDE-K and BELDE-CA-NV.
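
The three patch-construction steps named in the Figure 1 caption (taxonomic remapping, strict no-data filtering, and 256 × 256 patch slicing) can be sketched with NumPy as follows. This is an illustration only: the `REMAP` table is a hypothetical grouping of WorldCover codes into seven indices, the no-data value of 0 and the ignore marker of 255 are assumptions, and the paper's actual rules and code are not reproduced here.

```python
import numpy as np

PATCH = 256      # patch edge in pixels, from the Figure 1 caption
NO_DATA = 0      # assumed no-data value in the label raster
IGNORE = 255     # hypothetical marker for codes missing from REMAP

# Hypothetical grouping of WorldCover codes into 7 class indices.
# The paper's real remapping rules are not stated in this page.
REMAP = {10: 0, 95: 0, 20: 1, 30: 1, 40: 2, 50: 3, 60: 4, 100: 4, 70: 5, 80: 6, 90: 6}


def remap_mask(mask):
    """Map raw label codes to class indices with a lookup table."""
    lut = np.full(256, IGNORE, dtype=np.uint8)
    for code, cls in REMAP.items():
        lut[code] = cls
    return lut[mask]


def slice_pairs(image, mask, patch=PATCH):
    """Yield non-overlapping (image, mask) patches that contain no no-data pixels."""
    height, width = mask.shape
    for top in range(0, height - patch + 1, patch):
        for left in range(0, width - patch + 1, patch):
            m = mask[top:top + patch, left:left + patch]
            if (m == NO_DATA).any():
                continue  # strict filtering: drop the whole patch
            yield image[top:top + patch, left:left + patch], remap_mask(m)


# Toy tile: 512 x 512 pixels of a single code, with one no-data pixel.
image = np.zeros((512, 512, 3), dtype=np.uint8)
mask = np.full((512, 512), 40, dtype=np.uint8)
mask[0, 0] = NO_DATA
pairs = list(slice_pairs(image, mask))
print(len(pairs))
```

In the toy tile, the single no-data pixel removes the top-left patch, so three of the four candidate patches survive.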

Abstract

Earth observation imagery plays a critical role in environmental monitoring, urban planning, disaster assessment, and climate analysis. While multi-spectral sensors are increasingly available, true-color (RGB) imagery remains widely used due to the power, cost, and deployment constraints of many satellite and aerial platforms. However, existing land-cover segmentation datasets are often limited in geographic coverage, scale, or public accessibility. To bridge this gap, we introduce BELDE (Building a Large-scale Earth-observation Land-cover Dataset for Europe), a publicly available dataset tailored for RGB-based remote sensing semantic segmentation. Constructed from Sentinel-2 true-color images and ESA WorldCover data annotations, BELDE contains 1,088,385 curated image-segmentation map pairs spanning Europe with 7 land-cover classes at 10 m spatial resolution, making it one of the largest publicly available RGB land-cover segmentation datasets for Earth observation. To facilitate cross-region generalization studies, we additionally introduce BELDE-K (16,607 pairs) covering the Republic of Korea and BELDE-CA-NV (88,155 pairs) covering California and Nevada in the United States. We establish baseline results using multiple semantic segmentation architectures and evaluate both in-domain and cross-domain performance. Models trained on BELDE achieve an F1 score of 83.0% on the European test set, while performance decreases to 66.4% on BELDE-CA-NV and 58.3% on BELDE-K, highlighting the challenges posed by out-of-distribution geographic domain shift. By providing a continental-scale RGB segmentation and evaluation benchmark, BELDE supports the development of robust and transferable Earth observation models. The dataset and benchmark resources will be publicly released.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

BELDE releases a large European RGB land-cover dataset with cross-domain test sets, but its value hinges on unverified WorldCover labels.

BELDE is a dataset paper that assembles over a million Sentinel-2 RGB patches paired with 7-class land-cover maps across Europe at 10 m resolution, plus two smaller out-of-domain sets from Korea and California-Nevada.

What stands out is the scale combined with the explicit cross-region tests. Most prior land-cover collections either stay smaller, cover narrower geographies, or rely on more than RGB bands. Releasing the data publicly and running baselines on several segmentation models gives a concrete starting point for work on geographic generalization.

The main limitation is the labels. Everything rests on ESA WorldCover annotations with no reported accuracy checks, confusion matrices against national maps, or manual audits on a subset. WorldCover has documented errors in heterogeneous or transitional areas, so the 83% in-domain F1 and the drops to 58-66% out of domain are difficult to read as pure model performance. Curation rules and exact train-test construction are also not detailed enough to reproduce or audit.

This is useful for remote-sensing and CV groups that need large RGB segmentation benchmarks. It deserves peer review because the scale and the cross-domain splits are concrete additions, even if referees will need to see label-quality evidence before the numbers can be taken at face value.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces BELDE, a publicly available dataset of 1,088,385 curated Sentinel-2 RGB image and ESA WorldCover segmentation map pairs covering Europe at 10 m resolution with 7 land-cover classes. It also releases smaller cross-domain sets BELDE-K (16,607 pairs, Republic of Korea) and BELDE-CA-NV (88,155 pairs, California/Nevada). Baseline semantic segmentation models achieve 83.0% F1 on the European test set, dropping to 66.4% and 58.3% on the out-of-domain sets, with plans for public release to support generalization research in Earth observation.

Significance. If the WorldCover-derived labels can be shown to be sufficiently accurate, BELDE would constitute a meaningful contribution by supplying one of the largest public RGB land-cover segmentation resources at continental scale, together with explicit cross-domain benchmarks that directly address a recognized challenge in remote-sensing model transfer. The scale and planned public release are strengths that would facilitate reproducible work on domain shift.

major comments (2)

[Abstract] The central claim that BELDE supplies reliable large-scale training/evaluation data rests on ESA WorldCover supplying accurate 7-class labels, yet the manuscript reports no per-class accuracy figures, confusion matrices versus CORINE or national maps, or manual audit on any validation subset; without this, the reported 83.0% in-domain F1 and the cross-domain drops cannot be unambiguously attributed to model performance rather than inherited label noise.
[Abstract] No information is supplied on the curation criteria used to select the 1,088,385 pairs, the train-test split methodology, or the training hyperparameters for the baseline models; these omissions prevent verification that the empirical results support the utility claims made for the dataset.

minor comments (1)

[Abstract] The abstract would be clearer if it explicitly listed the seven land-cover classes and the precise spatial extent (e.g., bounding box or country coverage) of the European portion.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment below, indicating planned revisions where appropriate.

Referee: [Abstract] The central claim that BELDE supplies reliable large-scale training/evaluation data rests on ESA WorldCover supplying accurate 7-class labels, yet the manuscript reports no per-class accuracy figures, confusion matrices versus CORINE or national maps, or manual audit on any validation subset; without this, the reported 83.0% in-domain F1 and the cross-domain drops cannot be unambiguously attributed to model performance rather than inherited label noise.

Authors: We agree that label provenance and quality are central to interpreting the reported metrics. BELDE directly adopts the 7-class labels from the ESA WorldCover product without additional re-labeling. WorldCover has published global validation results (we will cite the relevant ESA technical reports in the revision). The manuscript does not contain independent per-class comparisons to CORINE, national maps, or a manual audit of a validation subset. We will add a new subsection on label source and limitations, including a brief qualitative review of label consistency on a small random subset of tiles, and will explicitly note that the 83.0% in-domain F1 reflects agreement with WorldCover labels rather than an absolute ground-truth accuracy. A full quantitative cross-product validation lies outside the scope of this dataset release paper.

revision: partial

Referee: [Abstract] No information is supplied on the curation criteria used to select the 1,088,385 pairs, the train-test split methodology, or the training hyperparameters for the baseline models; these omissions prevent verification that the empirical results support the utility claims made for the dataset.

Authors: We acknowledge that the abstract is terse on these points. The full manuscript contains a Data Construction section describing curation (cloud-free Sentinel-2 tile selection, geographic tiling, and exclusion of low-quality WorldCover regions), a random tile-level train/validation/test split designed to minimize spatial leakage, and the exact training settings (optimizer, learning rate schedule, augmentation, and early stopping) used for the U-Net, DeepLabv3+, and other baselines. To improve clarity we will (i) add a concise summary of curation and split strategy to the abstract and (ii) ensure all hyperparameter values appear in a dedicated table or appendix. The accompanying code repository will release the exact configuration files and split indices.

revision: yes
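
The tile-level split mentioned in the simulated response can be illustrated with a small sketch: every patch inherits the split of its parent tile, so patches cut from the same tile never straddle train and test. The function name, the 80/10/10 fractions, and the toy identifiers below are assumptions for illustration; the paper's actual split procedure and ratios are not given in this page.

```python
import random
from collections import defaultdict


def tile_level_split(patch_to_tile, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole tiles to train/val/test, then route patches by tile."""
    tiles = sorted(set(patch_to_tile.values()))
    random.Random(seed).shuffle(tiles)  # seeded for reproducibility
    n_train = int(fractions[0] * len(tiles))
    n_val = int(fractions[1] * len(tiles))

    split_of_tile = {}
    for i, tile in enumerate(tiles):
        if i < n_train:
            split_of_tile[tile] = "train"
        elif i < n_train + n_val:
            split_of_tile[tile] = "val"
        else:
            split_of_tile[tile] = "test"

    splits = defaultdict(list)
    for patch, tile in patch_to_tile.items():
        splits[split_of_tile[tile]].append(patch)
    return dict(splits)


# Toy data: 10 tiles with 4 patches each (hypothetical identifiers).
patch_to_tile = {f"tile{t}_patch{p}": f"tile{t}" for t in range(10) for p in range(4)}
splits = tile_level_split(patch_to_tile)
tiles_in = {name: {patch_to_tile[p] for p in patches} for name, patches in splits.items()}
print({name: len(patches) for name, patches in splits.items()})
```

Splitting by tile rather than by patch reduces, but does not eliminate, spatial leakage, since neighbouring tiles can still share landscape features.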

standing simulated objections not resolved

A comprehensive, continent-wide quantitative validation of WorldCover labels against CORINE or national maps would require a separate, resource-intensive study and is not feasible within the current manuscript.

Circularity Check

0 steps flagged

No circularity; dataset construction paper with no derivations or self-referential steps

The manuscript constructs BELDE by pairing Sentinel-2 RGB tiles with ESA WorldCover labels and reports empirical baseline F1 scores on in-domain and cross-domain splits. No equations, parameter fits, or predictions are present. The central claims (dataset scale, 83% in-domain F1) are direct measurements or counts, not quantities derived from prior results by the paper's own logic. The reliance on WorldCover as ground truth is an external assumption subject to independent verification, not a self-definitional or fitted-input reduction. No self-citations are invoked as load-bearing uniqueness theorems. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical dataset-construction paper. No free parameters, mathematical axioms, or invented scientific entities are introduced; the seven land-cover classes follow the pre-existing ESA WorldCover taxonomy.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Benchmarking the Alignment of Data-Quality Metrics, Human Judgment and Land-Cover Segmentation Performance for Earth Observation
eess.IV 2026-06 unverdicted novelty 4.0

Automatic metrics such as FID are misaligned with human perception and downstream segmentation performance for Earth observation datasets and synthetic counterparts.

Reference graph

Works this paper leans on

32 extracted references · 3 linked inside Pith · cited by 1 Pith paper

[1] Adimoolam, Y.K., Poullis, C., Averkiou, M.: Data leakage detection and de-duplication in large scale geospatial image datasets. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 72–81 (2026)

[2] Al-Emadi, S.A., Yang, Y., Ofli, F.: Analysing satellite imagery classification under spatial domain shift across geographic regions. International Journal of Computer Vision 133(11), 7672–7709 (2025)

[3] Audebert, N., Le Saux, B., Lefèvre, S.: Semantic segmentation of earth observation data using multimodal and multi-scale deep networks. In: Asian conference on computer vision. pp. 180–196. Springer (2016)

[4] Boguszewski, A., Batorski, D., Ziemba-Jankowska, N., Dziedzic, T., Zambrzycka, A.: LandCover.ai: Dataset for automatic mapping of buildings, woodlands, water and roads from aerial imagery. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. pp. 1102–1110 (June 2021)

[5] Brown, C.F., Brumby, S.P., Guzder-Williams, B., Birch, T., Hyde, S.B., Mazzariello, J., Czerwinski, W., Pasquarella, V.J., Haertel, R., Ilyushchenko, S., et al.: Dynamic world, near real-time global 10 m land use land cover mapping. Scientific data 9(1), 251 (2022)

[6] Çağlar, Ü.M., Temizel, A.: Grounding synthetic data generation with vision and language models. arXiv preprint arXiv:2603.09625 (2026)

[7] Çağlar, Ü.M., Temizel, A.: LALE: Lightweight-transformer architecture for land-cover estimation. arXiv preprint arXiv:2606.02092 (2026)

[8] Chaurasia, A., Culurciello, E.: Linknet: Exploiting encoder representations for efficient semantic segmentation. In: 2017 IEEE visual communications and image processing (VCIP). pp. 1–4. IEEE (2017)

[9] Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking atrous convolution for semantic image segmentation. arXiv preprint arXiv:1706.05587 (2017)

[10] Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: ECCV (2018)

[11] Clasen, K.N., Hackel, L., Burgert, T., Sumbul, G., Demir, B., Markl, V.: reBEN: Refined bigearthnet dataset for remote sensing image analysis. In: IGARSS 2025 - 2025 IEEE International Geoscience and Remote Sensing Symposium. pp. 1264–1268. IEEE (2025)

[12] Demir, I., Koperski, K., Lindenbaum, D., Pang, G., Huang, J., Basu, S., Hughes, F., Tuia, D., Raskar, R.: Deepglobe 2018: A challenge to parse the earth through satellite images. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. pp. 172–181 (2018)

[13] Helber, P., Bischke, B., Dengel, A., Borth, D.: EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2019)

[14] Hirayama, S., Tadono, T., Mizukami, Y., Ohki, M., Imamura, K., Hirade, N., Ohgushi, F., Dotsu, M., Yamanokuchi, T., Nasahara, K.N.: Generation of the high-resolution land-use and land-cover map in japan version 21.11. In: IEEE International Geoscience and Remote Sensing Symposium (IGARSS). pp. 4339–

[15] Li, Y., Hu, J., Wen, Y., Evangelidis, G., Salahi, K., Wang, Y., Tulyakov, S., Ren, J.: Rethinking vision transformers for mobilenet size and speed. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 16889–16900 (2023)

[16] Lin, T.Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2117–2125 (2017)

[17] Ma, L., Liu, Y., Zhang, X., Ye, Y., Yin, G., Johnson, B.A.: Deep learning in remote sensing applications: A meta-analysis and review. ISPRS journal of photogrammetry and remote sensing 152, 166–177 (2019)

[18] Miranda, M., Pathak, D., Helber, P., Bischke, B., Najjar, H., Mena, F., Sanchez, C., Pai, A., Arenas, D., Valdenegro-Toro, M., Charfuelan, M., Nuske, M., Dengel, A.: Yieldsat: A multimodal benchmark dataset for high-resolution crop yield prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 22920–229... (2026)

[19] Ranftl, R., Bochkovskiy, A., Koltun, V.: Vision transformers for dense prediction. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 12179–12188 (2021)

[20] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. pp. 234–241. Springer (2015)

[21] Sumbul, G., Charfuelan, M., Demir, B., Markl, V.: Bigearthnet: A large-scale benchmark archive for remote sensing image understanding. arXiv preprint arXiv:1902.06148 (2019)

[22] Touvron, H., Cord, M., Jégou, H.: DeiT III: Revenge of the vit. In: European conference on computer vision. pp. 516–533. Springer (2022)

[23] Truong, V.T., Hirayama, S., Phan, D.C., Hoang, T.T., Tadono, T., Nasahara, K.N.: Jaxa’s new high-resolution land use land cover map for vietnam using a time-feature convolutional neural network. Scientific reports 14(1), 3926 (2024)

[24] Tu, Z., Talebi, H., Zhang, H., Yang, F., Milanfar, P., Bovik, A., Li, Y.: Maxvit: Multi-axis vision transformer. In: European conference on computer vision. pp. 459–479. Springer (2022)

[25] Vasu, P.K.A., Gabriel, J., Zhu, J., Tuzel, O., Ranjan, A.: FastViT: A fast hybrid vision transformer using structural reparameterization. In: Proceedings of the IEEE/CVF international conference on computer vision. pp. 5785–5795 (2023)

[26] Wang, J., Zheng, Z., Ma, A., Lu, X., Zhong, Y.: LoveDA: A remote sensing land-cover dataset for domain adaptive semantic segmentation. In: Vanschoren, J., Yeung, S. (eds.) Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks. vol. 1. Curran Associates, Inc. (2021)

[27] Xie, E., Wang, W., Yu, Z., Anandkumar, A., Alvarez, J.M., Luo, P.: SegFormer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems 34, 12077–12090 (2021)

[28] Zanaga, D., Van De Kerchove, R., Daems, D., De Keersmaecker, W., Brockmann, C., Kirches, G., Wevers, J., Cartus, O., Santoro, M., Fritz, S., et al.: ESA worldcover 10 m 2021 v200 (2022)

[29] Zeng, L., Marsocci, V., Zhao, W., Nascetti, A., Vergauwen, M.: Neighbormae: Exploiting spatial dependencies between neighboring earth observation images in masked autoencoders pretraining. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 20597–20607 (June 2026)

[30] Zhang, P., Zhang, Y., Xu, L., Lin, J., Guo, Z., Wang, F., Yang, X., Wei, K., Wang, L.: Geovis: Geospatially rewarded visual search for remote sensing visual grounding. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 14335–14345 (June 2026)

[31] Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 2881–2890 (2017)

[32] Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N., Liang, J.: Unet++: A nested u-net architecture for medical image segmentation. In: International workshop on deep learning in medical image analysis (2018)
