Clay-CNN Hybrids: Leveraging Geospatial Foundation Models as Auxiliary Context for Landslide Detection

Huong Binh Vu

arXiv: 2606.14081 · v3 · pith:TD7DYQAMnew · submitted 2026-06-12 · 💻 cs.CV · cs.AI · cs.LG · eess.IV

Pith reviewed 2026-06-27 05:17 UTC · model grok-4.3

keywords: landslide detection, geospatial foundation models, U-Net, LoRA, semantic segmentation, Sentinel-2, disaster mapping, class imbalance

The pith

Hybrid U-Net with Clay GFM context as auxiliary input reaches 64.5% F1 on landslide segmentation, beating both Clay-only and plain U-Net baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates Clay v1.5, a geospatial foundation model, as a way to improve pixel-level landslide segmentation on the highly imbalanced Landslide4Sense benchmark. It tests three setups: Clay as the main encoder with terrain fusion, a U-Net that receives Clay semantic features at the bottleneck, and a standard U-Net. The hybrid U-Net plus two-stage LoRA on Clay yields the highest test F1 of 64.5 percent, compared with 55.2 percent for Clay alone and 59.9 percent for the baseline. Clay underperforms when used standalone because it lacks multi-scale skip connections, yet its pretrained representations reliably help when added as extra context. The results indicate that foundation models deliver the most value for this task when they supplement rather than substitute for convolutional architectures that preserve spatial detail.
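
To make "Clay semantic features at the bottleneck" concrete, the sketch below only traces tensor shapes; it is not the paper's implementation. The four downsampling stages come from the Figure 4 caption. Everything else is an assumption for illustration: 128-pixel chips, a ViT patch size of 8, the channel widths, and fusion by channel concatenation. All function names are hypothetical.

```python
def downsampled_side(input_side: int, num_downsamples: int) -> int:
    """Side length of a square feature map after repeated 2x downsampling."""
    side = input_side
    for _ in range(num_downsamples):
        if side % 2:
            raise ValueError("side must stay divisible by 2 at every stage")
        side //= 2
    return side


def token_grid_side(input_side: int, patch_size: int) -> int:
    """Side length of the token grid a ViT produces for a square chip."""
    if input_side % patch_size:
        raise ValueError("input side must be divisible by the patch size")
    return input_side // patch_size


def fused_bottleneck_shape(unet_channels, context_channels,
                           input_side=128, num_downsamples=4, patch_size=8):
    """Shape of the bottleneck after concatenating resized context features.

    Assumes concatenation along the channel axis; the paper's fusion
    operator is not described on this page.
    """
    side = downsampled_side(input_side, num_downsamples)
    tokens = token_grid_side(input_side, patch_size)
    # Factor by which the token grid must be resized to match the bottleneck.
    resize_factor = side / tokens
    return (unet_channels + context_channels, side, side), resize_factor


# Hypothetical channel widths, chosen only to show the arithmetic.
shape, resize_factor = fused_bottleneck_shape(unet_channels=512, context_channels=256)
print(shape, resize_factor)  # (768, 8, 8) 0.5
```

Under these assumptions the foundation model's features only meet the U-Net at its coarsest resolution, while the skip connections still carry the fine spatial detail to the decoder.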

Core claim

The hybrid U-Net + Clay model with two-stage Low-Rank Adaptation (LoRA) achieved the best test F1 of 64.5 +/- 1.8% over three seeds, surpassing the Clay-only backbone (55.2 +/- 3.6%) and the U-Net baseline (59.9%). Clay as a standalone encoder underperformed the U-Net due to the absence of multi-scale skip connections, but its pretrained representations consistently improved performance when injected as auxiliary context. These findings suggest that GFMs are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them.
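
For readers less familiar with the metric, here is a minimal sketch of pixel-level F1 computed from confusion counts. The counts in the usage lines are made up to show the arithmetic and are not results from the paper.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 for the positive (landslide) class.

    tp, fp, fn are pixel counts: true positives, false positives,
    false negatives. True negatives do not enter F1 at all.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return precision, recall, f1


# Illustrative counts only.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(precision, recall, f1)  # 0.8 0.8 0.8
```

Because true negatives are ignored, the score is not inflated by the large number of background pixels.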

What carries the argument

Two-stage Low-Rank Adaptation (LoRA) that injects Clay semantic context into the U-Net bottleneck as auxiliary input while preserving the convolutional skip connections.
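
As background, the standard LoRA formulation replaces a full weight update with a low-rank one. The symbols below are generic LoRA notation, not taken from the paper, and the rank, scaling factor, and choice of adapted Clay layers are not given on this page. Neither is the content of the two stages.

```latex
W' = W_0 + \frac{\alpha}{r}\, B A, \qquad
B \in \mathbb{R}^{d_{\text{out}} \times r},\quad
A \in \mathbb{R}^{r \times d_{\text{in}}},\quad
r \ll \min(d_{\text{in}}, d_{\text{out}})
```

Here $W_0$ is the frozen pretrained weight, and only $A$ and $B$ receive gradients, so the pretrained representation is perturbed rather than overwritten.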

If this is right

GFMs improve landslide segmentation most when supplied as auxiliary context rather than used as the sole encoder.
Two-stage LoRA provides an effective way to adapt a pretrained GFM inside a hybrid segmentation pipeline.
Convolutional skip connections remain necessary for spatially precise outputs even when strong semantic features are available.
Pretrained geospatial representations help mitigate extreme class imbalance in post-event mapping tasks.
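
A short sketch of why the roughly 2% positive rate matters. It shows that a model predicting background everywhere scores high accuracy but zero F1, and it computes an inverse-frequency class weight. That weight is a common heuristic assumed here for illustration; the page does not say how the paper weights positive pixels.

```python
def all_background_metrics(positive_fraction: float, total_pixels: int = 1_000_000):
    """Accuracy and F1 of a degenerate model that never predicts landslide."""
    positives = round(positive_fraction * total_pixels)
    negatives = total_pixels - positives
    tp, fp, fn, tn = 0, 0, positives, negatives
    accuracy = (tp + tn) / total_pixels
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


def inverse_frequency_weight(positive_fraction: float) -> float:
    """Weight on positive pixels that balances the two classes in the loss."""
    if not 0.0 < positive_fraction < 1.0:
        raise ValueError("positive_fraction must be strictly between 0 and 1")
    return (1.0 - positive_fraction) / positive_fraction


accuracy, f1 = all_background_metrics(0.02)
weight = inverse_frequency_weight(0.02)
print(accuracy, f1, round(weight, 2))
```

At a 2% positive rate the degenerate model reaches about 98% accuracy with an F1 of zero, and the balancing weight comes out near 49.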

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same hybrid pattern may transfer to other geospatial segmentation problems that share high class imbalance and multi-band imagery.
Alternative fusion locations or other GFMs could be tested to see if bottleneck injection is optimal.
The performance gap suggests that real-time disaster pipelines could adopt lightweight LoRA adapters on existing CNN backbones without full model replacement.

Load-bearing premise

The measured F1 gains are caused by Clay's semantic context rather than by the specific two-stage LoRA schedule, random seed variation, or unstated differences in training protocol or data augmentation.

What would settle it

Retraining the hybrid architecture with the identical two-stage LoRA schedule but replacing Clay embeddings with random vectors or embeddings from an unrelated model and checking whether the F1 advantage disappears.
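
The control described above can be laid out as a small run grid. This is a sketch: the condition labels, the dictionary keys, and the seed values are hypothetical and do not come from the paper. The only thing that varies within a seed is the source of the context vectors.

```python
from itertools import product

# Hypothetical labels for the treatment and the two controls.
CONTEXT_SOURCES = ("clay", "random_vectors", "unrelated_model")


def build_control_runs(seeds, schedule="two_stage_lora"):
    """One run per (context source, seed), all sharing the same schedule."""
    return [
        {"context": context, "schedule": schedule, "seed": seed}
        for context, seed in product(CONTEXT_SOURCES, seeds)
    ]


def advantage_over_control(mean_f1_by_context: dict, control: str) -> float:
    """F1 advantage of the Clay condition over a named control condition."""
    return mean_f1_by_context["clay"] - mean_f1_by_context[control]


runs = build_control_runs(seeds=(0, 1, 2))
print(len(runs))  # 9
```

If the advantage over both controls stays near the advantage over the plain U-Net, the gain would be hard to attribute to Clay's pretrained semantics.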

Figures

Figures reproduced from arXiv: 2606.14081 by Huong Binh Vu.

Figure 2: Exploratory data analysis. (A) Mean spectral signatures of landslide vs. background pixels across the 14 input bands. (B) Per-band separability index

Figure 3: Architecture 1: Clay as primary encoder with multi-scale terrain fusion. (A) The Clay v1.5 ViT encoder design from Kaushik

Figure 4: Architecture 2: Hybrid U-Net + Clay. The U-Net backbone encodes all 14 bands through four downsampling stages with full skip connections, while

Figure 5: Test-set performance of Architecture 2 (Model 7a, representative

Figure 6: Qualitative predictions of Architecture 2 on six test chips ordered by decreasing positive pixel fraction (70.42%–4.99%; representative seed 42).

Figure 7: Signed Grad-CAM fusion analysis for Architecture 2 on five test chips spanning dense to empty scar coverage. Columns: ground truth mask, binary

Original abstract

Rapid post-event landslide mapping is essential for disaster response but remains difficult to automate due to extreme class imbalance. This study evaluates whether Clay v1.5, a Geospatial Foundation Model (GFM), can improve pixel-level landslide segmentation on the Landslide4Sense (L4S) benchmark, which contains 3,799 training chips with 14 Sentinel-2 and terrain bands and approximately 2% positive pixels. We compare three strategies: Clay as the primary encoder with multi-scale residual terrain fusion, a U-Net backbone augmented with Clay semantic context at the bottleneck, and a standard U-Net baseline. The hybrid U-Net + Clay model with two-stage Low-Rank Adaptation (LoRA) achieved the best test F1 of 64.5 +/- 1.8% over three seeds, surpassing the Clay-only backbone (55.2 +/- 3.6%) and the U-Net baseline (59.9%). Clay as a standalone encoder underperformed the U-Net due to the absence of multi-scale skip connections, but its pretrained representations consistently improved performance when injected as auxiliary context. These findings suggest that GFMs are most effective for landslide detection when they complement spatially detailed convolutional architectures rather than replace them.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Modest F1 lift on Landslide4Sense from Clay-injected U-Net via two-stage LoRA, but missing methods leave the cause unclear.

The headline result is a 64.5 F1 on the Landslide4Sense test set from a U-Net that receives Clay features at the bottleneck, using two-stage LoRA. That edges out the plain U-Net at 59.9 and the Clay-only encoder at 55.2, with standard deviations from three seeds. The paper tests the practical claim that geospatial foundation models help most when they supply semantic context to a convolutional backbone rather than replace it.
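
A rough way to read the reported gaps is to express them in units of seed-to-seed spread. The sketch below uses only the figures quoted on this page. It is a heuristic, not a significance test, and because no spread is reported for the U-Net baseline it treats that spread as zero, which flatters the gap.

```python
import math


def gap_in_spread_units(mean_a, std_a, mean_b, std_b=0.0):
    """Difference of two means divided by their combined standard deviation."""
    combined = math.sqrt(std_a ** 2 + std_b ** 2)
    if combined == 0:
        raise ValueError("at least one spread must be non-zero")
    return (mean_a - mean_b) / combined


# Reported test F1 (percent): hybrid 64.5 +/- 1.8, Clay-only 55.2 +/- 3.6,
# U-Net 59.9 with no spread given.
hybrid_vs_unet = gap_in_spread_units(64.5, 1.8, 59.9)
hybrid_vs_clay = gap_in_spread_units(64.5, 1.8, 55.2, 3.6)
print(round(hybrid_vs_unet, 2), round(hybrid_vs_clay, 2))
```

Both gaps come out at roughly two to three spread units, which with only three seeds is suggestive rather than conclusive.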

The work is straightforward on its own terms. It runs the three-way comparison on a real imbalanced segmentation task and notes that the Clay-only model suffers from missing skip connections. Reporting means and standard deviations over seeds is better than single-run claims. The hybrid recipe itself has not previously been documented for this exact benchmark.

The soft spot is the absence of training details. The abstract supplies no optimizer settings, augmentation pipeline, positive-pixel weighting, or ablation that isolates the two-stage LoRA schedule from the feature injection. If the U-Net baseline and Clay-only runs did not receive identical adaptation and data handling, the observed delta cannot be credited to Clay context. The stress-test concern holds on the information given.

This paper is for remote-sensing practitioners who already work with U-Nets on disaster mapping and want a concrete recipe to try. It does not claim to reshape broader computer vision. It deserves a serious referee once the methods section supplies the missing protocol and ablations; without them the central attribution remains unverified.

Referee Report

2 major / 0 minor

Summary. The manuscript evaluates Clay v1.5, a geospatial foundation model, for pixel-level landslide segmentation on the Landslide4Sense benchmark (3,799 training chips, ~2% positive pixels). It compares three strategies: Clay as primary encoder with multi-scale residual terrain fusion, a standard U-Net baseline, and a hybrid U-Net that injects Clay features at the bottleneck. Two-stage LoRA is described explicitly only for the hybrid configuration. The hybrid achieves the highest test F1 of 64.5 ± 1.8% over three seeds, outperforming Clay-only (55.2 ± 3.6%) and U-Net (59.9%). The author concludes that GFMs improve performance most effectively when used as auxiliary context to CNNs rather than as standalone encoders, and attributes the Clay-only shortfall to the absence of multi-scale skip connections in the Clay backbone.

Significance. If the reported F1 gains can be attributed to Clay semantic context under controlled conditions, the work would offer concrete empirical guidance on hybrid GFM-CNN designs for class-imbalanced remote-sensing segmentation. The explicit multi-seed reporting with standard deviations and direct baseline comparisons is a positive methodological feature that strengthens reproducibility of the headline numbers.

major comments (2)

[Abstract] Abstract: The central claim that the 4.6-point F1 gain of the hybrid over the U-Net baseline is caused by Clay semantic context requires that the U-Net baseline and Clay-only models received identical training protocols (two-stage LoRA schedule, optimizer, augmentation pipeline, and positive-pixel weighting). No such protocol equivalence is stated or demonstrated, rendering the attribution to Clay features unverifiable from the reported results.
[Results] Results paragraph: No ablation is presented that isolates the contribution of the two-stage LoRA schedule from the Clay feature injection itself. Because the hybrid is the only configuration explicitly described as using two-stage LoRA, the observed delta could arise from the adaptation procedure rather than from the GFM context, directly undermining the conclusion that GFMs are “most effective when they complement” CNNs.
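
The protocol-equivalence check requested in the first comment can be mechanised. The sketch below compares per-model training configurations and reports every setting that is not identical across them. The configuration keys and values are hypothetical placeholders, not the paper's settings.

```python
def protocol_differences(configs: dict) -> dict:
    """Return each setting whose value differs between any two models.

    configs maps a model name to its training-config dictionary. A key
    missing from one config is reported with the marker "<unset>".
    """
    all_keys = set().union(*(cfg.keys() for cfg in configs.values()))
    differences = {}
    for key in sorted(all_keys):
        values = {name: cfg.get(key, "<unset>") for name, cfg in configs.items()}
        # Compare by repr so unhashable values such as lists are handled.
        if len({repr(v) for v in values.values()}) > 1:
            differences[key] = values
    return differences


# Hypothetical settings, for illustration only.
configs = {
    "unet": {"optimizer": "adamw", "augmentation": "flips", "adaptation": "full_finetune"},
    "hybrid": {"optimizer": "adamw", "augmentation": "flips", "adaptation": "two_stage_lora"},
}
diffs = protocol_differences(configs)
print(list(diffs))  # ['adaptation']
```

A table generated this way would show at a glance whether the adaptation procedure is the only thing that differs besides the Clay features.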

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback emphasizing the need for explicit protocol details and ablation clarity. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core findings.

Referee: [Abstract] Abstract: The central claim that the 4.6-point F1 gain of the hybrid over the U-Net baseline is caused by Clay semantic context requires that the U-Net baseline and Clay-only models received identical training protocols (two-stage LoRA schedule, optimizer, augmentation pipeline, and positive-pixel weighting). No such protocol equivalence is stated or demonstrated, rendering the attribution to Clay features unverifiable from the reported results.

Authors: We agree that protocol equivalence must be stated explicitly for the attribution to be verifiable. All three configurations were trained under the same optimizer, augmentation pipeline, positive-pixel weighting, and epoch schedule; two-stage LoRA was applied to both the Clay-only and hybrid models (as required for GFM adaptation), while the U-Net baseline used standard fine-tuning. We will revise the manuscript to add an explicit methods paragraph and comparison table confirming these shared settings across models.
Revision: yes

Referee: [Results] Results paragraph: No ablation is presented that isolates the contribution of the two-stage LoRA schedule from the Clay feature injection itself. Because the hybrid is the only configuration explicitly described as using two-stage LoRA, the observed delta could arise from the adaptation procedure rather than from the GFM context, directly undermining the conclusion that GFMs are “most effective when they complement” CNNs.

Authors: The referee correctly notes that the current text only highlights two-stage LoRA for the hybrid, leaving open the possibility that LoRA itself drives part of the gain. Clay-only also uses LoRA for adaptation, and the hybrid still outperforms it, but we acknowledge the lack of a pure U-Net + LoRA control. We will revise the results section to clarify LoRA usage per model and add a short ablation (U-Net with two-stage LoRA, no Clay) if space allows; otherwise we will note this as a limitation and qualify the conclusion accordingly.
Revision: partial

Circularity Check

0 steps flagged

No circularity; purely empirical benchmark comparison

The manuscript reports F1 scores from training three segmentation models (hybrid U-Net+Clay, Clay-only, U-Net baseline) on the fixed Landslide4Sense dataset and evaluating on a held-out test set. No equations, derivations, or first-principles results appear. The central claim rests on direct experimental deltas against explicitly described baselines rather than any reduction of outputs to fitted inputs or self-citations. No self-definitional, fitted-prediction, or uniqueness-theorem patterns are present. This is standard empirical ML evaluation and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The performance claim rests on the public Landslide4Sense benchmark being representative and on standard supervised segmentation training assumptions; no new entities or ad-hoc constants are introduced beyond routine ML hyperparameters.

free parameters (1)

LoRA rank and learning rate schedule
Two-stage LoRA adaptation introduces rank and learning-rate choices that are fitted during training; exact values not stated in abstract.
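
To show why the rank is the main knob, the sketch below counts trainable parameters for one adapted linear layer under the standard LoRA factorisation. The layer width and rank are hypothetical, since the paper's values are not stated here.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Weights in a dense d_out x d_in matrix (bias ignored)."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Weights in the two low-rank factors: (d_out x rank) and (rank x d_in)."""
    return rank * (d_in + d_out)


# Hypothetical width and rank, for illustration only.
d_in = d_out = 1024
rank = 8
trainable_fraction = lora_param_count(d_in, d_out, rank) / full_param_count(d_in, d_out)
print(lora_param_count(d_in, d_out, rank), trainable_fraction)  # 16384 0.015625
```

The trainable fraction scales linearly with the rank, so a reader comparing runs needs the rank to judge how much capacity each adapter had.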

axioms (1)

Domain assumption: The Landslide4Sense dataset split and annotation protocol constitute a fair test of generalization for post-event landslide mapping.
All reported numbers are computed on this single benchmark.

pith-pipeline@v0.9.1-grok · 5754 in / 1269 out tokens · 22746 ms · 2026-06-27T05:17:42.702724+00:00 · methodology

Reference graph

Works this paper leans on

32 extracted references · 12 canonical work pages · 3 internal anchors

[1] S. Fidan et al., “Understanding fatal landslides at global scales: A summary of topographic, climatic, and anthropogenic perspectives,” Nat. Hazards, vol. 120, pp. 6437–6455, 2024.

[2] S. M. Mousavi et al., “Quantitative risk analysis for earthquake-induced landslides—Emamzadeh Ali, Iran,” Eng. Geol., vol. 122, no. 3–4, pp. 191–203, 2011.

[3] S. M. Mousavi, “Landslide susceptibility in cemented volcanic soils, Ask region, Iran,” Indian Geotech. J., vol. 47, no. 1, pp. 115–130, 2017.

[4] I. Alcántara-Ayala, “Landslides in a changing world,” Landslides, vol. 22, pp. 2851–2865, 2025.

[5] “Climate change could trigger more landslides in high mountain Asia,” NOAA Research, Feb. 11,

[6] [Online]. Available: https://research.noaa.gov/climate-change-could-trigger-more-landslides-in-high-mountain-asia/

[7] R. Coluzzi et al., “Rapid landslide detection from free optical satellite imagery using a robust change detection technique,” Sci. Rep., vol. 15, Art. no. 4697, 2025.

[8] O. Ghorbanzadeh, Y. Xu, P. Ghamisi, M. Kopp, and D. Kreil, “Landslide4Sense: Reference benchmark data and deep learning models for landslide detection,” IEEE Trans. Geosci. Remote Sens., vol. 60, pp. 1–17, 2022.

[9] A. Sreekumar et al., “Enhancing landslide detection in Western Ghats of Kerala, India with deep learning and explainable AI,” Sci. Rep., vol. 15, Art. no. 45151, 2025, doi: 10.1038/s41598-025-33065-9.

[10] L. Nava et al., “Brief communication: AI-driven rapid landslide mapping following the 2024 Hualien earthquake in Taiwan,” Nat. Hazards Earth Syst. Sci., vol. 25, pp. 2371–2377, 2025, doi: 10.5194/nhess-25-2371-2025.

[11] C. Yang et al., “A feature fusion method on landslide identification in remote sensing with Segment Anything Model,” Landslides, vol. 22, pp. 471–483, 2025, doi: 10.1007/s10346-024-02390-x.

[12] A. Kirillov et al., “Segment Anything,” in Proc. IEEE/CVF Int. Conf. Comput. Vis. (ICCV), 2023.

[13] D. Szwarcman et al., “Prithvi-EO-2.0: A versatile multi-temporal foundation model for Earth observation applications,” arXiv preprint arXiv:2412.02732, Mar. 2026.

[14] Clay Foundation, “Clay foundation model,” 2024. [Online]. Available: https://clay-foundation.github.io/model/

[15] S. Kaushik et al., “Assessing the value of geo-foundational models for flood inundation mapping: Benchmarking models for Sentinel-1, Sentinel-2, and PlanetScope for end-users,” arXiv preprint arXiv:2511.01990, Jan. 2026.

[16] Y. Gal and Z. Ghahramani, “Dropout as a Bayesian approximation: Representing model uncertainty in deep learning,” in Proc. 33rd Int. Conf. Mach. Learn. (ICML), 2016, pp. 1050–1059.

[17] R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, “Grad-CAM: Visual explanations from deep networks via gradient-based localization,” Int. J. Comput. Vis., vol. 128, no. 2, pp. 336–359, 2020, doi: 10.1109/ICCV.2017.74.

[18] IBM NASA Geospatial, “Landslide4Sense dataset (v1.0),” Hugging Face, 2024. [Online]. Available: https://huggingface.co/datasets/harshinde/LandSlide4Sense

[19] O. Ghorbanzadeh et al., “The Outcome of the 2022 Landslide4Sense Competition: Advanced Landslide Detection from Multi-Source Satellite Imagery,” IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., vol. 15, pp. 9927–9942, 2022, doi: 10.1109/JSTARS.2022.3220845.

[20] Institute of Advanced Research in Artificial Intelligence (IARAI), “Landslide4Sense-2022: Data description and baseline code for LandSlide4Sense 2022 competition,” GitHub, 2022. [Online]. Available: https://github.com/iarai/Landslide4Sense-2022

[21] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 2015, pp. 1026–1034.

[22] J. Yu and M. Blaschko, “The Lovász hinge: A novel convex surrogate for submodular losses,” arXiv preprint arXiv:1512.07797, 2017.

[23] S. Sun et al., “Two-stage training strategy combined with neural network for segmentation of internal mammary artery graft,” Biomed. Signal Process. Control, vol. 80, Art. no. 104278, Feb. 2023.

[24] C. J. Van Rijsbergen, Information Retrieval, 2nd ed. London, U.K.: Butterworths, 1979.

[25] M. Kull, M. Perelló-Nieto, M. Kängsepp, T. Silva Filho, H. Song, and P. Flach, “Beyond temperature scaling: Obtaining well-calibrated multi-class probabilities with Dirichlet calibration,” in Adv. Neural Inf. Process. Syst., vol. 32, 2019.

[26] H. Wu, P. Huang, M. Zhang, and W. Tang, “CTFNet: CNN-Transformer fusion network for remote-sensing image semantic segmentation,” IEEE Geosci. Remote Sens. Lett., vol. 21, Art. no. 5000305, pp. 1–5, 2024, doi: 10.1109/LGRS.2023.3336061.

[27] Y. Liao, T. Zhou, L. Li, J. Li, J. Shen, and A. Hamdulla, “DGCFNet: Dual Global Context Fusion Network for remote sensing image semantic segmentation,” PeerJ Comput. Sci., vol. 11, Art. no. e2786, 2025.

[28] L. P. Soares, H. C. Dias, G. P. B. Garcia, and C. H. Grohmann, “Landslide segmentation with deep learning: Evaluating model generalization in rainfall-induced landslides in Brazil,” Remote Sens., vol. 16, no. 22, Art. no. 4344, 2024.

[29] Z. Ren et al., “Enhancing deep learning-based landslide detection from open satellite imagery via multisource data fusion of spectral, textural, and topographical features: A case study of old landslide detection in the Three Gorges Reservoir Area (TGRA),” Geomat. Nat. Hazards Risk, vol. 16, no. 1, Art. no. 2421224, 2025.

[30] D. Notti et al., “Semi-automatic mapping of shallow landslides using free Sentinel-2 images and Google Earth Engine,” Nat. Hazards Earth Syst. Sci., vol. 23, no. 7, pp. 2625–2648, 2023, doi: 10.5194/nhess-23-2625-2023.

[31] R. A. Burange, H. K. Shinde, and O. Mutyalwar, “Landslide detection and mapping using deep learning across multi-source satellite data and geographic regions,” arXiv preprint arXiv:2507.01123, 2025.

[32] I. Nasios, “Multi-modal landslide detection from Sentinel-1 SAR and Sentinel-2 optical imagery using multi-encoder vision transformers and ensemble learning,” arXiv preprint arXiv:2604.05959, Apr. 2026.

Binh Huong Vu received a B.A. in Economics with a minor in Computer Science from Harvard University. Her research interests lie at the intersection of machi...
