arxiv: 2605.09746 · v1 · submitted 2026-05-10 · 💻 cs.LG · cs.AI


Sequential Feature Selection for Efficient Landslide Segmentation from Multi-Spectral Data

Arsalaan Ahmad, Oktay Karakus, Paul L. Rosin


Pith reviewed 2026-05-12 02:36 UTC · model grok-4.3


keywords: landslide segmentation, feature selection, multispectral imagery, Sentinel-2, SFFS, remote sensing, U-Net


The pith

Sequential forward floating selection finds an 8-channel subset that matches or exceeds full 30-channel accuracy for landslide segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that current landslide models often ingest highly redundant multispectral and topographic inputs whose individual contributions are hard to interpret. By replacing single-band drop tests with Sequential Forward Floating Selection run inside a lightweight proxy network, it isolates a minimal 8-feature combination from Sentinel-2 bands, ALOS terrain layers, and derived indices. This subset delivers equivalent segmentation performance on the Landslide4Sense benchmark while exposing the physical cues the model actually uses. The result directly addresses the Hughes phenomenon and the computational waste of feeding every available channel into deep networks.

Core claim

Sequential Forward Floating Selection applied iteratively with a lightweight U-Net++ proxy identifies a compact 8-channel subset drawn from Sentinel-2 multispectral data, ALOS PALSAR topography, and 16 engineered spectral and structural indices. When this subset is supplied to a segmentation model, it achieves F1 scores equal to or higher than those obtained from the full pool of up to 30 channels. The selection trajectory itself reveals which spectral ratios and topographic derivatives carry the decisive information for distinguishing landslides.
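Because the core claim is stated entirely in terms of F1, a minimal sketch of a pixel-wise F1 for one binary mask may help fix ideas. This summary does not say how the paper aggregates F1 (per patch or over a whole split), so the function name `f1_score_binary`, the nested-list mask format, and the zero-denominator convention are all assumptions made for illustration.

```python
def f1_score_binary(pred, truth):
    """Pixel-wise F1 for two equally sized binary masks (nested lists of 0/1)."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # landslide pixel correctly predicted
            elif p and not t:
                fp += 1          # predicted landslide, actually background
            elif t and not p:
                fn += 1          # missed landslide pixel
    denom = 2 * tp + fp + fn
    # Assumed convention: return 0.0 when neither mask contains a positive pixel.
    return 2 * tp / denom if denom else 0.0


# Tiny illustrative masks (not data from the paper).
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
score = f1_score_binary(pred, truth)  # 2 TP, 1 FP, 1 FN
```

With two true positives, one false positive, and one false negative, the score is 4/6, or about 0.667.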

What carries the argument

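Sequential Forward Floating Selection (SFFS), which iteratively adds the most promising channel and then removes channels that have become redundant. Each candidate subset is scored by a lightweight U-Net++ proxy model, so the search sees interaction effects between channels rather than isolated contributions.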
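The add-then-prune loop can be sketched in a few lines. This is a generic SFFS in the spirit of the floating search literature, not the authors' implementation: in the paper the score of a subset would be the proxy model's F1, whereas the hypothetical `toy_score` below is a hand-made integer table with one synergy (channels 1 and 2 together) and one redundancy (channel 0 alongside 1 or 2), chosen only to show why the backward step matters.

```python
from typing import Callable, Dict, FrozenSet, Tuple

Subset = FrozenSet[int]


def sffs(n_features: int,
         score: Callable[[Subset], float],
         max_size: int) -> Dict[int, Tuple[float, Subset]]:
    """Return the best (score, subset) found for each subset size."""
    max_size = min(max_size, n_features)
    selected: Subset = frozenset()
    best: Dict[int, Tuple[float, Subset]] = {}
    while len(selected) < max_size:
        # Forward step: add the single feature that gives the best score.
        candidates = [selected | {f} for f in range(n_features) if f not in selected]
        selected = max(candidates, key=score)
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, selected)
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            reduced = max((selected - {f} for f in selected), key=score)
            r = score(reduced)
            if r > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (r, reduced)
            else:
                break
    return best


# Hypothetical stand-in for "train the proxy, return validation F1".
SOLO = {0: 30, 1: 25, 2: 24, 3: 2}


def toy_score(subset: Subset) -> int:
    total = sum(SOLO[f] for f in subset)
    if {1, 2} <= subset:                                 # synergy between 1 and 2
        total += 20
    if 0 in subset and (1 in subset or 2 in subset):     # 0 is redundant with 1, 2
        total -= 35
    return total


best = sffs(n_features=4, score=toy_score, max_size=4)
```

On this toy table, plain forward selection would lock in channel 0 first and end with {0, 3} as its two-channel subset (score 32). The floating step later discards channel 0 and recovers {1, 2} (score 69), and the best three-channel subset scores higher than all four channels together, which is the kind of behaviour the paper attributes to redundant inputs.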

If this is right

Segmentation models can operate with substantially lower input dimensionality, reducing memory footprint and inference time.
Feature rankings become more trustworthy because SFFS explicitly tests combinations rather than isolated bands.
The same selection procedure can be reused on other remote-sensing segmentation tasks that suffer from high channel correlation.
Models become easier to interpret because the retained channels correspond to physically meaningful spectral and topographic cues.
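Whether another task "suffers from high channel correlation" can be checked before running any selection. The sketch below computes pairwise Pearson correlation between channels and flags pairs above a threshold. The channel names, the values, and the 0.95 threshold are illustrative assumptions, not quantities taken from the paper.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def redundant_pairs(channels, threshold=0.95):
    """List channel-name pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(channels.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Illustrative per-pixel samples for three hypothetical channels.
channels = {
    "band_a": [0.1, 0.2, 0.3, 0.4, 0.5],
    "band_b": [0.21, 0.39, 0.61, 0.80, 1.01],   # roughly 2 x band_a
    "slope": [5.0, 1.0, 4.0, 2.0, 3.0],
}
flagged = redundant_pairs(channels)
```

Only the `band_a` / `band_b` pair is flagged here. A correlation screen of this kind is cheaper than SFFS but looks at pairs in isolation, so it complements the search rather than replacing it.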

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the 8-channel set proves stable across geographic regions, it could inform the design of lightweight sensors optimized for landslide monitoring.
The method offers a practical route to test whether performance gains from additional bands are real or merely artifacts of over-parameterized inputs.
Extending SFFS to other proxy architectures would show whether the discovered minimal set is robust or architecture-dependent.

Load-bearing premise

The lightweight U-Net++ proxy used inside each SFFS iteration accurately reproduces the performance trends and interaction effects that would appear in the final full-scale segmentation model.

What would settle it

Train the final segmentation model on the exact 8-channel subset versus the full 30-channel set using identical training protocols and measure whether the F1 score on the official Landslide4Sense test split drops by more than the reported margin.
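One way to read the outcome of such an experiment is a paired comparison over matched random seeds. The sketch below computes the mean difference and a paired t statistic using only the standard library. The F1 values are hypothetical placeholders included so the code runs; they are not results from the paper, and the function name `paired_t_statistic` is likewise an assumption.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Paired t statistic and degrees of freedom for matched runs a[i], b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long lists with at least 2 runs")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)          # sample standard deviation
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# HYPOTHETICAL placeholder F1 scores, one per seed (not from the paper).
f1_subset = [0.70, 0.72, 0.71, 0.69, 0.73]   # 8-channel input
f1_full = [0.69, 0.70, 0.72, 0.68, 0.71]     # 30-channel input

mean_gain = statistics.mean(x - y for x, y in zip(f1_subset, f1_full))
t_stat, dof = paired_t_statistic(f1_subset, f1_full)
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom. With only a handful of seeds the test has little power, so reporting the per-seed differences alongside it is advisable.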

Figures

Figures reproduced from arXiv: 2605.09746 by Arsalaan Ahmad, Oktay Karakus, Paul L. Rosin.

**Figure 2.** Channel selection results. (a) SFFS-selected subset

Original abstract

Landslide detection from satellite imagery has advanced through deep learning, yet most models rely on large, highly correlated spectral-topographic inputs whose contributions remain poorly understood. The question of which channels are actually necessary has received surprisingly little attention. This matters: redundant or correlated inputs obscure physical interpretability, inflate computational overhead, and can actively degrade model performance through the Hughes Phenomenon. We present a systematic, explainable channel-selection framework for the Landslide4Sense benchmark, combining Sentinel-2 multispectral and ALOS PALSAR terrain data with 16 engineered spectral and structural indices. Rather than relying on conventional single-band drop tests, which evaluate channels in isolation and miss interaction effects, we apply Sequential Forward Floating Selection (SFFS) to iteratively build and prune a candidate feature pool using a lightweight U-Net++ proxy model. Beyond identifying a compact 8-channel subset that matches or exceeds the segmentation F1 of configurations using up to 30 channels, we use the selection process itself to interrogate which spectral and topographic features landslide models genuinely rely on, and what this reveals about the physical cues driving their predictions. We argue that SFFS represents a principled feature selection approach to input design in Earth observation, in contrast to the prevailing practice of appending every available band and hoping the model learns what to ignore.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper applies established sequential feature selection to trim inputs for landslide segmentation on one benchmark, claiming an 8-channel subset holds performance while adding some interpretability, but the proxy-to-full-model transfer is the unproven part.


The paper takes Sequential Forward Floating Selection and runs it on the Landslide4Sense dataset. It mixes Sentinel-2 bands, ALOS terrain channels, and 16 engineered indices, then uses a lightweight U-Net++ proxy to iteratively add and remove features until it settles on an 8-channel subset. The claim is that this subset matches or exceeds the F1 scores of models trained on up to 30 channels, and that the selection path itself reveals which spectral and topographic cues the models actually use for landslides.

Referee Report

2 major / 3 minor

Summary. The manuscript presents a channel-selection framework for landslide segmentation on the Landslide4Sense benchmark. It combines Sentinel-2 multispectral bands, ALOS PALSAR topographic data, and 16 engineered spectral/structural indices into a 30-channel input pool, then applies Sequential Forward Floating Selection (SFFS) driven by a lightweight U-Net++ proxy to identify a compact 8-channel subset. The central claim is that this subset matches or exceeds the F1 score of models trained on the full input while the selection trajectory itself yields interpretable insights into the physical spectral and topographic cues that drive predictions. The work contrasts SFFS with single-band drop tests, arguing that the former better accounts for feature interactions and mitigates the Hughes phenomenon.

Significance. If the proxy-to-full-model transfer holds, the paper supplies a reproducible, interaction-aware method for input reduction in remote-sensing segmentation that simultaneously improves efficiency and interpretability. Credit is due for (i) replacing isolated ablation with SFFS, (ii) grounding the selection in a public benchmark, and (iii) attempting to extract physical insight from the selection path rather than treating feature selection as a pure black-box optimization step. Such contributions are valuable for the broader Earth-observation ML community where high-dimensional, correlated inputs remain the default.

major comments (2)

[§4 (Experimental Results) and proxy-model description] The headline claim that the SFFS-derived 8-channel subset 'matches or exceeds' the F1 of up to 30-channel configurations rests on the untested assumption that performance trends observed with the lightweight U-Net++ proxy transfer to the final full-scale segmentation architecture. Because the proxy has lower capacity, it may under-represent higher-order interactions among the 16 indices, Sentinel-2 bands, and topographic channels; the manuscript provides no direct ablation in which the target model is retrained and evaluated on the selected 8 channels versus the full 30-channel baseline, with error bars or statistical tests. This validation step is load-bearing for both the efficiency result and the subsequent physical-interpretability conclusions.
[Abstract and §5 (Discussion)] The interpretability narrative (that the SFFS trajectory reveals 'which spectral and topographic features landslide models genuinely rely on') is presented without supporting evidence from the final model, such as permutation importance, SHAP values, or a controlled ablation on the selected subset. Without this link, the physical-cue conclusions risk being post-hoc attributions of the proxy rather than properties of the deployed model.

minor comments (3)

[§3.1] The exact composition and ordering of the initial 30-channel pool (Sentinel-2 bands + topographic channels + 16 indices) should be tabulated in §3.1 for reproducibility.
[Figures] Figure captions and axis labels in the SFFS trajectory plots would benefit from explicit indication of whether each step reports proxy or final-model F1.
[§4] A brief comparison to at least one alternative feature-selection baseline (e.g., recursive feature elimination or mutual-information ranking) would strengthen the methodological claim, even if placed in supplementary material.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback, which correctly identifies key gaps in empirical validation. We agree that direct testing on the target model and additional interpretability analyses are necessary to support the central claims. We outline below how we will revise the manuscript to address each point.


Referee: [§4 (Experimental Results) and proxy-model description] The headline claim that the SFFS-derived 8-channel subset 'matches or exceeds' the F1 of up to 30-channel configurations rests on the untested assumption that performance trends observed with the lightweight U-Net++ proxy transfer to the final full-scale segmentation architecture. Because the proxy has lower capacity, it may under-represent higher-order interactions among the 16 indices, Sentinel-2 bands, and topographic channels; the manuscript provides no direct ablation in which the target model is retrained and evaluated on the selected 8 channels versus the full 30-channel baseline, with error bars or statistical tests. This validation step is load-bearing for both the efficiency result and the subsequent physical-interpretability conclusions.

Authors: We agree that this is a substantive limitation of the current manuscript. The SFFS procedure was performed exclusively with the lightweight proxy to keep the iterative search computationally tractable, and no direct comparison of the full-scale target architecture on the selected 8-channel subset versus the 30-channel baseline was reported. In the revised version we will add a dedicated ablation in §4: the target model will be retrained from scratch on the 8-channel subset (and on the full 30-channel input for reference), with results averaged over multiple random seeds, reported with standard deviation error bars, and accompanied by paired statistical tests. This will directly test transferability and quantify any efficiency gains on the deployed architecture.

Revision: yes

Referee: [Abstract and §5 (Discussion)] The interpretability narrative (that the SFFS trajectory reveals 'which spectral and topographic features landslide models genuinely rely on') is presented without supporting evidence from the final model, such as permutation importance, SHAP values, or a controlled ablation on the selected subset. Without this link, the physical-cue conclusions risk being post-hoc attributions of the proxy rather than properties of the deployed model.

Authors: We concur that the physical-interpretability claims require grounding in the final model rather than solely in the proxy. In the revision we will add two analyses performed on the full-scale model trained with the 8-channel subset: (1) permutation importance rankings to measure the drop in F1 when each selected channel is shuffled, and (2) SHAP value summaries to visualize the contribution of each channel to landslide versus non-landslide predictions. We will also include a controlled leave-one-channel-out ablation on the 8-channel set. These results will be presented in §5 and referenced in the abstract to demonstrate that the features prioritized by SFFS are indeed relied upon by the deployed model.

Revision: yes
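The permutation-importance check named in the exchange above can be sketched as follows: shuffle one channel across samples, re-score, and record the drop in F1. To stay self-contained this sketch works on per-sample feature vectors with a hypothetical stand-in `predict` function that only reads channel 0; a real run would shuffle a channel across validation patches and call the trained segmentation model. All names and data here are illustrative assumptions.

```python
import random


def f1(pred, truth):
    """F1 of the positive class for two flat lists of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in F1 when each channel is shuffled across samples."""
    rng = random.Random(seed)
    baseline = f1([predict(x) for x in X], y)
    importances = []
    for c in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [x[c] for x in X]
            rng.shuffle(column)                      # break the channel-label link
            X_perm = [x[:c] + [v] + x[c + 1:] for x, v in zip(X, column)]
            drops.append(baseline - f1([predict(x) for x in X_perm], y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: channel 0 determines the label, channel 1 is pure noise.
data_rng = random.Random(1)
X = [[float(i % 2), data_rng.random()] for i in range(20)]
y = [int(x[0] > 0.5) for x in X]


def predict(x):
    """Hypothetical model that looks only at channel 0."""
    return int(x[0] > 0.5)


importances = permutation_importance(predict, X, y)
```

Shuffling the noise channel leaves every prediction unchanged, so its importance is exactly zero, while shuffling channel 0 degrades F1. Note that with strongly correlated channels, permutation importance can understate a channel's role because a correlated partner still carries the signal, which is one reason the referee asks for it in addition to, not instead of, the subset ablation.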

Circularity Check

0 steps flagged

No circularity in SFFS feature selection on public benchmark


The paper applies the external Sequential Forward Floating Selection algorithm to the Landslide4Sense benchmark using a lightweight proxy model. The claim of an 8-channel subset matching or exceeding F1 performance is presented as an empirical outcome of that process rather than being mathematically forced by any equation or definition within the paper itself. No self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations appear in the provided description or abstract. The derivation chain remains independent of its own outputs and is self-contained against the external dataset.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The paper rests on the standard SFFS algorithm and the Landslide4Sense benchmark; the 8-channel selection is discovered empirically rather than derived from new axioms or entities.

free parameters (1)

Selected channel count = 8
The final subset size of 8 is determined by the SFFS stopping criterion on the proxy model performance.

axioms (2)

Domain assumption: Performance of the lightweight U-Net++ proxy during feature search correlates sufficiently with the final model to justify the selected subset.
Invoked when using the proxy to guide iterative addition and removal of channels.
Domain assumption: The Landslide4Sense benchmark provides representative multi-spectral and terrain inputs for real landslide segmentation tasks.
Used as the sole evaluation dataset without further justification in the abstract.


