Accurate and Robust Generative Approach for Overcoming Data Sparsity and Imbalance in Landslide Modeling with A Tabular Foundation Model

Gang Mei; Jianbing Peng; Kaixuan Shao; Nengxiong Xu; Yinghan Wu

arxiv: 2604.25159 · v1 · submitted 2026-04-28 · 💻 cs.LG

Kaixuan Shao, Gang Mei, Yinghan Wu, Nengxiong Xu, Jianbing Peng

Pith reviewed 2026-05-07 16:37 UTC · model grok-4.3

classification 💻 cs.LG

keywords landslide modeling · data generation · tabular foundation model · data sparsity · data imbalance · susceptibility modeling · generative approach · multivariate dependencies

The pith

A tabular foundation model generates landslide datasets that match real distributions and preserve feature dependencies from sparse observations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Sparse and imbalanced landslide inventories hinder understanding of triggering factors like geology and hydrology. Existing data generation methods often fail to capture complex feature relationships or generalize across scenarios. This paper proposes using a tabular foundation model to synthesize new multi-feature datasets that retain the statistical properties and dependencies of limited real observations. Experiments across 20 landslide inventories confirm that the generated data aligns closely with observed patterns and remains robust in varied environments. This approach directly addresses data limitations to improve landslide susceptibility modeling and risk evaluation.

Core claim

By applying a tabular foundation model to limited landslide data, the generated datasets accurately reproduce the multivariate dependencies and statistical characteristics of real occurrences, as shown by close alignment with distributions in comparative tests on twenty inventories and consistent performance across different contexts.

What carries the argument

Tabular foundation model: a model trained on tabular data capable of learning from small samples to generate new instances while maintaining real-world feature interdependencies in landslide inventories.

If this is right

Landslide susceptibility models gain improved performance through training on the augmented datasets.
Risk assessment becomes feasible in areas lacking sufficient real observations.
Generated data supports more reliable analysis of triggering conditions across varied settings.
The approach extends applicability of susceptibility modeling to additional environmental contexts.
Overall predictive capabilities strengthen under conditions of data sparsity and imbalance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same generative method could apply to other natural hazards with similarly sparse observational records.
Integration of the generated data into hybrid physical-statistical models might enhance early-warning systems.
Widespread adoption could decrease dependence on extensive new field surveys for initial hazard mapping.

Load-bearing premise

The tabular foundation model accurately learns and reproduces the complex multivariate dependencies and statistical characteristics from limited landslide observations without introducing artifacts or biases.

What would settle it

A direct comparison where predictive models trained on the generated data show substantially lower accuracy than models trained on actual observations when tested on an independent landslide inventory would falsify the claim of alignment and robustness.
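A minimal sketch of such a comparison, under loud assumptions: the nearest-centroid classifier is a deliberately simple stand-in (the paper's downstream models are not described here), and `fit_centroids`, `predict`, `accuracy`, and `tstr_gap` are hypothetical helper names. The idea is to train one model on real observations and one on generated data, score both on the same independent real inventory, and read the accuracy gap.

```python
def fit_centroids(X, y):
    """Return {label: mean feature vector} for a nearest-centroid classifier."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, v in enumerate(row):
            acc[k] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, row):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda lab: sq_dist(centroids[lab]))


def accuracy(centroids, X, y):
    """Fraction of rows in X whose predicted label matches y."""
    hits = sum(predict(centroids, row) == label for row, label in zip(X, y))
    return hits / len(y)


def tstr_gap(real_train, synth_train, real_test):
    """Real-trained accuracy minus synthetic-trained accuracy on held-out real data.

    Each argument is an (X, y) pair. A large positive gap would count
    against the claim that generated data can stand in for observations.
    """
    acc_real = accuracy(fit_centroids(*real_train), *real_test)
    acc_synth = accuracy(fit_centroids(*synth_train), *real_test)
    return acc_real - acc_synth


# Toy, made-up numbers (two features, labels 0 = stable, 1 = landslide).
real_train = ([[0, 0], [1, 1], [4, 4], [5, 5]], [0, 0, 1, 1])
synth_good = ([[0.2, 0.1], [0.9, 1.1], [4.1, 3.9], [5.2, 4.8]], [0, 0, 1, 1])
synth_bad = (synth_good[0], [1, 1, 0, 0])  # labels flipped on purpose
real_test = ([[0.5, 0.5], [4.5, 4.5]], [0, 1])

gap_good = tstr_gap(real_train, synth_good, real_test)
gap_bad = tstr_gap(real_train, synth_bad, real_test)
```

In practice the held-out inventory should come from a region or event not used for generation, and the gap should be reported with variability across repeated splits.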

Figures

Figures reproduced from arXiv: 2604.25159 by Gang Mei, Jianbing Peng, Kaixuan Shao, Nengxiong Xu, Yinghan Wu.

**Figure 1.** Workflow of the proposed foundation model-based approach for overcoming landslide data sparsity and imbalance

**Figure 2.** Statistical characteristics of terrain and geomorphological features in rainfall-triggered landslide inventories generated in

**Figure 3.** Feature dependence of terrain and geomorphological characteristics in rainfall-triggered landslide inventories generated

**Figure 4.** Statistical characteristics of sparse local-scale and abundant global-scale rainfall-triggered landslide inventories

**Figure 5.** Comparative analysis of meteorological patterns in global-scale rainfall-triggered inventories generated by four ap

Original abstract

Landslide investigation relies on sufficient and well-balanced observational data influenced by geological, hydrological, and anthropogenic factors. Available landslide inventories are often sparse and imbalanced, which limits understanding of triggering conditions and failure mechanisms. Data generation provides an effective approach to help capture feature dependencies from limited landslide observations. However, existing generation approaches for landslides often struggle to capture complex relationships among features and lack robustness across multiple scenarios and interacting factors. Here, we propose an accurate and robust approach for generating multi-feature landslide datasets by utilizing a tabular foundation model. By leveraging the capacity to learn from limited observations, the proposed approach effectively preserves the multivariate dependencies and statistical characteristics inherent in landslide occurrences. Comparative experiments on 20 landslide inventories demonstrate that the generated datasets closely align with observed distributions, maintain realistic feature dependencies, and exhibit robustness across different environmental contexts. This work provides an effective approach to overcome data sparsity and imbalance and strengthens landslide susceptibility modeling and risk assessment under limited observations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper applies a tabular foundation model to generate synthetic landslide data and validates it across 20 inventories, but the supporting details on dependency preservation and controls remain thin.

The main takeaway is that the authors train a tabular foundation model on limited landslide observations and use it to produce synthetic datasets that they say match real distributions and feature dependencies, with tests run on 20 separate inventories from different settings. This gives the robustness claim some weight because the same pipeline is checked in varied contexts rather than one or two cases. The practical angle is clear: landslide inventories are often small and skewed, so a generation method that keeps geological and hydrological relationships intact could help susceptibility models where data is scarce. The experiments reportedly show the outputs align with held-out observed data, which is a reasonable way to check utility.

What the work does well is the scale of the validation and the focus on a real constraint in geohazard work. It frames the issue directly and demonstrates that the generated samples can be used in downstream modeling without obvious collapse.

The soft spots sit in the missing specifics. The description gives no architecture details, no training losses, no quantitative scores for how well multivariate dependencies are kept (such as correlation matrices or mutual information), and no error bars or ablation results against simpler baselines. Without those, it is hard to tell whether the foundation model is doing heavy lifting or whether basic resampling would produce similar alignment. The risk that generated samples introduce combinations that do not occur in real triggering conditions is also not addressed with targeted checks.

This paper is aimed at researchers who build landslide susceptibility maps or work on other environmental datasets that suffer from sparsity. A reader who needs a concrete way to expand small inventories might extract usable ideas from the validation setup. It has enough empirical breadth to deserve peer review rather than a desk reject. A referee would likely ask for the quantitative metrics, clearer baselines, and any code or data to test reproducibility.
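To make the "basic resampling" question concrete, here is a hedged sketch of the kind of simple baseline the note has in mind: bootstrap rows with replacement and add a little Gaussian jitter. This is not a method from the paper; `bootstrap_jitter` and its `noise_frac` parameter are hypothetical names chosen for illustration, and the toy rows are made up.

```python
import random


def bootstrap_jitter(rows, n_samples, noise_frac=0.05, seed=0):
    """Resample rows with replacement and add Gaussian noise per feature.

    noise_frac scales the noise standard deviation by each feature's
    observed range. It is a hypothetical tuning knob, not a value
    taken from the paper.
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    # Observed range of each feature, used to scale the jitter.
    spans = [
        max(r[k] for r in rows) - min(r[k] for r in rows)
        for k in range(n_features)
    ]
    out = []
    for _ in range(n_samples):
        base = rng.choice(rows)  # draw one observed row
        out.append([
            v + rng.gauss(0.0, noise_frac * spans[k])
            for k, v in enumerate(base)
        ])
    return out


# Toy inventory: [slope in degrees, rainfall in mm]; values are illustrative.
observed = [[10.0, 1.0], [20.0, 2.0], [30.0, 3.0]]
synthetic = bootstrap_jitter(observed, n_samples=5)
```

Because each feature is jittered independently around an observed row, a baseline like this stays close to the training data by construction. If the foundation model's alignment scores are no better than this, the extra machinery is not earning its keep.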

Referee Report

3 major / 2 minor

Summary. The paper proposes using a tabular foundation model to generate synthetic multi-feature landslide datasets from sparse and imbalanced observational inventories. It claims that the approach learns from limited data to preserve multivariate dependencies and statistical characteristics, with comparative experiments across 20 real landslide inventories showing close distributional alignment, realistic feature dependencies, and robustness across environmental contexts, thereby improving landslide susceptibility modeling and risk assessment.

Significance. If the results hold under rigorous quantitative validation, the work could meaningfully advance data augmentation techniques in geohazard modeling by demonstrating a foundation-model approach that outperforms prior generative methods on real-world sparse inventories. The multi-inventory experimental scope is a strength for assessing generalizability.

major comments (3)

[Abstract] The central claim that generated datasets 'closely align with observed distributions' and 'maintain realistic feature dependencies' is stated without any quantitative metrics (e.g., Wasserstein distance, Pearson/Spearman correlations, or statistical tests for dependency preservation) or error bars; this evidentiary gap is load-bearing for the accuracy and robustness assertions.
[Method] No description is given of the tabular foundation model architecture, pre-training objectives, fine-tuning losses, or mechanisms for handling limited/imbalanced observations; these details are required to evaluate whether the model truly avoids artifacts or biases in triggering-condition preservation.
[Experiments] The comparative results on 20 inventories are summarized at a high level but lack baselines, ablation controls for generation artifacts, or cross-validation protocols; without these, the robustness claim across environmental contexts cannot be substantiated.
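The metrics named in the first comment can be sketched in a few lines. This is an illustrative, dependency-free version under stated assumptions, not the paper's evaluation code: `wasserstein_1d` assumes equal sample sizes (where the distance reduces to the mean absolute difference of sorted values), `dependency_gap` is a hypothetical summary statistic, and the column names and numbers are made up.

```python
def wasserstein_1d(real, synth):
    """Wasserstein-1 distance between two equal-size 1D samples."""
    if len(real) != len(synth):
        raise ValueError("this sketch assumes equal sample sizes")
    pairs = zip(sorted(real), sorted(synth))
    return sum(abs(a - b) for a, b in pairs) / len(real)


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = mean_rank
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def dependency_gap(real_cols, synth_cols):
    """Largest absolute difference in pairwise Spearman correlation."""
    names = list(real_cols)
    gap = 0.0
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = spearman(real_cols[a], real_cols[b])
            s = spearman(synth_cols[a], synth_cols[b])
            gap = max(gap, abs(r - s))
    return gap


# Toy columns with hypothetical names; values are illustrative only.
real = {"slope": [12.0, 25.0, 31.0, 40.0], "rainfall": [80.0, 120.0, 150.0, 210.0]}
synth = {"slope": [14.0, 24.0, 33.0, 38.0], "rainfall": [85.0, 160.0, 118.0, 205.0]}

per_feature = {name: wasserstein_1d(real[name], synth[name]) for name in real}
gap = dependency_gap(real, synth)
```

Reporting the per-feature distances together with the dependency gap, with variability over repeated generations, would give the "closely align" and "realistic dependencies" claims something to be checked against.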

minor comments (2)

[Abstract] The abstract and title refer to 'a tabular foundation model' without naming the specific model or indicating whether it is off-the-shelf or custom; clarify this in the introduction for reproducibility.
[Figures/Tables] Figure captions and table legends should explicitly state the evaluation metrics used for distributional alignment and dependency preservation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We appreciate the opportunity to clarify aspects of our work and have prepared point-by-point responses to the major comments. Revisions will be made to address the evidentiary and methodological gaps identified.

Referee: [Abstract] The central claim that generated datasets 'closely align with observed distributions' and 'maintain realistic feature dependencies' is stated without any quantitative metrics (e.g., Wasserstein distance, Pearson/Spearman correlations, or statistical tests for dependency preservation) or error bars; this evidentiary gap is load-bearing for the accuracy and robustness assertions.

Authors: We agree that the abstract would be strengthened by the inclusion of quantitative support for these claims. In the revised manuscript, we will add specific metrics (Wasserstein distances for distributional alignment, Spearman correlations for dependency preservation, and references to statistical tests) along with brief indications of variability, while directing readers to the full quantitative results and error bars presented in the experiments section.

Revision: yes

Referee: [Method] No description is given of the tabular foundation model architecture, pre-training objectives, fine-tuning losses, or mechanisms for handling limited/imbalanced observations; these details are required to evaluate whether the model truly avoids artifacts or biases in triggering-condition preservation.

Authors: We acknowledge that the current method section provides a high-level description but omits the requested technical details. We will expand the section to fully specify the tabular foundation model architecture, pre-training objectives, fine-tuning losses, and the mechanisms used to handle sparse and imbalanced observations while preserving triggering conditions and avoiding artifacts.

Revision: yes

Referee: [Experiments] The comparative results on 20 inventories are summarized at a high level but lack baselines, ablation controls for generation artifacts, or cross-validation protocols; without these, the robustness claim across environmental contexts cannot be substantiated.

Authors: The experiments do report results across 20 inventories, yet we recognize that the absence of explicit baselines, ablation studies, and detailed cross-validation protocols limits the strength of the robustness claims. In the revision, we will incorporate standard generative baselines, ablation controls targeting generation artifacts, and a clear description of the cross-validation protocols employed to evaluate performance across environmental contexts.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

The paper's derivation chain consists of training a tabular foundation model on sparse landslide inventories to generate synthetic data, followed by direct empirical comparison of the generated distributions and feature dependencies against held-out observed data from 20 real inventories. No equations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations appear in the provided abstract or description. The central claim of alignment and robustness is supported by external validation against independent observations rather than reducing to the model's inputs by construction. This is the standard non-circular pattern for generative modeling papers that report held-out distributional metrics.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the unverified capacity of the tabular foundation model to extract and reproduce real feature dependencies from sparse data; this is treated as a domain assumption rather than demonstrated.

axioms (1)

Domain assumption: Tabular foundation models trained on limited observations can faithfully reproduce complex multivariate dependencies and statistical properties of landslide data.
This premise is required for the generated data to be useful for downstream modeling and is invoked implicitly when claiming preservation of dependencies.


[1] [1]

Landslide susceptibility mapping using machine learning: a literature survey

Ado, M., Amitab, K., Maji, A., et al., 2022. Landslide susceptibility mapping using machine learning: a literature survey. Remote Sensing 14, 3029

work page 2022

[2] [2]

A new integrated approach for landslide data balancing and spatial prediction based on generative adversarial networks (gan)

Al-Najjar, H., Pradhan, B., Sarkar, et al., 2021. A new integrated approach for landslide data balancing and spatial prediction based on generative adversarial networks (gan). Remote Sensing 13, 4011

work page 2021

[3] [3]

A hybrid intelligent system integrating the cascade forward neural network with elman neural network

Alkhasawneh, M., Tay, L., 2018. A hybrid intelligent system integrating the cascade forward neural network with elman neural network. Arab Journal of Science and Engineering 43, 6737–6749

work page 2018

[4] [4]

A novel ensemble decision tree-based chi-squared automatic interaction detection (chaid) and multivariate logistic regression models in landslide suscepti- bility mapping

Althuwaynee, O., Pradhan, B., Park, H., Lee, J., 2014. A novel ensemble decision tree-based chi-squared automatic interaction detection (chaid) and multivariate logistic regression models in landslide suscepti- bility mapping. Landslides 11, 1063–1078

work page 2014

[5] [5]

Deep learning-based landslide susceptibility mapping

Azarafza, M., Azarafza, M., Akgün, H., et al., 2021. Deep learning-based landslide susceptibility mapping. Scientific Reports 11, 24112

work page 2021

[6] [6]

A., Dong, H., Gupta, J

Weyn, J. A., Dong, H., Gupta, J. K., Thambiratnam, K., Archibald, A. T., Wu, C.-C., Heider, E., Welling, M., Turner, R. E., Perdikaris, P., 2025. A foundation model for the earth system. Nature 641 (8065), 1180–1187

work page 2025

[7] [7]

Bagging predictors

Breiman, L., 1996. Bagging predictors. Machine Learning 24, 123–140

work page 1996

[8] [8]

V ., 2010

Chawla, N. V ., 2010. Data mining for imbalanced datasets: an overview. In: Data Mining and Knowledge Discovery Handbook. Springer US

work page 2010

[9] [9]

Exploring the effect of absence selection on landslide susceptibility models: a case study in sicily, italy

Conoscenti, C., Rotigliano, E., Cama, M., Caraballo-Arias, N., Lombardo, L., Agnesi, V ., 2016. Exploring the effect of absence selection on landslide susceptibility models: a case study in sicily, italy. Geomor- phology 261, 222–235

work page 2016

[10] [10]

Landslide susceptibility assessment based on an incomplete landslide inventory in the jilong valley, tibet, chinese himalayas

Du, J., Glade, T., Woldai, T., Chai, B., Zeng, B., 2020. Landslide susceptibility assessment based on an incomplete landslide inventory in the jilong valley, tibet, chinese himalayas. Engineering Geology 270, 105572. 28

work page 2020

[11] [11]

Landslide susceptibility prediction based on positive unlabeled learning coupled with adaptive sampling

Fang, Z., Wang, Y ., Niu, R., Peng, L., 2021. Landslide susceptibility prediction based on positive unlabeled learning coupled with adaptive sampling. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 14, 11581–11592

work page 2021

[12] [12]

E., 1995

Freund, Y ., Schapire, R. E., 1995. A decision-theoretic generalization of on-line learning and an application to boosting. In: Computational Learning Theory. Springer

work page 1995

[13] [13]

A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches

Galar, M., Fernandez, A., Barrenechea, E., et al., 2012. A review on ensembles for the class imbalance problem: bagging-, boosting-, and hybrid-based approaches. IEEE Transactions on Systems, Man, and

work page 2012

[14] [14]

J., 2012

Glade, T., Anderson, M., Crozier, M. J., 2012. Landslide Hazard and Risk. John Wiley & Sons Ltd

work page 2012

[15] [15]

Evaluating machine learning and statistical pre- diction techniques for landslide susceptibility modeling

Goetz, J., Brenning, A., Petschko, H., Leopold, P., 2015. Evaluating machine learning and statistical pre- diction techniques for landslide susceptibility modeling. Computers & Geosciences 81, 1–11

work page 2015

[16] [16]

Gis-based evolution and comparisons of landslide susceptibility mapping of the east sikkim himalaya

Gupta, N., Pal, S., Das, J., 2022. Gis-based evolution and comparisons of landslide susceptibility mapping of the east sikkim himalaya. Annals of GIS 28 (3), 359–384

work page 2022

[17] [17]

Data imbalance in landslide susceptibil- ity zonation: under-sampling for class-imbalance learning

Gupta, S., Jhunjhunwalla, M., Bhardwaj, A., Shukla, D., 2020. Data imbalance in landslide susceptibil- ity zonation: under-sampling for class-imbalance learning. In: ISPRS - International Archives of the

work page 2020

[18] [18]

C., Cardinali, M., Fiorucci, F., Santangelo, M., Chang, K.-T., 2012

Guzzetti, F., Mondini, A. C., Cardinali, M., Fiorucci, F., Santangelo, M., Chang, K.-T., 2012. Landslide inventory maps: new tools for an old problem. Earth-Science Reviews 112, 42–66

work page 2012

[19] [19]

Learning from class-imbalanced data: review of methods and applications

Haixiang, G., Yijing, L., Shang, J., et al., 2017. Learning from class-imbalanced data: review of methods and applications. Expert Systems with Applications 73, 220–239

[20]

Learning from imbalanced data

He, H., Garcia, E. A., 2009. Learning from imbalanced data. IEEE Transactions on Knowledge and Data Engineering 21, 1263–1284

[21]

Accurate predictions on small data with a tabular foundation model

Hollmann, N., Müller, S., Purucker, L., Krishnakumar, A., Körfer, M., Hoo, S. B., Schirrmeister, R. T., Hutter, F., 2025. Accurate predictions on small data with a tabular foundation model. Nature 637 (8045), 319–326

[22]

Satellite remote sensing for global landslide monitoring

Hong, Y., Adler, R., Huffman, G., 2007. Satellite remote sensing for global landslide monitoring. Eos (Washington DC) 88, 357–358

[23]

Huang, L., Luo, J., Lin, Z., et al., 2020. Using deep learning to map retrogressive thaw slumps in the Beiluhe region (Tibetan Plateau) from CubeSat images. Remote Sensing of Environment 237, 111534

[24]

Different landslide sampling strategies in a grid-based bi-variate statistical susceptibility model

Hussin, H., Zumpano, V., Reichenbach, P., Sterlacchini, S., Micu, M., van Westen, C., Bălteanu, D., 2016. Different landslide sampling strategies in a grid-based bi-variate statistical susceptibility model. Geomorphology 253, 508–523

[25]

Modeling landslide susceptibility in data-scarce environments using optimized data mining and statistical methods

Lee, J., Sameen, M., Pradhan, B., Park, H., 2018. Modeling landslide susceptibility in data-scarce environments using optimized data mining and statistical methods. Geomorphology 303, 284–298

[26]

Exploratory undersampling for class-imbalance learning

Liu, X.-Y., Wu, J., Zhou, Z.-H., 2009. Exploratory undersampling for class-imbalance learning. IEEE Transactions on Systems, Man, and Cybernetics, Part B 39, 539–550

[27]

Machine learning for landslides prevention: a survey

Ma, Z., Mei, G., Piccialli, F., 2021. Machine learning for landslides prevention: a survey. Neural Computing and Applications 33, 10881–10907

[28]

Micheletti, N., Foresti, L., Robert, S., et al., 2014. Machine learning feature selection methods for landslide susceptibility mapping. Mathematical Geosciences 46, 33–57

[29]

Coupling different methods for overcoming the class imbalance problem

Nanni, L., Fantozzi, C., Lazzarini, N., 2015. Coupling different methods for overcoming the class imbalance problem. Neurocomputing 158, 48–61

[30]

Landslide susceptibility assessment by using convolutional neural network

Nikoobakht, S., Azarafza, M., Akgün, H., Derakhshani, R., 2022. Landslide susceptibility assessment by using convolutional neural network. Applied Sciences 12, 5992

[31]

Petschko, H., Brenning, A., Bell, R., et al., 2014. Assessing the quality of landslide susceptibility maps - case study Lower Austria. Natural Hazards and Earth System Sciences 14, 95–118

[32]

Ensemble learning

Polikar, R., 2012. Ensemble learning. In: Ensemble Machine Learning. Springer, pp. 1–34

[33]

Systematic sample subdividing strategy for training landslide susceptibility models

Sameen, M., Pradhan, B., Bui, D., Alamri, A., 2020. Systematic sample subdividing strategy for training landslide susceptibility models. Catena 187, 104358

[34]

Landslide susceptibility mapping based on weighted gradient boosting decision tree in Wanzhou section of the Three Gorges Reservoir Area (China)

Song, Y., Niu, R., Xu, S., et al., 2018. Landslide susceptibility mapping based on weighted gradient boosting decision tree in Wanzhou section of the Three Gorges Reservoir Area (China). ISPRS International Journal of Geo-Information 8, 4

[35]

The influence of systematically incomplete shallow landslide inventories on statistical susceptibility models and suggestions for improvements

Steger, S., Brenning, A., Bell, R., Glade, T., 2016. The influence of systematically incomplete shallow landslide inventories on statistical susceptibility models and suggestions for improvements. Landslides 14, 1767–1781

[36]

SVMs modeling for highly imbalanced classification

Tang, Y., Zhang, Y., Chawla, N., 2009. SVMs modeling for highly imbalanced classification. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 39, 281–288

[37]

Landslide shape, ellipticity and length-to-width ratios

Taylor, F. E., Malamud, B. D., Witt, A., Guzzetti, F., 2018. Landslide shape, ellipticity and length-to-width ratios. Earth Surface Processes and Landforms 43, 3164–3189

[38]

Optimizing the predictive ability of machine learning methods for landslide susceptibility mapping using SMOTE for Lishui City in Zhejiang Province, China

Wang, Y., Wu, X., Chen, Z., et al., 2019. Optimizing the predictive ability of machine learning methods for landslide susceptibility mapping using SMOTE for Lishui City in Zhejiang Province, China. International Journal of Environmental Research and Public Health 16, 368

[39]

Application of a two-step sampling strategy based on deep neural network for landslide susceptibility mapping

Yao, J., Qin, S., Qiao, S., et al., 2022. Application of a two-step sampling strategy based on deep neural network for landslide susceptibility mapping. Bulletin of Engineering Geology and the Environment 81, 148

[40]

Zhong, C., Liu, Y., Gao, P., et al., 2020. Landslide mapping with remote sensing: challenges and opportunities. International Journal of Remote Sensing 41, 1555–1581

[41]

A similarity-based approach to sampling absence data for landslide susceptibility mapping using data-driven methods

Zhu, A., Miao, Y., Liu, J., Bai, S., Zeng, C., Ma, T., Hong, H., 2019. A similarity-based approach to sampling absence data for landslide susceptibility mapping using data-driven methods. Catena 183, 104188
