Sample size effects on landslide susceptibility models: A comparative study of heuristic, statistical, machine learning, deep learning and ensemble learning models with SHAP analysis

Cited by: 1

Authors:

Yang, Shilong ^{[1]}

Tan, Jiayao ^{[1]}

Luo, Danyuan ^{[1]}

Wang, Yuzhou ^{[2,3]}

Guo, Xu ^{[1]}

Zhu, Qiuyu ^{[1,4]}

Ma, Chuanming ^{[1]}

Xiong, Hanxiang ^{[1]}

Affiliations:

[1] China Univ Geosci, Sch Environm Studies, Wuhan 430074, Peoples R China

[2] Eastern Inst Technol, Eastern Inst Adv Study, Ningbo 315200, Peoples R China

[3] Shanghai Jiao Tong Univ, Sch Environm Sci & Engn, Shanghai 200240, Peoples R China

[4] Hangzhou Yuhang Urban Dev Investment Grp Co Ltd, Hangzhou 311100, Peoples R China

Source:

COMPUTERS & GEOSCIENCES | 2024 / Vol. 193

Keywords:

Landslide susceptibility assessment; Model robustness; Inventory sample size; XGBoost and LightGBM; Explainable machine learning; ANALYTICAL HIERARCHY PROCESS; FREQUENCY RATIO MODEL; LOGISTIC-REGRESSION; NEURAL-NETWORKS; GIS; AREA; HAZARD; PROVINCE; BASIN; INDEX;

DOI:

10.1016/j.cageo.2024.105723

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

In landslide susceptibility assessment (LSA), inventory incompleteness impacts the accuracy of different models to varying degrees. However, this area remains under-researched. This study investigated six LSA models spanning heuristic, statistical, machine learning, deep learning, and ensemble learning approaches (analytical hierarchy process (AHP), frequency ratio (FR), logistic regression (LR), Keras-based deep learning (KBDL), XGBoost, and LightGBM) across six different sample sizes (100%, 90%, 75%, 50%, 25%, and 10%). Results revealed that XGBoost and LightGBM consistently outperformed the other models across all sample sizes. The LR and KBDL models followed, while the FR model was the most affected by sample size variations. AHP, an empirical model, remained unaffected by sample size. Through SHapley Additive exPlanations (SHAP) analysis, elevation, NDVI, slope, land use, and distance to roads and rivers emerged as pivotal indicators for landslide occurrences in the study area, suggesting that human activities significantly influence these events. Five time-varying indicators of human activity and climate validated this inference, which provides a new method to identify landslide triggering factors, especially in areas of intense human activity. Based on the findings, a comprehensive framework for LSA is proposed to assist landslide managers in making informed decisions. Future research should focus on expanding model diversity to address the effects of sample size, enhancing the adaptability of the LSA framework, deepening the analysis of human activity impacts on landslides using explainable machine learning techniques, addressing temporal inventory incompleteness in LSA, and critically evaluating model sensitivity to sample size variations across multiple disciplines.


Pages: 19
