Investigating landslide data balancing for susceptibility mapping using generative and machine learning models

Cited by: 0
Authors
Jiang, Yuhang [1 ,2 ]
Wang, Wei [1 ,2 ]
Zou, Lifang [3 ]
Cao, Yajun [1 ,2 ]
Xie, Wei-Chau [4 ]
Affiliations
[1] Hohai Univ, Geotech Res Inst, Nanjing 210098, Jiangsu, Peoples R China
[2] Hohai Univ, Key Lab Minist Educ Geomech & Embankment Engn, Nanjing 210098, Jiangsu, Peoples R China
[3] Hohai Univ, Sch Earth Sci & Engn, Nanjing 211100, Jiangsu, Peoples R China
[4] Univ Waterloo, Dept Civil & Environm Engn, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
Funding
National Natural Science Foundation of China
Keywords
Landslide susceptibility mapping; Conditional Tabular Generative Adversarial Networks; Convolutional Neural Network; Long Short-Term Memory Neural Network; Self-training semi-supervised SVM algorithm; NETWORK;
DOI
10.1007/s10346-024-02352-3
Chinese Library Classification number
P5 [Geology]
Discipline classification codes
0709; 081803
Abstract
Machine learning has driven significant advances in landslide susceptibility mapping. However, because field landslide investigations are difficult, susceptibility mapping usually suffers from too few landslide samples (positive samples) and unreliable non-landslide samples (negative samples). Taking Lianghe County in Yunnan Province, China, as an example, this paper evaluates the effectiveness of three oversampling models in generating positive landslide samples: Conditional Tabular Generative Adversarial Networks (CTGAN), Generative Adversarial Networks (GAN), and the traditional Synthetic Minority Oversampling Technique (SMOTE). Three machine learning classifiers are used for landslide susceptibility assessment: a 1D Convolutional Neural Network-Long Short-Term Memory Neural Network (1D CNN-LSTM), Random Forest (RF), and Gradient Boosting Decision Tree (GBDT). We also devise a screening method for non-landslide data (negative samples) that uses a self-trained support vector machine within a semi-supervised framework. The results show that training on the dataset after negative sample screening raises the AUC values of the 1D CNN-LSTM, RF, and GBDT models from (0.778, 0.869, 0.849) to (0.837, 0.936, 0.877), respectively. Compared with the original training set, the prediction accuracy of all three machine learning models improves after training on data augmented by CTGAN, GAN, or SMOTE. The RF model, augmented with 200 positive samples generated by CTGAN, achieves the highest prediction accuracy in the study (AUC = 0.962). The 1D CNN-LSTM model achieves its highest prediction accuracy (AUC = 0.953) when augmented with 200 positive samples from GAN. Similarly, the GBDT model reaches its highest prediction accuracy (AUC = 0.928) when augmented with 200 positive samples created by SMOTE. In addition, the spatial distribution of the data indicates that the data generated by the generative adversarial model exhibit higher diversity and can be used for landslide susceptibility assessment.
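As a rough illustration of the SMOTE baseline named in the abstract, the sketch below generates synthetic positive samples by interpolating between a minority sample and one of its nearest minority neighbours. It is a minimal pure-Python sketch, not the authors' implementation: the function name `smote_sketch`, the three toy features, and the values of `k` and `seed` are all assumptions made for illustration. Only the count of 200 generated positives follows the abstract.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority neighbours.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Assumed toy positives: slope (deg), elevation (m), distance to river (m).
landslides = [
    [32.0, 1450.0, 120.0],
    [28.5, 1390.0, 300.0],
    [35.2, 1510.0, 80.0],
    [30.1, 1475.0, 210.0],
    [26.8, 1320.0, 260.0],
    [33.4, 1405.0, 150.0],
]
new_positives = smote_sketch(landslides, n_new=200, k=3, seed=42)
print(len(new_positives))
```

Because each synthetic row lies on a line segment between two real positives, SMOTE cannot leave the region spanned by the observed landslides, which may be one reason the abstract reports higher diversity for the data from the generative adversarial model. In practice the features would normally be scaled before computing distances, since here elevation dominates the neighbour search.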
Pages: 189-204
Number of pages: 16