Comparative performance assessment of landslide susceptibility models with presence-only, presence-absence, and pseudo-absence data

Cited by: 11
Authors
Zhao, Dong-mei [1]
Jiao, Yuan-mei [1]
Wang, Jin-liang [1]
Ding, Yin-ping [1]
Liu, Zhi-lin [1]
Liu, Cheng-jing [1]
Qiu, Ying-mei [1]
Zhang, Juan [1]
Xu, Qiu-e [1]
Wu, Chang-run [1]
Affiliations
[1] Yunnan Normal Univ, Sch Tourism & Geog Sci, 768 Juxian, Kunming 650500, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Landslide susceptibility mapping; Presence-only data; Presence-absence data; Pseudo-absence data; ROC-AUC; DATA-MINING TECHNIQUES; LOGISTIC-REGRESSION; RANDOM FOREST; SPECIES DISTRIBUTIONS; DECISION-TREE; SPATIAL PREDICTION; ENTROPY; GIS; CLASSIFICATION; MOUNTAINS;
DOI
10.1007/s11629-020-6277-y
Chinese Library Classification number
X [Environmental Science, Safety Science];
Discipline classification code
08 ; 0830 ;
Abstract
The quality of the input data plays an important role in statistical landslide susceptibility mapping, and how different data types influence the resulting maps is worth studying. The goal of this study was to explore the effects of three data types, namely presence-only (PO), presence-absence (PA), and pseudo-absence (PAs) data, on the predictive capability of landslide susceptibility mapping. This was done through a case study in the landslide-prone Honghe County, Yunnan Province, China. A total of 428 landslide PO data points were selected. An equivalent number of non-landslide locations were generated as PA data by random sampling, and 10,000 sites were uniformly selected at random from each region as PAs data. Three landslide susceptibility models corresponding to the three data types were investigated: the information value model (IVM), the logistic regression (LR) model, and the maximum entropy (MaxEnt) model. Model performance was evaluated using the area under the receiver operating characteristic curve (ROC-AUC), seven statistical indices (accuracy, sensitivity, false-positive rate, specificity, precision, Kappa, and F-measure), and a landslide density analysis. Our results indicated that the MaxEnt model using PAs data performed best and achieved the best fit, with the highest ROC-AUC values and statistical indices, followed by the IVM using only landslide (PO) data and then the LR model using PA data. Using PAs data avoided the inherent over-prediction of PO data by limiting the predicted area of high landslide susceptibility. In addition, the random sampling design of the landslide PA data increased the uncertainty of landslide susceptibility mapping and influenced model performance. Our results therefore suggest that PAs data sampling provides a useful data type in the absence of high-quality data. Finally, we summarized the principles, advantages, and disadvantages of the three data types to assist with model optimization and the improvement of predictive performance and fit.
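The seven statistical indices named in the abstract are commonly derived from a binary confusion matrix (landslide vs. non-landslide). The following is a minimal sketch assuming the standard textbook definitions; this record does not give the paper's exact formulas, and the function name `classification_indices` and the example counts are hypothetical.

```python
def classification_indices(tp, fp, tn, fn):
    """Compute seven common indices from confusion-matrix counts.

    tp/fn: landslide cells predicted as landslide / non-landslide
    tn/fp: non-landslide cells predicted as non-landslide / landslide
    Assumes every denominator is non-zero (no empty class or prediction).
    """
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)          # true-positive rate (recall)
    specificity = tn / (tn + fp)          # true-negative rate
    false_positive_rate = fp / (fp + tn)  # equals 1 - specificity
    precision = tp / (tp + fp)
    # Cohen's Kappa: agreement beyond what chance alone would produce
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - expected) / (1 - expected)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "false_positive_rate": false_positive_rate,
        "specificity": specificity,
        "precision": precision,
        "kappa": kappa,
        "f_measure": f_measure,
    }


# Hypothetical counts for illustration only (not taken from the paper)
indices = classification_indices(tp=80, fp=10, tn=90, fn=20)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With PO or PAs data the "negative" class consists of background or pseudo-absence points rather than confirmed non-landslide sites, so indices that depend on true negatives (specificity, false-positive rate, Kappa) should be read with that caveat in mind.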
Pages: 2961-2981
Number of pages: 21
Related papers
50 in total
  • [1] Comparative performance assessment of landslide susceptibility models with presence-only, presence-absence, and pseudo-absence data
    Dong-mei Zhao
    Yuan-mei Jiao
    Jin-liang Wang
    Yin-ping Ding
    Zhi-lin Liu
    Cheng-jing Liu
    Ying-mei Qiu
    Juan Zhang
    Qiu-e Xu
    Chang-run Wu
    Journal of Mountain Science, 2020, 17 : 2961 - 2981
  • [2] Comparative performance assessment of landslide susceptibility models with presence-only, presence-absence, and pseudo-absence data
    ZHAO Dong-mei
    JIAO Yuan-mei
    WANG Jin-liang
    DING Yin-ping
    LIU Zhi-lin
    LIU Cheng-jing
    QIU Ying-mei
    ZHANG Juan
    XU Qiu-e
    WU Chang-run
    Journal of Mountain Science, 2020, 17 (12) : 2961 - 2981
  • [3] Comparison of the presence-only method and presence-absence method in landslide susceptibility mapping
    Zhu, A-Xing
    Miao, Yamin
    Yang, Lin
    Bai, Shibiao
    Liu, Junzhi
    Hong, Haoyuan
    CATENA, 2018, 171 : 222 - 233
  • [4] POISSON POINT PROCESS MODELS SOLVE THE "PSEUDO-ABSENCE PROBLEM" FOR PRESENCE-ONLY DATA IN ECOLOGY
    Warton, David I.
    Shepherd, Leah C.
    ANNALS OF APPLIED STATISTICS, 2010, 4 (03) : 1383 - 1402
  • [5] Sample selection bias and presence-only distribution models: implications for background and pseudo-absence data
    Phillips, Steven J.
    Dudik, Miroslav
    Elith, Jane
    Graham, Catherine H.
    Lehmann, Anthony
    Leathwick, John
    Ferrier, Simon
    ECOLOGICAL APPLICATIONS, 2009, 19 (01) : 181 - 197
  • [6] Pattern-recognition ecological niche models fit to presence-only and presence-absence data
    Maher, Sean P.
    Randin, Christophe F.
    Guisan, Antoine
    Drake, John M.
    METHODS IN ECOLOGY AND EVOLUTION, 2014, 5 (08) : 761 - 770
  • [7] PRESENCE-ONLY AND PRESENCE-ABSENCE DATA FOR COMPARING SPECIES DISTRIBUTION MODELING METHODS
    Elith, Jane
    Graham, Catherine
    Valavi, Roozbeh
    Abegg, Meinrad
    Bruce, Caroline
    Ford, Andrew
    Guisan, Antoine
    Hijmans, Robert J.
    Huettmann, Falk
    Lohmann, Lucia
    Loiselle, Bette
    Moritz, Craig
    Overton, Jake
    Peterson, A. Townsend
    Phillips, Steven
    Richardson, Karen
    Williams, Stephen
    Wiser, Susan K.
    Wohlgemuth, Thomas
    Zimmermann, Niklaus E.
    Ferrier, Simon
    BIODIVERSITY INFORMATICS, 2020, 15 (02) : 69 - 80
  • [8] Presence-only versus presence-absence data in species composition determinant analyses
    Kent, Rafi
    Carmel, Yohay
    DIVERSITY AND DISTRIBUTIONS, 2011, 17 (03) : 474 - 479
  • [9] Preferential sampling for presence/absence data and for fusion of presence/absence data with presence-only data
    Gelfand, Alan E.
    Shirota, Shinichiro
    ECOLOGICAL MONOGRAPHS, 2019, 89 (03)
  • [10] Contraction in the range of Malleefowl (Leipoa ocellata) in Western Australia: a comparative assessment using presence-only and presence-absence datasets
    Parsons, Blair C.
    Short, Jeff C.
    Roberts, J. Dale
    EMU-AUSTRAL ORNITHOLOGY, 2008, 108 (03) : 221 - 231