Comparison of feature selection methods for mapping soil organic matter in subtropical restored forests

Cited by: 35
Authors
Chen, Yang [1 ,2 ]
Ma, Lixia [1 ]
Yu, Dongsheng [1 ,2 ]
Zhang, Haidong [3 ]
Feng, Kaiyue [1 ,2 ]
Wang, Xin [1 ,2 ]
Song, Jie [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Inst Soil Sci, State Key Lab Soil & Sustainable Agr, Nanjing 210008, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Suzhou Acad Agr Sci, Suzhou 215155, Peoples R China
Funding
National Science Foundation (USA);
Keywords
Variable selection; Machine learning algorithms; Ensemble methods; Digital soil mapping; Forest soil organic matter; SPATIAL PREDICTION; VARIABLE SELECTION; CARBON; MOUNTAINS; EROSION; JIANGXI; MAP;
DOI
10.1016/j.ecolind.2022.108545
Chinese Library Classification (CLC) number
X176 [Biodiversity conservation];
Discipline classification code
090705 ;
Abstract
Mapping soil organic matter (SOM) over a complex forest landscape is challenging because it is difficult to select the most informative variables from the high-dimensional datasets produced by the recent explosion of geospatial data. Feature selection (FS) is necessary to reduce data redundancy and noise and to achieve more reliable SOM spatial predictions. However, it remains unclear which FS method is most effective for mapping SOM. Therefore, four types of FS approaches (filter, wrapper, embedded and ensemble) were used to generate optimum variable subsets from an original dataset of 60 candidate variables for mapping SOM of restored forest land in a typical subtropical region of southern China. The most commonly used methods of each type were selected: three filters (Chi-square, InfoGain and Pearson correlation analysis), three wrappers (genetic algorithm, simulated annealing algorithm and support vector machine-recursive feature elimination), three embedded methods (Boruta, random forest (RF) and extreme gradient boosting (XGBoost)), and one ensemble method (robust rank aggregation (RRA) algorithm). The RF and XGBoost models were then applied with 10-fold cross-validation to compare the FS methods for SOM mapping, using the coefficient of determination (R2) between observed and predicted values and the root mean square error (RMSE) of prediction. The results show that SOM prediction accuracy with the optimized variable subsets generated by the different FS methods is better than with the full variable set, although the improvement in prediction performance differs among the four types of FS approaches.
The ensemble method (RRA) outperformed the other three types of approaches, with an average RMSE reduction of 9.16% compared with using no FS method, followed by the wrapper and embedded methods, with average RMSE reductions of 7.81% and 7.32%, respectively; the filter methods were the weakest, with a slight RMSE reduction of 4.32%. The XGBoost model predicted SOM better than the RF model regardless of the input variables, and XGBoost combined with the RRA FS method shows the greatest potential for mapping SOM in restored forest land. This study provides a reference for obtaining more parsimonious and robust variable sets from freely available big geo-data for soil mapping in other areas.
Pages: 12
Related papers
50 in total
  • [1] COMPARISON FEATURE SELECTION METHODS FOR SUBTROPICAL VEGETATION CLASSIFICATION WITH HYPERSPECTRAL DATA
    Li, Qiaosi
    Wong, Frankie Kwan Kit
    Fung, Tung
    2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019, : 3693 - 3696
  • [2] Effects of afforestation on soil organic matter characteristics under subtropical forests with low elevation
    Jien, Shih-Hao
    Chen, Tsai-Huei
    Chiu, Chih-Yu
    JOURNAL OF FOREST RESEARCH, 2011, 16 (04) : 275 - 283
  • [3] Labile carbon input and temperature effects on soil organic matter turnover in subtropical forests
    Li, Huan
    Liu, Guangli
    Luo, Haiping
    Zhang, Renduo
    ECOLOGICAL INDICATORS, 2022, 145
  • [4] Soil respiration related to the molecular composition of soil organic matter in subtropical and temperate forests under soil warming
    Liu, Yanchun
    Wang, Hui
    Schindlbacher, Andreas
    Liu, Shirong
    Yang, Yujing
    Tian, Huimin
    Chen, Lin
    Ming, Angang
    Wang, Jian
    Li, Jiachen
    Tian, Zuwei
    SOIL BIOLOGY & BIOCHEMISTRY, 2025, 201
  • [5] Response of Humic Acids and Soil Organic Matter to Vegetation Replacement in Subtropical High Mountain Forests
    Wang, Hsueh-Ching
    Tian, Guanglong
    Chen, Chiou-Pin
    Chang, Ed-Haun
    Chou, Chiao-Ying
    Chiou, Chyi-Rong
    Chiu, Chih-Yu
    JOURNAL OF GEOPHYSICAL RESEARCH-BIOGEOSCIENCES, 2019, 124 (12) : 3727 - 3736
  • [6] Soil organic matter quantity and quality shape microbial community compositions of subtropical broadleaved forests
    Ding, Junjun
    Zhang, Yuguang
    Wang, Mengmeng
    Sun, Xin
    Cong, Jing
    Deng, Ye
    Lu, Hui
    Yuan, Tong
    Van Nostrand, Joy D.
    Li, Diqiang
    Zhou, Jizhong
    Yang, Yunfeng
    MOLECULAR ECOLOGY, 2015, 24 (20) : 5175 - 5185
  • [7] Topography and Soil Organic Carbon in Subtropical Forests of China
    Zhou, Tao
    Lv, Yulong
    Xie, Binglou
    Xu, Lin
    Zhou, Yufeng
    Mei, Tingting
    Li, Yongfu
    Yuan, Ning
    Shi, Yongjun
    FORESTS, 2023, 14 (05):
  • [8] Comparison of soil organic matter in created, restored and paired natural wetlands in North Carolina
    Bruland G.L.
    Richardson C.J.
    Wetlands Ecology and Management, 2006, 14 (3) : 245 - 251
  • [9] Higher endogenous labile organic carbon decreases the temperature sensitivity of soil organic matter decomposition in two subtropical forests
    Ma, Di
    Sun, Yu
    Liu, Min
    Fang, Huajun
    Xu, Xingliang
    APPLIED SOIL ECOLOGY, 2025, 206
  • [10] The Role of Bedrock Geochemistry and Climate in Soil Organic Matter Stability in Subtropical Karst Forests of Southwest China
    Tang, Tiangang
    Hu, Peilei
    Zhang, Wei
    Xiao, Dan
    Tang, Li
    Xiao, Jun
    Zhao, Jie
    Wang, Kelin
    FORESTS, 2023, 14 (07):