Random forests for classification in ecology

Cited by: 3409

Authors
Cutler, D. Richard [1]
Edwards, Thomas C., Jr.
Beard, Karen H.
Cutler, Adele
Hess, Kyle T.
Affiliations
[1] Utah State Univ, Dept Math & Stat, Logan, UT 84322 USA
[2] Utah State Univ, Utah Cooperat Fish & Wildlife Res Unit, US Geol Survey, Logan, UT 84322 USA
[3] Utah State Univ, Dept Wildland Resources & Ecol Ctr, Logan, UT 84322 USA
[4] Utah State Univ, Dept Math & Stat, Logan, UT 84322 USA
[5] Utah State Univ, Dept Wildland Resources, Logan, UT 84322 USA
[6] Univ Washington, Coll Forest Resources, Seattle, WA 98195 USA
Keywords
additive logistic regression; classification trees; LDA; logistic regression; machine learning; partial dependence plots; random forests; species distribution models
DOI
10.1890/07-0539.1
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.
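The workflow the abstract summarizes (fit a random forest classifier, then estimate its accuracy by cross-validation) can be illustrated with a minimal sketch. This is not the authors' code or data: the presence/absence labels are synthetic, the function names (`fit_forest`, `cross_val_accuracy`, and so on) are hypothetical, and the settings (25 trees, 2 candidate predictors per split, depth limit 4) are arbitrary choices for the illustration. It uses only the Python standard library.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return (impurity, feature, threshold) for the best split, or None."""
    best = None
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def grow_tree(rows, labels, rng, mtry, depth=0, max_depth=4):
    """Grow one tree; a random subset of mtry predictors is tried at each node."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, rng.sample(range(len(rows[0])), mtry))
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, r in enumerate(rows) if r[f] <= t]
    right = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left],
                      rng, mtry, depth + 1, max_depth),
            grow_tree([rows[i] for i in right], [labels[i] for i in right],
                      rng, mtry, depth + 1, max_depth))

def predict_tree(node, row):
    """Follow splits until a leaf (a bare class label) is reached."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(rows, labels, n_trees=25, mtry=2, seed=0):
    """Fit each tree to a bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], rng, mtry))
    return forest

def predict_forest(forest, row):
    """Classify by majority vote over the trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def cross_val_accuracy(rows, labels, k=5, seed=0):
    """k-fold cross-validated classification accuracy."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in range(k):
        test = set(idx[fold::k])
        train = [i for i in idx if i not in test]
        forest = fit_forest([rows[i] for i in train],
                            [labels[i] for i in train], seed=seed + fold)
        correct += sum(predict_forest(forest, rows[i]) == labels[i] for i in test)
    return correct / len(rows)

# Synthetic data (an assumption for this sketch): three predictors, and
# "presence" (1) is determined by the first predictor only.
data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(120)]
labels = [int(r[0] > 0.5) for r in rows]

acc = cross_val_accuracy(rows, labels)
print(f"5-fold cross-validated accuracy: {acc:.2f}")
```

In practice an established implementation would normally be used instead of a hand-written one; the sketch is only meant to make the bootstrap, random predictor subset, majority vote, and cross-validation steps concrete.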
Pages: 2783-2792
Page count: 10
Related papers
50 in total
  • [31] Web Document Classification by Keywords Using Random Forests
    Klassen, Myungsook
    Paturi, Nikhila
    NETWORKED DIGITAL TECHNOLOGIES, PT 2, 2010, 88 : 256 - 261
  • [32] Baker's Cyst Classification Using Random Forests
    Ciszkiewicz, Adam
    Milewski, Grzegorz
    Lorkowski, Jacek
    PROCEEDINGS OF THE 2018 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS), 2018, : 97 - 100
  • [33] On the Capability of Classification Trees and Random Forests to Estimate Probabilities
    Plante, Jean-Francois
    Radatz, Marisa
    JOURNAL OF STATISTICAL THEORY AND PRACTICE, 2024, 18 (02)
  • [34] Random Forests Classification Analysis for the Assessment of Diagnostic Skill
    Katz, James D.
    Mamyrova, Gulnara
    Guzhva, Olena
    Furmark, Lena
    AMERICAN JOURNAL OF MEDICAL QUALITY, 2010, 25 (02) : 149 - 153
  • [35] Pathway analysis using random forests classification and regression
    Pang, Herbert
    Lin, Aiping
    Holford, Matthew
    Enerson, Bradley E.
    Lu, Bin
    Lawton, Michael P.
    Floyd, Eugenia
    Zhao, Hongyu
    BIOINFORMATICS, 2006, 22 (16) : 2028 - 2036
  • [36] A Meta-Analysis of Research in Random Forests for Classification
    Pretorius, Arnu
    Bierman, Surette
    Steel, Sarel J.
    2016 PATTERN RECOGNITION ASSOCIATION OF SOUTH AFRICA AND ROBOTICS AND MECHATRONICS INTERNATIONAL CONFERENCE (PRASA-ROBMECH), 2016,
  • [37] Towards Convergence Rate Analysis of Random Forests for Classification
    Gao, Wei
    Zhou, Zhi-Hua
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [38] Capsule Endoscopy Images Classification by Random Forests and Ferns
    Li, Baopu
    Zhou, Ran
    Yang, Can
    Meng, Max Q. -H.
    Xu, Guoqing
    Hu, Chao
    2014 4TH IEEE INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND TECHNOLOGY (ICIST), 2014, : 414 - 417
  • [39] Towards convergence rate analysis of random forests for classification
    Gao W.
    Xu F.
    Zhou Z.-H.
    Artificial Intelligence, 2022, 313
  • [40] Classification of Immunosignature Using Random Forests for Cancer Diagnosis
    Zarzar, Mouayad
    Razak, Eliza
    Htike, Zaw Zaw
    Yusof, Faridah
    ADVANCED SCIENCE LETTERS, 2015, 21 (11) : 3449 - 3452