Random forests for classification in ecology

Cited by: 3402
Authors
Cutler, D. Richard [1]
Edwards, Thomas C., Jr.
Beard, Karen H.
Cutler, Adele
Hess, Kyle T.
Affiliations
[1] Utah State Univ, Dept Math & Stat, Logan, UT 84322 USA
[2] Utah State Univ, Utah Cooperat Fish & Wildlife Res Unit, US Geol Survey, Logan, UT 84322 USA
[3] Utah State Univ, Dept Wildland Resources & Ecol Ctr, Logan, UT 84322 USA
[4] Utah State Univ, Dept Math & Stat, Logan, UT 84322 USA
[5] Utah State Univ, Dept Wildland Resources, Logan, UT 84322 USA
[6] Univ Washington, Coll Forest Resources, Seattle, WA 98195 USA
Keywords
additive logistic regression; classification trees; LDA; logistic regression; machine learning; partial dependence plots; random forests; species distribution models
DOI
10.1890/07-0539.1
Chinese Library Classification number
Q14 [Ecology (bioecology)]
Discipline classification codes
071012; 0713
Abstract
Classification procedures are some of the most widely used statistical methods in ecology. Random forests (RF) is a new and powerful statistical classifier that is well established in other disciplines but is relatively unknown in ecology. Advantages of RF compared to other statistical classifiers include (1) very high classification accuracy; (2) a novel method of determining variable importance; (3) ability to model complex interactions among predictor variables; (4) flexibility to perform several types of statistical data analysis, including regression, classification, survival analysis, and unsupervised learning; and (5) an algorithm for imputing missing values. We compared the accuracies of RF and four other commonly used statistical classifiers using data on invasive plant species presence in Lava Beds National Monument, California, USA, rare lichen species presence in the Pacific Northwest, USA, and nest sites for cavity nesting birds in the Uinta Mountains, Utah, USA. We observed high classification accuracy in all applications as measured by cross-validation and, in the case of the lichen data, by independent test data, when comparing RF to other common classification methods. We also observed that the variables that RF identified as most important for classifying invasive plant species coincided with expectations based on the literature.
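The abstract names out-of-sample accuracy and a variable importance measure as two advantages of RF. The following is a minimal pure-Python sketch of both ideas, not the authors' software or data. It assumes synthetic presence/absence data with two informative and two noise predictors, scores the forest on out-of-bag (OOB) cases rather than by the cross-validation used in the paper, and uses permutation importance as the importance measure. All function names and parameter values (`make_data`, `fit_forest`, `mtry`, and so on) are hypothetical.

```python
import random
from collections import Counter

def make_data(n=120, seed=42):
    # Synthetic data (an assumption): presence depends on an interaction
    # between predictors 0 and 1; predictors 2 and 3 are pure noise.
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(4)] for _ in range(n)]
    y = [int(row[0] > 0.4 and row[1] > 0.4) for row in X]
    return X, y

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    # Search midpoints between distinct values for the lowest weighted Gini.
    best = None
    for f in features:
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best

def build_tree(X, y, rng, mtry, depth=0, max_depth=5):
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(set(y)) == 1:
        return majority
    # Each node considers only a random subset of mtry predictors.
    split = best_split(X, y, rng.sample(range(len(X[0])), mtry))
    if split is None:
        return majority
    _, f, t = split
    left = [i for i, row in enumerate(X) if row[f] <= t]
    right = [i for i, row in enumerate(X) if row[f] > t]
    grow = lambda idx: build_tree([X[i] for i in idx], [y[i] for i in idx],
                                  rng, mtry, depth + 1, max_depth)
    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    # Internal nodes are tuples; leaves are class labels.
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def fit_forest(X, y, n_trees=30, mtry=2, seed=0):
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
        oob = sorted(set(range(n)) - set(idx))       # cases left out of it
        tree = build_tree([X[i] for i in idx], [y[i] for i in idx], rng, mtry)
        forest.append((tree, oob))
    return forest

def oob_accuracy(forest, X, y):
    # Each case is voted on only by trees that did not train on it.
    votes = [Counter() for _ in X]
    for tree, oob in forest:
        for i in oob:
            votes[i][predict_tree(tree, X[i])] += 1
    scored = [(v.most_common(1)[0][0], y[i]) for i, v in enumerate(votes) if v]
    return sum(p == t for p, t in scored) / len(scored)

def permutation_importance(forest, X, y, seed=1):
    # Mean drop in per-tree OOB accuracy after shuffling one predictor.
    rng = random.Random(seed)
    importance = [0.0] * len(X[0])
    for tree, oob in forest:
        if not oob:
            continue
        base = sum(predict_tree(tree, X[i]) == y[i] for i in oob) / len(oob)
        for f in range(len(X[0])):
            shuffled = [X[i][f] for i in oob]
            rng.shuffle(shuffled)
            correct = 0
            for i, v in zip(oob, shuffled):
                row = list(X[i])
                row[f] = v
                correct += predict_tree(tree, row) == y[i]
            importance[f] += (base - correct / len(oob)) / len(forest)
    return importance

if __name__ == "__main__":
    X, y = make_data()
    forest = fit_forest(X, y)
    print("OOB accuracy:", oob_accuracy(forest, X, y))
    print("Permutation importance:", permutation_importance(forest, X, y))
```

In this toy setting the two informative predictors should receive clearly larger importance scores than the noise predictors, which mirrors the kind of check against prior expectations that the abstract describes for the invasive plant data.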
Pages: 2783-2792
Page count: 10
Related papers
50 in total
  • [1] Ecology and classification of forests in Turkey
    Atalay, Ibrahim
    Efe, Recep
    Ozturk, Munir
    3RD INTERNATIONAL GEOGRAPHY SYMPOSIUM, GEOMED2013, 2014, 120 : 788 - 805
  • [2] Oxides Classification with Random Forests
    Xiao, Kai
    Chen, Baitong
    Bao, Wenzheng
    Cheng, Honglin
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2022, PT II, 2022, 13394 : 680 - 686
  • [3] Classification and interaction in random forests
    Denisko, Danielle
    Hoffman, Michael M.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2018, 115 (08) : 1690 - 1692
  • [4] Random forests for multiclass classification: Random MultiNomial Logit
    Prinzie, Anita
    Van den Poel, Dirk
    EXPERT SYSTEMS WITH APPLICATIONS, 2008, 34 (03) : 1721 - 1732
  • [5] Ontological Random Forests for Image Classification
    Xu, Ning
    Wang, Jiangping
    Qi, Guojun
    Huang, Thomas
    Lin, Weiyao
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2015, 5 (03) : 61 - 74
  • [6] Classification Using Streaming Random Forests
    Abdulsalam, Hanady
    Skillicorn, David B.
    Martin, Patrick
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2011, 23 (01) : 22 - 36
  • [7] Random Forests for land cover classification
    Gislason, PO
    Benediktsson, JA
    Sveinsson, JR
    PATTERN RECOGNITION LETTERS, 2006, 27 (04) : 294 - 300
  • [8] Random forests for land cover classification
    Pal, M
    IGARSS 2003: IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS I - VII, PROCEEDINGS: LEARNING FROM EARTH'S SHAPES AND SIZES, 2003, : 3510 - 3512
  • [9] Improving Classification Trustworthiness in Random Forests
    de Biase, Maria Stella
    Marulli, Fiammetta
    Verde, Laura
    Marrone, Stefano
    PROCEEDINGS OF THE 2021 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE (IEEE CSR), 2021, : 563 - 568
  • [10] Random multiclass classification Generalizing random forests to random MNL and random NB
    Prinzie, Anita
    Van den Poel, Dirk
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2007, 4653 : 349 - +