Comparison of tree-based ensemble models for regression

Cited by: 2
Authors
Park, Sangho [1]
Kim, Chanmin [1, 2]
Affiliations
[1] Sungkyunkwan Univ, Dept Stat, Seoul, South Korea
[2] Sungkyunkwan Univ, Dept Stat, 25-2 Seonggyungwan-ro, Seoul 03063, South Korea
Funding
National Research Foundation, Singapore
Keywords
Bayesian additive regression trees; random forest; missingness; high-dimensional data; multicollinearity; missing data
DOI
10.29220/CSAM.2022.29.5.561
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Tree-based ensemble models, such as random forest (RF) and Bayesian additive regression trees (BART), are produced by combining multiple classification and regression trees. In this study, we compare the model structures and performance of these ensemble models in regression settings. RF fits each tree to a bootstrapped sample and selects the splitting variable at each node from a randomly drawn subset of predictors. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, we investigate the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART outperforms RF on high-dimensional and highly correlated data. However, RF has a shorter computation time in all of the scenarios considered. The performance of the two methods is also compared on two real data sets that represent the aforementioned situations, and the same conclusions are reached.
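The RF mechanism described in the abstract (a bootstrapped sample per tree, plus a random subset of candidate predictors for the split) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses one-split trees (stumps) instead of full regression trees to stay short, relies only on the Python standard library, and every name in it (`fit_stump`, `fit_forest`, `mtry`, the toy data) is hypothetical and chosen for illustration. BART's Bayesian backfitting is not covered here.

```python
import random
from statistics import mean


def fit_stump(X, y, feature_ids):
    """Fit a one-split regression tree using only the candidate features."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for j in feature_ids:
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        # No valid split (all candidate features constant): predict the mean.
        m = mean(y)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, left_mean, right_mean = stump
    return left_mean if row[j] <= t else right_mean


def fit_forest(X, y, n_trees=50, mtry=1, seed=0):
    """Bagged stumps with a random predictor subset per tree (assumed setup)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (with replacement)
        feats = rng.sample(range(p), mtry)          # random subset of predictors
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, row):
    # The ensemble prediction is the average over all trees.
    return mean(predict_stump(s, row) for s in forest)


# Toy data (made up): the response depends on the first predictor only.
X = [[i / 20, (i * 7 % 20) / 20] for i in range(20)]
y = [1.0 if row[0] >= 0.5 else 0.0 for row in X]

forest = fit_forest(X, y, n_trees=50, mtry=1, seed=0)
low = predict_forest(forest, [0.1, 0.5])
high = predict_forest(forest, [0.9, 0.5])
print(low, high)
```

In a full random forest the predictor subset is redrawn at every node of every tree; with stumps there is only one node per tree, so drawing once per tree is equivalent in this sketch.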
Pages: 561-590
Number of pages: 30
Related papers
50 in total
  • [1] Tree-Based Ensemble Models and Algorithms for Classification
    Tsiligaridis, J.
    [J]. 2023 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE IN INFORMATION AND COMMUNICATION, ICAIIC, 2023, : 103 - 106
  • [2] A note on the interpretation of tree-based regression models
    Gottard, Anna
    Vannucci, Giulia
    Marchetti, Giovanni Maria
    [J]. BIOMETRICAL JOURNAL, 2020, 62 (06) : 1564 - 1573
  • [3] Inductive learning of tree-based regression models
    Torgo, L
    [J]. AI COMMUNICATIONS, 2000, 13 (02) : 137 - 138
  • [4] Tree-Based Models for Fitting Stratified Linear Regression Models
    William D. Shannon
    Maciej Faifer
    Michael A. Province
    D. C. Rao
    [J]. Journal of Classification, 2002, 19 : 113 - 130
  • [5] Tree-based models for fitting stratified linear regression models
    Shannon, WD
    Faifer, M
    Province, MA
    Rao, DC
    [J]. JOURNAL OF CLASSIFICATION, 2002, 19 (01) : 113 - 130
  • [6] Tree-based ensemble methods for predicting PV power generation and their comparison with support vector regression
    Ahmad, Muhammad Waseem
    Mourshed, Monjur
    Rezgui, Yacine
    [J]. ENERGY, 2018, 164 : 465 - 474
  • [7] Variable Selection Issues in Tree-Based Regression Models
    Qin, Xiao
    Han, Junhee
    [J]. TRANSPORTATION RESEARCH RECORD, 2008, National Research Council (2061) : 30 - 38
  • [8] Regression tree-based diagnostics for linear multilevel models
    Simonoff, Jeffrey S.
    [J]. STATISTICAL MODELLING, 2013, 13 (5-6) : 459 - 480
  • [9] Comparison of regression tree-based methods in genomic selection
    Ashoori-Banaei, Sahar
    Ghafouri-Kesbi, Farhad
    Ahmadi, Ahmad
    [J]. JOURNAL OF GENETICS, 2021, 100 (02)
  • [10] Comparison of regression tree-based methods in genomic selection
    Sahar Ashoori-Banaei
    Farhad Ghafouri-Kesbi
    Ahmad Ahmadi
    [J]. Journal of Genetics, 2021, 100