Regression Trees and Ensemble for Multivariate Outcomes

Cited by: 1
Authors
Reynolds, Evan L. [1 ]
Callaghan, Brian C. [1 ]
Gaies, Michael [2 ]
Banerjee, Mousumi [3 ]
Affiliations
[1] Univ Michigan, Dept Neurol, Ann Arbor, MI 48109 USA
[2] Univ Cincinnati, Dept Pediat, Ann Arbor, MI USA
[3] Univ Michigan, Dept Biostat, Ann Arbor, MI USA
Funding
National Institutes of Health (NIH), USA;
Keywords
Multivariate outcomes; regression trees; Mahalanobis distance; clinical interpretability; machine learning; NEUROPATHY; PREVALENCE; OBESITY;
DOI
10.1007/s13571-023-00301-z
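Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes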
020208 ; 070103 ; 0714 ;
Abstract
Tree-based methods have become one of the most flexible, intuitive, and powerful analytic tools for exploring complex data structures. The best documented, and arguably most popular, uses of tree-based methods are in biomedical research, where multivariate outcomes occur commonly (e.g., diastolic and systolic blood pressure, or nerve conduction measures in studies of neuropathy). Existing tree-based methods for multivariate outcomes do not appropriately take into account the correlation that exists in such data. In this paper, we develop goodness-of-split measures for building multivariate regression trees for continuous multivariate outcomes. We propose two general approaches: maximizing within-node homogeneity (equivalently, minimizing within-node heterogeneity) and maximizing between-node separation. Within-node homogeneity is measured using the average Mahalanobis distance and the determinant of the variance-covariance matrix. Between-node separation is measured using the Mahalanobis distance, the Euclidean distance, and the standardized Euclidean distance. To enhance prediction accuracy, we extend the single multivariate regression tree to an ensemble of multivariate trees. Extensive simulations are presented to examine the properties of our goodness-of-split measures. Finally, the proposed methods are illustrated using two clinical datasets, one on neuropathy and one on pediatric cardiac surgery.
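To make the two families of goodness-of-split measures in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation. The function names (`between_node_mahalanobis`, `within_node_det`, `best_split`), the use of a pooled covariance matrix, the size-weighted averaging of determinants, and the `min_node` parameter are all assumptions made for illustration, since the abstract does not give the exact formulas.

```python
import numpy as np

def _cov(y):
    # Sample variance-covariance matrix of the outcomes (rows = subjects).
    return np.atleast_2d(np.cov(y, rowvar=False))

def between_node_mahalanobis(y_left, y_right):
    """Squared Mahalanobis distance between the child-node mean vectors.
    Assumption: the pooled within-node covariance is used as the metric."""
    n_l, n_r = len(y_left), len(y_right)
    diff = y_left.mean(axis=0) - y_right.mean(axis=0)
    pooled = ((n_l - 1) * _cov(y_left) + (n_r - 1) * _cov(y_right)) / (n_l + n_r - 2)
    return float(diff @ np.linalg.pinv(pooled) @ diff)

def within_node_det(y_left, y_right):
    """Size-weighted average of the determinants of the child-node
    covariance matrices; smaller values mean more homogeneous nodes.
    Assumption: the weighting scheme is illustrative only."""
    n_l, n_r = len(y_left), len(y_right)
    dets = np.linalg.det(_cov(y_left)), np.linalg.det(_cov(y_right))
    return float((n_l * dets[0] + n_r * dets[1]) / (n_l + n_r))

def best_split(x, y, score=between_node_mahalanobis, maximize=True, min_node=5):
    """Scan candidate thresholds on a single predictor x and return
    (threshold, score) for the split that optimizes the chosen measure."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    sign = 1.0 if maximize else -1.0
    best_thr, best_val = None, -np.inf
    for i in range(min_node, len(xs) - min_node + 1):
        if xs[i] == xs[i - 1]:
            continue  # no threshold can separate tied predictor values
        val = sign * score(ys[:i], ys[i:])
        if val > best_val:
            best_thr, best_val = (xs[i - 1] + xs[i]) / 2, val
    return best_thr, sign * best_val
```

A between-node measure is passed with `maximize=True` and a within-node measure with `maximize=False`. A full tree would apply `best_split` recursively over all predictors, and an ensemble would repeat that on bootstrap samples and average the predicted outcome vectors; both steps are omitted here.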
Pages: 77-109
Page count: 33