Regression Trees and Ensemble for Multivariate Outcomes

Cited by: 1
Authors
Reynolds, Evan L. [1 ]
Callaghan, Brian C. [1 ]
Gaies, Michael [2 ]
Banerjee, Mousumi [3 ]
Affiliations
[1] Univ Michigan, Dept Neurol, Ann Arbor, MI 48109 USA
[2] Univ Cincinnati, Dept Pediat, Ann Arbor, MI USA
[3] Univ Michigan, Dept Biostat, Ann Arbor, MI USA
Funding
National Institutes of Health (USA)
Keywords
Multivariate outcomes; regression trees; Mahalanobis distance; clinical interpretability; machine learning; NEUROPATHY; PREVALENCE; OBESITY;
DOI
10.1007/s13571-023-00301-z
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Tree-based methods have become one of the most flexible, intuitive, and powerful analytic tools for exploring complex data structures. The best documented, and arguably most popular, uses of tree-based methods are in biomedical research, where multivariate outcomes occur commonly (e.g., diastolic and systolic blood pressure, and nerve conduction measures in studies of neuropathy). Existing tree-based methods for multivariate outcomes do not appropriately take into account the correlation that exists in such data. In this paper, we develop goodness-of-split measures for building multivariate regression trees for continuous multivariate outcomes. We propose two general approaches: maximizing within-node homogeneity and maximizing between-node separation. Within-node homogeneity is measured using the average Mahalanobis distance and the determinant of the variance-covariance matrix. Between-node separation is measured using the Mahalanobis distance, Euclidean distance, and standardized Euclidean distance. To enhance prediction accuracy, we extend the single multivariate regression tree to an ensemble of multivariate trees. Extensive simulations are presented to examine the properties of our goodness-of-split measures. Finally, the proposed methods are illustrated using two clinical datasets, one on neuropathy and one on pediatric cardiac surgery.
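As a rough illustration of the split measures named in the abstract, the sketch below scores candidate splits on a single predictor using a size-weighted average Mahalanobis distance within the two child nodes, and also computes a between-node Mahalanobis distance and a covariance determinant. This is not the authors' implementation: the function names, the use of the parent-node covariance as the common metric, the size weighting, and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def avg_mahalanobis(Y, cov_inv):
    """Mean squared Mahalanobis distance of the rows of Y to their own centroid."""
    d = Y - Y.mean(axis=0)
    return float(np.mean(np.einsum("ij,jk,ik->i", d, cov_inv, d)))

def det_covariance(Y):
    """Determinant of the sample variance-covariance matrix of the outcomes in a node."""
    return float(np.linalg.det(np.cov(Y, rowvar=False)))

def between_node_mahalanobis(Y_left, Y_right, cov_inv):
    """Squared Mahalanobis distance between the two child-node mean vectors."""
    diff = Y_left.mean(axis=0) - Y_right.mean(axis=0)
    return float(diff @ cov_inv @ diff)

def best_split(x, Y, min_node=5):
    """Scan thresholds on one predictor x and return (threshold, score), where
    score is the size-weighted within-node average Mahalanobis distance."""
    # Assumption: the parent-node covariance is used as the common metric.
    cov_inv = np.linalg.inv(np.cov(Y, rowvar=False))
    order = np.argsort(x)
    xs, Ys = x[order], Y[order]
    n = len(xs)
    best = (None, np.inf)
    for i in range(min_node, n - min_node + 1):
        if xs[i - 1] == xs[i]:
            continue  # no valid threshold between tied predictor values
        left, right = Ys[:i], Ys[i:]
        score = (len(left) * avg_mahalanobis(left, cov_inv)
                 + len(right) * avg_mahalanobis(right, cov_inv)) / n
        if score < best[1]:
            best = ((xs[i - 1] + xs[i]) / 2, score)
    return best

# Simulated example: two correlated outcomes whose means shift at x = 0.5.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=200)
noise = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=200)
Y = noise + 4.0 * (x > 0.5)[:, None]

threshold, score = best_split(x, Y)
parent_cov_inv = np.linalg.inv(np.cov(Y, rowvar=False))
separation = between_node_mahalanobis(Y[x <= threshold], Y[x > threshold], parent_cov_inv)
```

In a full tree, a scan of this kind would be repeated over every predictor and applied recursively to each child node; an ensemble would then repeat the tree-building on resampled data.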
Pages: 77-109
Page count: 33