Regression Trees and Ensemble for Multivariate Outcomes

Cited by: 1
Authors
Reynolds, Evan L. [1 ]
Callaghan, Brian C. [1 ]
Gaies, Michael [2 ]
Banerjee, Mousumi [3 ]
Affiliations
[1] Univ Michigan, Dept Neurol, Ann Arbor, MI 48109 USA
[2] Univ Cincinnati, Dept Pediat, Ann Arbor, MI USA
[3] Univ Michigan, Dept Biostat, Ann Arbor, MI USA
Funding
National Institutes of Health (NIH), USA;
Keywords
Multivariate outcomes; regression trees; Mahalanobis distance; clinical interpretability; machine learning; NEUROPATHY; PREVALENCE; OBESITY;
DOI
10.1007/s13571-023-00301-z
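Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract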
Tree-based methods have become one of the most flexible, intuitive, and powerful analytic tools for exploring complex data structures. The best-documented, and arguably most popular, uses of tree-based methods are in biomedical research, where multivariate outcomes occur commonly (e.g., diastolic and systolic blood pressure, or nerve conduction measures in studies of neuropathy). Existing tree-based methods for multivariate outcomes do not appropriately take into account the correlation that exists in such data. In this paper, we develop goodness-of-split measures for building multivariate regression trees for continuous multivariate outcomes. We propose two general approaches: maximizing within-node homogeneity (that is, minimizing within-node spread) and maximizing between-node separation. Within-node homogeneity is measured using the average Mahalanobis distance and the determinant of the variance-covariance matrix. Between-node separation is measured using the Mahalanobis distance, the Euclidean distance, and the standardized Euclidean distance. To enhance prediction accuracy, we extend the single multivariate regression tree to an ensemble of multivariate trees. Extensive simulations are presented to examine the properties of our goodness-of-split measures. Finally, the proposed methods are illustrated using two clinical datasets, one on neuropathy and one on pediatric cardiac surgery.
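The between-node separation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes bivariate outcomes, a single numeric covariate, an unweighted difference of child-node mean vectors, and the parent node's sample covariance as the Mahalanobis scaling matrix. All function names (`mean_vector`, `covariance_2x2`, `mahalanobis_sq`, `best_split`) are hypothetical.

```python
def mean_vector(rows):
    """Column means of a list of (y1, y2) outcome pairs."""
    return [sum(col) / len(col) for col in zip(*rows)]

def covariance_2x2(rows):
    """Sample variance-covariance matrix (n - 1 denominator) of bivariate outcomes."""
    n = len(rows)
    m1, m2 = mean_vector(rows)
    s11 = sum((a - m1) ** 2 for a, _ in rows) / (n - 1)
    s22 = sum((b - m2) ** 2 for _, b in rows) / (n - 1)
    s12 = sum((a - m1) * (b - m2) for a, b in rows) / (n - 1)
    return [[s11, s12], [s12, s22]]

def mahalanobis_sq(u, v, s):
    """Squared Mahalanobis distance (u - v)' S^{-1} (u - v) for a symmetric 2x2 S."""
    d1, d2 = u[0] - v[0], u[1] - v[1]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    # Inverse of [[a, b], [b, c]] is (1 / det) * [[c, -b], [-b, a]].
    return (s[1][1] * d1 * d1 - 2 * s[0][1] * d1 * d2 + s[0][0] * d2 * d2) / det

def best_split(x, y):
    """Scan thresholds on covariate x; return (threshold, separation).

    Assumption: separation is the squared Mahalanobis distance between the
    child-node mean vectors, scaled by the parent node's covariance matrix.
    """
    parent_cov = covariance_2x2(y)
    best_t, best_sep = None, float("-inf")
    for t in sorted(set(x))[:-1]:  # the largest value would leave the right node empty
        left = [yi for xi, yi in zip(x, y) if xi <= t]
        right = [yi for xi, yi in zip(x, y) if xi > t]
        sep = mahalanobis_sq(mean_vector(left), mean_vector(right), parent_cov)
        if sep > best_sep:
            best_t, best_sep = t, sep
    return best_t, best_sep

# Toy data: two correlated outcomes that shift together when x exceeds 3.
x = [1, 2, 3, 4, 5, 6]
y = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (5.0, 6.1), (5.2, 5.8), (4.9, 6.0)]
threshold, separation = best_split(x, y)
print(threshold, round(separation, 2))
```

Because the distance is scaled by the covariance matrix, the correlation between the two outcomes enters the split criterion directly, which a plain Euclidean distance between node means would ignore.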
Pages: 77-109
Page count: 33
Related papers
50 in total
  • [21] The use of CART and multivariate regression trees for supervised and unsupervised feature selection
    Questier, F
    Put, R
    Coomans, D
    Walczak, B
    Heyden, YV
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2005, 76 (01) : 45 - 54
  • [22] An Evolutionary Algorithm for Global Induction of Regression Trees with Multivariate Linear Models
    Czajkowski, Marcin
    Kretowski, Marek
    FOUNDATIONS OF INTELLIGENT SYSTEMS, 2011, 6804 : 230 - 239
  • [23] Efficiency comparisons in multivariate multiple regression with missing outcomes
    Rotnitzky, A
    Holcroft, CA
    Robins, JM
    JOURNAL OF MULTIVARIATE ANALYSIS, 1997, 61 (01) : 102 - 128
  • [24] Novel Ensemble of Multivariate Adaptive Regression Spline with Spatial Logistic Regression and Boosted Regression Tree for Gully Erosion Susceptibility
    Roy, Paramita
    Chandra Pal, Subodh
    Arabameri, Alireza
    Chakrabortty, Rabin
    Pradhan, Biswajeet
    Chowdhuri, Indrajit
    Lee, Saro
    Tien Bui, Dieu
    REMOTE SENSING, 2020, 12 (20) : 1 - 35
  • [25] A comparison of regression trees, logistic regression, generalized additive models, and multivariate adaptive regression splines for predicting AMI mortality
    Austin, Peter C.
    STATISTICS IN MEDICINE, 2007, 26 (15) : 2937 - 2957
  • [26] Subspace Gaussian process regression model for ensemble nonlinear multivariate spectroscopic calibration
    Zheng, Junhua
    Gong, Yingkai
    Liu, Wei
    Zhou, Le
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2022, 230
  • [27] Single-Camera Automatic Landmarking for People Recognition with an Ensemble of Regression Trees
    Trejo, Karla
    Angulo, Cecilio
    COMPUTACION Y SISTEMAS, 2016, 20 (01) : 19 - 28
  • [28] Load Forecasting for a Campus University Using Ensemble Methods Based on Regression Trees
    Ruiz-Abellon, Maria del Carmen
    Gabaldon, Antonio
    Guillamon, Antonio
    ENERGIES, 2018, 11 (08)
  • [29] Facial Landmark Localization Using an Ensemble of Regression Trees on PC Game Controlling
    Kandimalla, Naren Sai Krishna
    Mishra, Soumya Ranjan
    Sanyal, Goutam
    Sarkar, Anirban
    INTELLIGENT COMPUTING AND COMMUNICATION, ICICC 2019, 2020, 1034 : 789 - 797
  • [30] MULTI-OBJECTIVE OPTIMIZATION OF ENSEMBLE OF REGRESSION TREES USING GENETIC ALGORITHMS
    Wan, Qian
    Pal, Ranadip
    2014 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2014 : 1356 - 1359