Clustering and Prediction With Variable Dimension Covariates

Cited: 6

Authors
Page, Garritt L. [1 ,2 ]
Quintana, Fernando A. [3 ,4 ]
Muller, Peter [5 ]
Affiliations
[1] Brigham Young Univ, Dept Stat, Provo, UT 84602 USA
[2] BCAM Basque Ctr Appl Math, Bilbao, Spain
[3] Pontificia Univ Catolica Chile, Dept Estadist, Santiago, Chile
[4] Millennium Nucleus Ctr Discovery Struct Complex D, Santiago, Chile
[5] Univ Texas Austin, Dept Math, Austin, TX 78712 USA
Funding
National Science Foundation (USA)
Keywords
Bayesian nonparametrics; Dependent random partition models; Indicator-missing; Pattern missing; MISSING-INDICATOR METHOD; MULTIPLE IMPUTATION; REGRESSION; VALUES; BART;
DOI
10.1080/10618600.2021.1999824
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
In many applied fields, incomplete covariate vectors are commonly encountered. It is well known that this can be problematic when making inference on model parameters, but its impact on prediction performance is less understood. We develop a method based on covariate dependent random partition models that seamlessly handles missing covariates while completely avoiding any type of imputation. The method we develop allows in-sample as well as out-of-sample predictions, even if the missing pattern in the new subjects' incomplete covariate vector was not seen in the training data. Any data type, including categorical or continuous covariates, is permitted. In simulation studies, the proposed method compares favorably. We illustrate the method in two application examples. Supplementary materials for this article are available online.
Pages: 466-476 (11 pages)