Clustering and Prediction With Variable Dimension Covariates

Cited by: 6
Authors
Page, Garritt L. [1 ,2 ]
Quintana, Fernando A. [3 ,4 ]
Muller, Peter [5 ]
Affiliations
[1] Brigham Young Univ, Dept Stat, Provo, UT 84602 USA
[2] BCAM Basque Ctr Appl Math, Bilbao, Spain
[3] Pontificia Univ Catolica Chile, Dept Estadist, Santiago, Chile
[4] Millennium Nucleus Ctr Discovery Struct Complex D, Santiago, Chile
[5] Univ Texas Austin, Dept Math, Austin, TX 78712 USA
Funding
National Science Foundation (USA);
Keywords
Bayesian nonparametrics; Dependent random partition models; Indicator-missing; Pattern missing; MISSING-INDICATOR METHOD; MULTIPLE IMPUTATION; REGRESSION; VALUES; BART;
DOI
10.1080/10618600.2021.1999824
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
In many applied fields incomplete covariate vectors are commonly encountered. It is well known that this can be problematic when making inference on model parameters, but its impact on prediction performance is less understood. We develop a method based on covariate dependent random partition models that seamlessly handles missing covariates while completely avoiding any type of imputation. The method we develop allows in-sample as well as out-of-sample predictions, even if the missing pattern in the new subjects' incomplete covariate vector was not seen in the training data. Any data type, including categorical or continuous covariates, is permitted. In simulation studies, the proposed method compares favorably. We illustrate the method in two application examples. Supplementary materials for this article are available online.
Pages: 466 - 476
Page count: 11
Related papers
50 in total
  • [1] Regression with Variable Dimension Covariates
    Mueller, Peter
    Quintana, Fernando Andres
    Page, Garritt L.
    SANKHYA-SERIES A-MATHEMATICAL STATISTICS AND PROBABILITY, 2024, 86 (SUPPL 1) : 185 - 198
  • [2] A Projection Approach to Local Regression with Variable-Dimension Covariates
    Heiner, Matthew J.
    Page, Garritt L.
    Quintana, Fernando Andres
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2025, 34 (01) : 109 - 122
  • [3] Dimension reduction for covariates in network data
    Zhao, Junlong
    Liu, Xiumin
    Wang, Hansheng
    Leng, Chenlei
    BIOMETRIKA, 2022, 109 (01) : 85 - 102
  • [4] Clustering Phenotype Trajectories with Genotype Covariates
    Greenlaw, Keelin
    Unternaehrer, Eva
    Greenwood, Celia
    Ciampi, Antonio
    GENETIC EPIDEMIOLOGY, 2016, 40 (07) : 639 - 639
  • [5] EXPLANATORY VARIABLE SELECTION WITH BALANCED CLUSTERING IN CUSTOMER CHURN PREDICTION
    Fridrich, Martin
    AD ALTA-JOURNAL OF INTERDISCIPLINARY RESEARCH, 2019, 9 (01) : 56 - 66
  • [6] Asymptotic properties of GEE with diverging dimension of covariates
    Zhu, Chunhua
    Gao, Qibing
    Yao, Yi
    RANDOM MATRICES-THEORY AND APPLICATIONS, 2023, 12 (02)
  • [7] Variable selection in regression with compositional covariates
    Lin, Wei
    Shi, Pixu
    Feng, Rui
    Li, Hongzhe
    BIOMETRIKA, 2014, 101 (04) : 785 - 797
  • [8] VC-PCR: A prediction method based on variable selection and clustering
    Marion, Rebecca
    Lederer, Johannes
    Goevarts, Bernadette
    von Sachs, Rainer
    STATISTICA NEERLANDICA, 2025, 79 (01)
  • [9] On sufficient dimension reduction for proportional censorship model with covariates
    Wen, Xuerong Meggie
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2010, 54 (08) : 1975 - 1982
  • [10] Dimension reduction in estimating equations with covariates missing at random
    Zhang, Ying
    Wang, Lei
    JOURNAL OF NONPARAMETRIC STATISTICS, 2018, 30 (02) : 491 - 504