Nonparametric Copula Models for Multivariate, Mixed, and Missing Data

Authors
Feldman, Joseph [1]
Kowal, Daniel R. [2, 3]
Affiliations
[1] Duke Univ, Dept Stat Sci, Durham, NC 27708 USA
[2] Cornell Univ, Dept Stat & Data Sci, Ithaca, NY 14853 USA
[3] Rice Univ, Dept Stat, Houston, TX 77251 USA
Funding
U.S. National Science Foundation; U.S. National Institutes of Health
Keywords
Bayesian nonparametrics; Bayesian inference; Factor models; Imputation; Mixture models; Bayesian mixture models; Multiple imputation; Inference; Likelihood
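DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract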
Modern data sets commonly feature both substantial missingness and many variables of mixed data types, which present significant challenges for estimation and inference. Complete case analysis, which uses only the observations with no missing variables, is often severely biased, while model-based imputation of missing values is limited by the model's ability to capture complex dependencies among (possibly many) variables of mixed data types. To address these challenges, we develop a novel Bayesian mixture copula for joint and nonparametric modeling of multivariate count, continuous, ordinal, and unordered categorical variables, and deploy this model for inference, prediction, and imputation of missing data. Most notably, we introduce a new and computationally efficient strategy for marginal distribution estimation that eliminates the need to specify any marginal models yet delivers posterior consistency for each marginal distribution and the copula parameters when data are missing at random. Extensive simulation studies demonstrate strong modeling and imputation performance relative to competing methods, especially with mixed data types, complex missingness mechanisms, and nonlinear dependencies. We conclude with a data analysis that highlights how improper treatment of missing data can distort a statistical analysis, and how the proposed approach offers a resolution.
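The abstract's claim that complete case analysis can be severely biased can be illustrated with a small simulation. The sketch below is not the authors' copula model: it assumes a hypothetical two-variable Gaussian setup, a logistic missing-at-random mechanism, and a single linear-regression imputation as a simple stand-in for model-based imputation. The function name `simulate_mar_bias` and all parameter values are illustrative assumptions.

```python
import numpy as np


def simulate_mar_bias(n=20000, rho=0.7, seed=0):
    """Compare a complete-case mean with a regression-imputed mean under MAR.

    Hypothetical setup (not from the paper): x is always observed, y is
    missing with a probability that depends only on x.
    """
    rng = np.random.default_rng(seed)

    # Correlated standard-normal pair; the true mean of y is 0.
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)

    # Missing at random: larger x makes y more likely to be missing.
    p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))
    missing = rng.random(n) < p_missing
    observed = ~missing

    # Complete case analysis: average only the rows where y was observed.
    complete_case_mean = y[observed].mean()

    # Simple model-based imputation: fit y ~ x on complete rows,
    # then fill each missing y with its fitted value.
    slope, intercept = np.polyfit(x[observed], y[observed], 1)
    y_imputed = np.where(missing, intercept + slope * x, y)

    return {
        "true_mean": y.mean(),
        "complete_case_mean": complete_case_mean,
        "imputed_mean": y_imputed.mean(),
    }


if __name__ == "__main__":
    for name, value in simulate_mar_bias().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the complete-case mean is pulled well below the full-data mean because rows with large x (and hence large y) are dropped more often, while the imputed mean stays close to it because the imputation model uses x. A linear imputation model works here only because the simulated dependence is linear; the mixed data types and nonlinear dependencies described in the abstract are exactly where such a simple model would fall short.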
Pages: 1 / 50
Number of pages: 50