Nonparametric Copula Models for Multivariate, Mixed, and Missing Data

Authors
Feldman, Joseph [1]
Kowal, Daniel R. [2, 3]
Affiliations
[1] Duke Univ, Dept Stat Sci, Durham, NC 27708 USA
[2] Cornell Univ, Dept Stat & Data Sci, Ithaca, NY 14853 USA
[3] Rice Univ, Dept Stat, Houston, TX 77251 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
Bayesian nonparametrics; Bayesian inference; Factor models; Imputation; Mixture models; BAYESIAN MIXTURE-MODELS; MULTIPLE IMPUTATION; INFERENCE; LIKELIHOOD;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Modern data sets commonly feature both substantial missingness and many variables of mixed data types, which present significant challenges for estimation and inference. Complete case analysis, which proceeds using only the observations with fully-observed variables, is often severely biased, while model-based imputation of missing values is limited by the ability of the model to capture complex dependencies among (possibly many) variables of mixed data types. To address these challenges, we develop a novel Bayesian mixture copula for joint and nonparametric modeling of multivariate count, continuous, ordinal, and unordered categorical variables, and deploy this model for inference, prediction, and imputation of missing data. Most uniquely, we introduce a new and computationally efficient strategy for marginal distribution estimation that eliminates the need to specify any marginal models yet delivers posterior consistency for each marginal distribution and the copula parameters under missingness-at-random. Extensive simulation studies demonstrate exceptional modeling and imputation capabilities relative to competing methods, especially with mixed data types, complex missingness mechanisms, and nonlinear dependencies. We conclude with a data analysis that highlights how improper treatment of missing data can distort a statistical analysis, and how the proposed approach offers a resolution.
Pages: 1 / 50
Page count: 50