Concurrent generation of multivariate mixed data with variables of dissimilar types

Cited by: 2
Authors
Amatya, Anup [1]
Demirtas, Hakan [2]
Affiliations
[1] New Mexico State Univ, Dept Publ Hlth Sci, 1335 Int Mall,RM 102, Las Cruces, NM 88011 USA
[2] Univ Illinois, Div Epidemiol & Biostat MC923, Chicago, IL USA
Keywords
Generalized Poisson; multivariate ordinal; discretization; CYSTITIS DATA-BASE; ORDINAL DATA; COUNT DATA; SIMULATION; DISTRIBUTIONS; MATRIX
DOI
10.1080/00949655.2016.1177530
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily count, binary/ordinal and continuous attributes. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The use of the generalized Poisson distribution provides a flexible mechanism that accommodates the under- and over-dispersed count variables generally encountered in practice. A step-by-step algorithm is provided and its performance is evaluated using simulated and real-data scenarios.
Pages: 3595-3607
Page count: 13
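The abstract describes generating count, binary, ordinal and continuous variables jointly from a correlated source. The sketch below illustrates that general idea only and is not the authors' step-by-step algorithm: it draws correlated standard normals, then maps each coordinate to its target marginal by quantile transformation or discretization. All function and parameter names (`gpd_pmf`, `gpd_quantile`, `generate_mixed`, `latent_corr`, and so on) are hypothetical. Two simplifying assumptions are made: the generalized Poisson rate parameter is restricted to `0 <= lam < 1` (equi- or over-dispersion), and the supplied matrix is treated as the correlation of the latent normals, so the correlations of the generated variables will generally differ from it. Matching a pre-specified final correlation matrix requires an additional adjustment step that this sketch omits.

```python
import math
import random
from statistics import NormalDist


def gpd_pmf(x, theta, lam):
    """Generalized Poisson pmf; this sketch assumes theta > 0 and 0 <= lam < 1."""
    log_p = (math.log(theta) + (x - 1) * math.log(theta + lam * x)
             - theta - lam * x - math.lgamma(x + 1))
    return math.exp(log_p)


def gpd_quantile(u, theta, lam, max_x=1000):
    """Smallest x whose cumulative probability reaches u (inverse-CDF lookup)."""
    cdf = 0.0
    for x in range(max_x):
        cdf += gpd_pmf(x, theta, lam)
        if u <= cdf:
            return x
    return max_x  # safeguard for u numerically equal to 1


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def ordinalize(z, cum_probs):
    """Discretize a standard normal value into categories 0..K-1.

    cum_probs holds the K-1 cumulative category probabilities.
    """
    u = NormalDist().cdf(z)
    return sum(u > c for c in cum_probs)


def generate_mixed(n, latent_corr, theta, lam, p_binary, ordinal_cum_probs,
                   mu, sigma, seed=0):
    """Return n rows of (count, binary, ordinal, continuous).

    latent_corr is the 4x4 correlation matrix of the LATENT normals
    (an assumption of this sketch, not the final data correlation).
    """
    rng = random.Random(seed)
    low = cholesky(latent_corr)
    std = NormalDist()
    rows = []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(4)]
        z = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(4)]
        count = gpd_quantile(std.cdf(z[0]), theta, lam)   # generalized Poisson
        binary = int(std.cdf(z[1]) > 1.0 - p_binary)      # P(binary = 1) = p_binary
        ordinal = ordinalize(z[2], ordinal_cum_probs)     # ordered categories
        continuous = mu + sigma * z[3]                    # normal marginal
        rows.append((count, binary, ordinal, continuous))
    return rows


latent_corr = [
    [1.0, 0.4, 0.3, 0.2],
    [0.4, 1.0, 0.3, 0.2],
    [0.3, 0.3, 1.0, 0.2],
    [0.2, 0.2, 0.2, 1.0],
]
rows = generate_mixed(2000, latent_corr, theta=2.0, lam=0.3, p_binary=0.4,
                      ordinal_cum_probs=[0.3, 0.7], mu=10.0, sigma=2.0)
```

With `lam > 0` the count marginal has variance `theta / (1 - lam) ** 3`, which exceeds its mean `theta / (1 - lam)`, so the example produces an over-dispersed count variable. Handling under-dispersion (negative `lam`) would need a truncated support, which is left out here.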