Concurrent generation of multivariate mixed data with variables of dissimilar types

Cited by: 2
Authors
Amatya, Anup [1]
Demirtas, Hakan [2]
Affiliations
[1] New Mexico State Univ, Dept Publ Hlth Sci, 1335 Int Mall,RM 102, Las Cruces, NM 88011 USA
[2] Univ Illinois, Div Epidemiol & Biostat MC923, Chicago, IL USA
Keywords
Generalized Poisson; multivariate ordinal; discretization; CYSTITIS DATA-BASE; ORDINAL DATA; COUNT DATA; SIMULATION; DISTRIBUTIONS; MATRIX
DOI
10.1080/00949655.2016.1177530
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily count, binary/ordinal, and continuous. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal, and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The generalized Poisson distribution provides a flexible mechanism that accommodates the under- and over-dispersed count variables commonly encountered in practice. A step-by-step algorithm is provided, and its performance is evaluated using simulated and real-data scenarios.
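The general idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the correlation matrix applies to latent standard normal variables and omits any adjustment needed for the generated variables themselves to match a target correlation. All function and parameter names (`generate_mixed`, `cum_probs`, `theta`, `lam`) are hypothetical, and the generalized Poisson rate `lam` is restricted to 0 <= lam < 1 (equi- or over-dispersion) to avoid the truncation required for negative values.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def gen_poisson_pmf(x, theta, lam):
    """Generalized Poisson pmf, assuming theta > 0 and 0 <= lam < 1."""
    log_p = (math.log(theta) + (x - 1) * math.log(theta + x * lam)
             - theta - x * lam - math.lgamma(x + 1))
    return math.exp(log_p)


def gen_poisson_quantile(u, theta, lam, max_count=10_000):
    """Smallest x whose cumulative probability reaches u (inverse cdf)."""
    u = min(u, 1.0 - 1e-12)  # guard against u == 1.0 from rounding
    cdf = 0.0
    for x in range(max_count + 1):
        cdf += gen_poisson_pmf(x, theta, lam)
        if u <= cdf:
            return x
    return max_count


def cholesky(a):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def ordinal_from_normal(z, cum_probs):
    """Discretize z: category = number of normal cut points below z."""
    cuts = [STD_NORMAL.inv_cdf(p) for p in cum_probs]
    return sum(z > c for c in cuts)


def generate_mixed(n, corr, cum_probs, theta, lam, seed=0):
    """Return n rows of (continuous, ordinal, count) from latent normals.

    corr is a 3x3 correlation matrix for the LATENT normals (assumption).
    """
    rng = random.Random(seed)
    low = cholesky(corr)
    rows = []
    for _ in range(n):
        e = [rng.gauss(0.0, 1.0) for _ in range(3)]
        z = [sum(low[i][k] * e[k] for k in range(i + 1)) for i in range(3)]
        continuous = z[0]
        ordinal = ordinal_from_normal(z[1], cum_probs)
        count = gen_poisson_quantile(STD_NORMAL.cdf(z[2]), theta, lam)
        rows.append((continuous, ordinal, count))
    return rows


corr = [[1.0, 0.5, 0.3], [0.5, 1.0, 0.4], [0.3, 0.4, 1.0]]
sample = generate_mixed(5, corr, cum_probs=[0.3, 0.7], theta=2.0, lam=0.2)
```

With `lam = 0` the count margin reduces to an ordinary Poisson; larger `lam` inflates the variance relative to the mean, which is `theta / (1 - lam)`. A binary variable is the special case of a single cut point in `cum_probs`.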
Pages: 3595 - 3607
Number of pages: 13
Related papers
50 in total
  • [41] A forecasting method with efficient selection of variables in multivariate data sets
    Sagar P.
    Gupta P.
    Kashyap I.
    International Journal of Information Technology, 2021, 13 (3) : 1039 - 1046
  • [42] Formal specification of CSCW applications with concurrent abstract data types
    Frey, M
    Pucko, M
    JOURNAL OF SYSTEMS ARCHITECTURE, 1998, 44 (05) : 343 - 357
  • [43] CATAI: Concurrent algorithms and data types animation over the Internet
    Cattaneo, G
    Italiano, GF
    Ferraro-Petrillo, U
    JOURNAL OF VISUAL LANGUAGES AND COMPUTING, 2002, 13 (04) : 391 - 419
  • [44] Associated kernel discriminant analysis for multivariate mixed data
    Some, Sobom M.
    Kokonendji, Celestin C.
    Ibrahim, Mona
    ELECTRONIC JOURNAL OF APPLIED STATISTICAL ANALYSIS, 2016, 9 (02) : 385 - 399
  • [45] GViSOM for multivariate mixed data projection and structure visualization
    Hsu, Chung-Chian
    Wang, Kuo-Min
    Wang, Sheng-Hsuan
    2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 3300 - +
  • [46] Nonparametric Copula Models for Multivariate, Mixed, and Missing Data
    Feldman, Joseph
    Kowal, Daniel R.
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 50
  • [47] Causal Inference on Multivariate and Mixed-Type Data
    Marx, Alexander
    Vreeken, Jilles
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2018, PT II, 2019, 11052 : 655 - 671
  • [48] Segmentation of Multivariate mixed data via lossy data coding and compression
    Ma, Yi
    Derksen, Harm
    Hong, Wei
    Wright, John
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2007, 29 (09) : 1546 - 1562
  • [49] Permutation testing for goodness-of-fit and stochastic ordering with multivariate mixed variables
    Arboretti, Rosa
    Ceccato, Riccardo
    Salmaso, Luigi
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2021, 91 (05) : 876 - 896
  • [50] Analysis of mixed data types: the unresolved problem
    Richards, JA
    IMAGE AND SIGNAL PROCESSING FOR REMOTE SENSING VII, 2002, 4541 : 122 - 133