Concurrent generation of multivariate mixed data with variables of dissimilar types

Cited by: 2
Authors
Amatya, Anup [1 ]
Demirtas, Hakan [2 ]
Affiliations
[1] New Mexico State Univ, Dept Publ Hlth Sci, 1335 Int Mall,RM 102, Las Cruces, NM 88011 USA
[2] Univ Illinois, Div Epidemiol & Biostat MC923, Chicago, IL USA
Keywords
Generalized Poisson; multivariate ordinal; discretization; CYSTITIS DATA-BASE; ORDINAL DATA; COUNT DATA; SIMULATION; DISTRIBUTIONS; MATRIX;
DOI
10.1080/00949655.2016.1177530
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Data sets originating from a wide range of research studies are composed of multiple variables that are correlated and of dissimilar types, primarily count, binary/ordinal and continuous attributes. The present paper builds on previous work on multivariate data generation and develops a framework for generating multivariate mixed data with a pre-specified correlation matrix. The generated data consist of components that are marginally count, binary, ordinal and continuous, where the count and continuous variables follow the generalized Poisson and normal distributions, respectively. The generalized Poisson distribution provides a flexible mechanism that allows for the under- and over-dispersed count variables generally encountered in practice. A step-by-step algorithm is provided, and its performance is evaluated using simulated and real-data scenarios.
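To make the idea in the abstract concrete, the following is a minimal Python sketch of one common way to produce mixed data: draw correlated standard normals, then map each component to its target marginal. It is not the paper's step-by-step algorithm, which is not reproduced in this record. The function names (`gen_poisson_pmf`, `normal_cdf`, `generate_mixed`), the truncation point `max_x`, and all parameter values are assumptions chosen for illustration, and the rate and dispersion parameters are written `theta` and `lam` with `0 <= lam < 1` assumed (the over-dispersed case only). The sketch applies the specified correlation matrix directly to the latent normals, so the correlations of the generated variables will generally differ somewhat from the specified values.

```python
import math
import numpy as np

def gen_poisson_pmf(theta, lam, max_x=200):
    """Generalized Poisson PMF on 0..max_x (assumes theta > 0, 0 <= lam < 1).

    P(X = x) = theta * (theta + x*lam)**(x-1) * exp(-theta - x*lam) / x!
    """
    x = np.arange(max_x + 1)
    log_fact = np.array([math.lgamma(k + 1) for k in x])
    log_p = (np.log(theta) + (x - 1) * np.log(theta + x * lam)
             - theta - x * lam - log_fact)
    return np.exp(log_p)

def normal_cdf(z):
    """Standard normal CDF, applied elementwise."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def generate_mixed(n, corr, theta, lam, ordinal_probs, mu, sigma, seed=0):
    """Return (count, ordinal, continuous) arrays of length n.

    corr is the 3x3 correlation matrix of the latent normals
    (hypothetical ordering: count, ordinal, continuous).
    """
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(3), corr, size=n)
    u = normal_cdf(z[:, :2])  # uniforms for the two discrete components

    # Count: inverse-CDF transform against the generalized Poisson CDF.
    count_cdf = np.cumsum(gen_poisson_pmf(theta, lam))
    count = np.minimum(np.searchsorted(count_cdf, u[:, 0]), len(count_cdf) - 1)

    # Ordinal (binary is the two-category case): discretize at cumulative cut points.
    cuts = np.cumsum(ordinal_probs)[:-1]
    ordinal = np.searchsorted(cuts, u[:, 1])

    # Continuous: shift and scale the latent normal.
    continuous = mu + sigma * z[:, 2]
    return count, ordinal, continuous

corr = np.array([[1.0, 0.4, 0.5],
                 [0.4, 1.0, 0.3],
                 [0.5, 0.3, 1.0]])
count, ordinal, continuous = generate_mixed(
    20000, corr, theta=2.0, lam=0.3,
    ordinal_probs=[0.2, 0.5, 0.3], mu=10.0, sigma=2.0)
print(count.mean(), count.var())  # compare with theta/(1-lam) and theta/(1-lam)**3
print(np.corrcoef(count, continuous)[0, 1])
```

With `lam > 0` the count variance exceeds the mean, which is the over-dispersed case mentioned in the abstract. Under-dispersion (`lam < 0`) requires truncating the support and is left out of this sketch.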
Pages: 3595-3607
Page count: 13