A genetic algorithm for simulating correlated binary data from biomedical research

Cited by: 8
Authors
Kruppa, Jochen [1]
Lepenies, Bernd [2, 3]
Jung, Klaus [1, 3]
Affiliations
[1] Univ Vet Med Hannover, Inst Anim Breeding & Genet, Bunteweg 17p, D-30559 Hannover, Germany
[2] Univ Vet Med Hannover, Immunol Unit, Hannover, Germany
[3] Univ Vet Med Hannover, Res Ctr Emerging Infect & Zoonoses RIZ, Hannover, Germany
Keywords
Correlated binary data; Genetic algorithm; High-dimensional data; Random number generation; Computer simulation; DISTRIBUTIONS; ASSOCIATION; VARIABLES; MODELS;
DOI
10.1016/j.compbiomed.2017.10.023
Chinese Library Classification number
Q [Biological Sciences];
Subject classification code
07 ; 0710 ; 09 ;
Abstract
Correlated binary data arise in a wide variety of biomedical research. To evaluate methods for their analysis, computer simulations of such data are often required. Existing methods often cannot cover the full range of possible correlations between the variables, or are not available as implemented software. We propose a genetic algorithm that approaches the desired correlation structure under a given marginal distribution. The procedure generates a large representative matrix from which the probabilities of individual observations can be derived or from which samples can be drawn directly. Our genetic algorithm is evaluated under different specified marginal frequencies and correlation structures, and is compared against two existing approaches. The evaluation checks the speed and precision of the approach as well as its suitability for generating high-dimensional data. Using an example of high-throughput glycan array data, we demonstrate the usability of our approach for simulating the power of global test procedures. Implementations of our own method and of two other methods were added to the R package 'RepeatedHighDim'. The presented algorithm is not restricted to certain correlation structures. In contrast to existing methods, it is also evaluated for high-dimensional data.
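The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not reproduce the 'RepeatedHighDim' code: the function names (`correlation`, `fitness`, `mutate`, `evolve`), the swap mutation, the maximum-gap fitness, and all parameter defaults are assumptions chosen for illustration. The sketch builds a representative 0/1 matrix whose column means match the marginal probabilities exactly, then repeatedly mutates it and keeps the candidate whose empirical correlations lie closest to the target.

```python
import random


def correlation(x, y):
    """Pearson correlation of two equal-length 0/1 lists (non-constant columns)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    return cov / (mx * (1 - mx) * my * (1 - my)) ** 0.5


def fitness(cols, target):
    """Largest absolute gap between empirical and target correlations (assumed criterion)."""
    m = len(cols)
    return max(
        abs(correlation(cols[i], cols[j]) - target[i][j])
        for i in range(m)
        for j in range(i + 1, m)
    )


def mutate(cols, rng):
    """Swap two entries within one random column; the column mean is unchanged."""
    child = [c[:] for c in cols]
    col = rng.choice(child)
    i, j = rng.randrange(len(col)), rng.randrange(len(col))
    col[i], col[j] = col[j], col[i]
    return child


def evolve(p, target, n_rows=200, n_children=20, generations=300, seed=1):
    """Return a representative matrix (as a list of columns) and its fitness."""
    rng = random.Random(seed)
    # Start from shuffled columns that match the marginal frequencies exactly.
    parent = []
    for pj in p:
        ones = round(n_rows * pj)
        col = [1] * ones + [0] * (n_rows - ones)
        rng.shuffle(col)
        parent.append(col)
    best = fitness(parent, target)
    for _ in range(generations):
        # Generate mutated children and keep the best one if it is no worse.
        children = [mutate(parent, rng) for _ in range(n_children)]
        candidate = min(children, key=lambda c: fitness(c, target))
        score = fitness(candidate, target)
        if score <= best:
            parent, best = candidate, score
    return parent, best


if __name__ == "__main__":
    p = [0.3, 0.5, 0.6]
    target = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.4], [0.2, 0.4, 1.0]]
    matrix, gap = evolve(p, target)
    print("largest remaining correlation gap:", round(gap, 3))
```

Under these assumptions, simulated observations could then be drawn by sampling rows of the returned matrix with replacement. Because the mutation only permutes entries within a column, the marginal frequencies are preserved throughout the search and only the correlation structure changes.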
Pages: 1 / 8
Page count: 8
Related papers
50 in total
  • [31] Research on Classification of Data Mining Based Niche Genetic Algorithm
    Zhang, Beibei
    Zhu, Li
    Li, Yanli
    PROGRESS IN INTELLIGENCE COMPUTATION AND APPLICATIONS, 2008, : 197 - 199
  • [32] Test point selection method research based on genetic algorithm and binary particle swarm optimization algorithm
    Naval Aeronautical and Astronautical University, Yantai
    264001, China
    Lect. Notes Electr. Eng., (577-585):
  • [33] A comparison of methods for simulating correlated binary variables with specified marginal means and correlations
    Preisser, John S.
    Qaqish, Bahjat F.
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2014, 84 (11) : 2441 - 2452
  • [34] Simulating longer vectors of correlated binary random variables via multinomial sampling
    Shults, Justine
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2017, 114 : 1 - 11
  • [35] Comparison of correlated proportions based on paired binary data from clustered samples
    Jin, Hua
    Lu, Ying
    JOURNAL OF STATISTICAL PLANNING AND INFERENCE, 2009, 139 (12) : 4206 - 4212
  • [36] A multilevel model for spatially correlated binary data in the presence of misclassification: an application in oral health research
    Mutsvari, Timothy
    Bandyopadhyay, Dipankar
    Declerck, Dominique
    Lesaffre, Emmanuel
    STATISTICS IN MEDICINE, 2013, 32 (30) : 5241 - 5259
  • [37] Biomedical Image Registration Using Genetic Algorithm
    Panda, Suraj
    Sarangi, Shubhendu Kumar
    Sarangi, Archana
    INTELLIGENT COMPUTING, COMMUNICATION AND DEVICES, 2015, 309 : 289 - 296
  • [38] Binary optics design with genetic algorithm
    Ji, Y
    Zhang, JJ
    Wang, JC
    INTERNATIONAL CONFERENCE ON HOLOGRAPHY AND OPTICAL INFORMATION PROCESSING (ICHOIP '96), 1996, 2866 : 116 - 119
  • [39] Generating spatially correlated fields with a genetic algorithm
    Pachepsky, Y
    Timlin, D
    COMPUTERS & GEOSCIENCES, 1998, 24 (08) : 765 - 769
  • [40] INBIOMED: a platform for the integration and sharing of genetic, clinical and epidemiological data oriented to biomedical research
    López-Alonso, V
    Sanchez, JP
    Liebana, L
    Hermosilla, I
    Martín-Sánchez, F
    BIBE 2004: FOURTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2004, : 222 - 226