A genetic algorithm for simulating correlated binary data from biomedical research

Cited by: 8
Authors
Kruppa, Jochen [1]
Lepenies, Bernd [2, 3]
Jung, Klaus [1, 3]
Affiliations
[1] Univ Vet Med Hannover, Inst Anim Breeding & Genet, Bunteweg 17p, D-30559 Hannover, Germany
[2] Univ Vet Med Hannover, Immunol Unit, Hannover, Germany
[3] Univ Vet Med Hannover, Res Ctr Emerging Infect & Zoonoses RIZ, Hannover, Germany
Keywords
Correlated binary data; Genetic algorithm; High-dimensional data; Random number generation; Computer simulation; DISTRIBUTIONS; ASSOCIATION; VARIABLES; MODELS;
DOI
10.1016/j.compbiomed.2017.10.023
Chinese Library Classification number
Q [Biological sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Correlated binary data arise in a wide variety of biomedical research. To evaluate methods for their analysis, computer simulations of such data are often required. Existing methods often cannot cover the full range of possible correlations between the variables, or are not available as implemented software. We propose a genetic algorithm that approaches the desired correlation structure under a given marginal distribution. The procedure generates a large representative matrix from which the probabilities of individual observations can be derived or from which samples can be drawn directly. Our genetic algorithm is evaluated under different specified marginal frequencies and correlation structures, and is compared against two existing approaches. The evaluation checks the speed and precision of the approach as well as its suitability for generating high-dimensional data. In an example of high-throughput glycan array data, we demonstrate the usability of our approach for simulating the power of global test procedures. Implementations of our own method and of two other methods were added to the R package 'RepeatedHighDim'. The presented algorithm is not restricted to certain correlation structures. In contrast to existing methods, it is also evaluated for high-dimensional data.
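The idea summarized in the abstract (a representative binary matrix with fixed marginals that is moved step by step toward a target correlation structure) can be illustrated with a rough Python sketch. This is not the authors' implementation and not the API of 'RepeatedHighDim': the function names `correlation`, `distance` and `evolve`, the swap mutation, the maximum-deviation fitness, and all parameter values are assumptions chosen for illustration, and the loop is a simplified mutation-and-selection scheme rather than a full genetic algorithm with a population.

```python
import random
from math import sqrt


def correlation(x, y):
    """Pearson correlation of two equal-length 0/1 lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    return cov / sqrt(mx * (1 - mx) * my * (1 - my))


def distance(cols, target):
    """Largest absolute deviation between empirical and target correlations."""
    m = len(cols)
    return max(
        abs(correlation(cols[i], cols[j]) - target[i][j])
        for i in range(m)
        for j in range(i + 1, m)
    )


def evolve(margins, target, n_rows=200, generations=2000, seed=1):
    """Hypothetical helper: returns (columns, start distance, final distance)."""
    rng = random.Random(seed)
    # Start matrix: each column holds round(p * n_rows) ones in random order.
    # Margins must lie strictly between 0 and 1 so correlations are defined.
    cols = []
    for p in margins:
        ones = round(p * n_rows)
        col = [1] * ones + [0] * (n_rows - ones)
        rng.shuffle(col)
        cols.append(col)
    start = best = distance(cols, target)
    for _ in range(generations):
        # Mutation: swap two entries within one column (margins stay fixed).
        j = rng.randrange(len(cols))
        a, b = rng.sample(range(n_rows), 2)
        cols[j][a], cols[j][b] = cols[j][b], cols[j][a]
        d = distance(cols, target)
        if d <= best:
            best = d  # selection: keep the mutated matrix
        else:
            cols[j][a], cols[j][b] = cols[j][b], cols[j][a]  # revert
    return cols, start, best


if __name__ == "__main__":
    margins = [0.3, 0.5, 0.4]  # assumed marginal probabilities
    target = [[1.0, 0.3, 0.2], [0.3, 1.0, 0.1], [0.2, 0.1, 1.0]]  # assumed
    cols, start, best = evolve(margins, target)
    print(f"max correlation error: {start:.3f} -> {best:.3f}")
```

Because every mutation only swaps entries within a column, the marginal frequencies are preserved exactly while the correlations change; simulated observations could then be drawn by sampling rows of the resulting matrix with replacement.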
Pages: 1-8
Number of pages: 8
Related papers
50 in total
  • [21] SIGNIFICANCE TESTING FOR CORRELATED BINARY OUTCOME DATA
    ROSNER, B
    MILTON, RC
    BIOMETRICS, 1988, 44 (02) : 505 - 512
  • [22] Fusing of Binary Correlated Data with Unknown Statistics
    Ghobadzadeh, Ali
    Adve, Raviraj
    2017 IEEE RADAR CONFERENCE (RADARCONF), 2017, : 963 - 968
  • [23] An exact trend test for correlated binary data
    Corcoran, C
    Ryan, L
    Senchaudhuri, P
    Mehta, C
    Patel, N
    Molenberghs, G
    BIOMETRICS, 2001, 57 (03) : 941 - 948
  • [24] Methods for generating longitudinally correlated binary data
    Farrell, Patrick J.
    Rogers-Stewart, Katrina
    INTERNATIONAL STATISTICAL REVIEW, 2008, 76 (01) : 28 - 38
  • [25] Statistical Analysis for Correlated Binary Ophthalmologic Data
    Rosner, Bernard
    Ying, Gui-Shuang
    Glynn, Robert
    Maguire, Maureen G.
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2017, 58 (08)
  • [26] Bayesian analysis of correlated misclassified binary data
    Paulino, CD
    Silva, G
    Achcar, JA
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2005, 49 (04) : 1120 - 1131
  • [27] LOGISTIC-REGRESSION FOR CORRELATED BINARY DATA
    LECESSIE, S
    VANHOUWELINGEN, JC
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 1994, 43 (01) : 95 - 108
  • [28] Genetic studies and the Law of Biomedical Research
    Perez Segura, Pedro
    MEDICINA CLINICA, 2009, 132 (04) : 154 - 156
  • [29] Chemical and genetic sensors in biomedical research
    Achilefu, S
    Contag, CH
    Savitsky, AP
    Weissleder, R
    JOURNAL OF BIOMEDICAL OPTICS, 2005, 10 (04)
  • [30] Research and analysis of network data mining based on genetic algorithm
    Shi, Lei
    Zhao, Huiran
    Zhang, Kun
    MATERIAL SCIENCE, CIVIL ENGINEERING AND ARCHITECTURE SCIENCE, MECHANICAL ENGINEERING AND MANUFACTURING TECHNOLOGY II, 2014, 651-653 : 2181 - 2184