Sensible Initialization Using Expert Knowledge for Genome-Wide Analysis of Epistasis Using Genetic Programming

Cited by: 6
Authors
Greene, Casey S. [1]
White, Bill C. [1]
Moore, Jason H. [1]
Affiliations
[1] Dartmouth Med Sch, Dept Genet, Lebanon, NH USA
Keywords
ASSOCIATION; SUSCEPTIBILITY; RELIEFF; CANCER;
DOI
10.1109/CEC.2009.4983093
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Biomedical researchers can now measure large numbers of DNA sequence variations across the human genome. Measuring hundreds of thousands of variations is now routine, but single variations that consistently predict an individual's risk of common human disease have proven elusive. Rather than single variants determining the risk of common human diseases, it seems more likely that disease risk is best modeled by interactions between biological components. The evolutionary computing challenge now is to explore interactions in these large datasets effectively and to identify combinations of variations that are robust predictors of common human diseases such as bladder cancer. One promising approach to this problem is genetic programming (GP). A GP approach to this problem uses Darwinian-inspired evolution to evolve programs that find and model attribute interactions that predict an individual's risk of common human diseases. The goal of this study is to develop and evaluate two initializers for this domain. We develop a probabilistic initializer, which uses expert knowledge to select attributes, and an enumerative initializer, which maximizes attribute diversity in the generated population. We compare these initializers to a random initializer, which displays no preference for attributes. We show that the expert-knowledge-aware probabilistic initializer significantly outperforms both the random initializer and the enumerative initializer. We discuss implications of these results for the design of GP strategies that are able to detect and characterize predictors of common human diseases.
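The three initialization strategies named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give algorithmic details, so the function names, the simplification of an individual to the list of attribute indices its program would reference, and the use of positive per-attribute scores (for example from a filter such as ReliefF, which appears in the keywords) as the expert knowledge are all assumptions made for illustration.

```python
import random
from itertools import cycle, islice


def random_init(n_attrs, pop_size, k, rng):
    """Random initializer: every attribute is equally likely (no preference)."""
    return [rng.sample(range(n_attrs), k) for _ in range(pop_size)]


def enumerative_init(n_attrs, pop_size, k, rng):
    """Enumerative initializer (assumed form): deal attributes out round-robin
    from one shuffled ordering, so attribute usage counts across the
    population differ by at most one. Assumes k <= n_attrs."""
    order = list(range(n_attrs))
    rng.shuffle(order)
    stream = cycle(order)
    return [list(islice(stream, k)) for _ in range(pop_size)]


def probabilistic_init(scores, pop_size, k, rng):
    """Probabilistic initializer (assumed form): draw k distinct attributes per
    individual, each with probability proportional to its expert-knowledge
    score. Assumes all scores are strictly positive and k <= len(scores)."""
    population = []
    for _ in range(pop_size):
        candidates = list(range(len(scores)))
        weights = list(scores)
        chosen = []
        for _ in range(k):
            # Weighted draw over the attributes not yet chosen.
            pos = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
            chosen.append(candidates.pop(pos))
            weights.pop(pos)
        population.append(chosen)
    return population
```

Under these assumptions, the probabilistic initializer concentrates the starting population on attributes with high scores, whereas the enumerative initializer spreads coverage evenly over all attributes and the random initializer does neither.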
Pages: 1289-1296
Number of pages: 8
Related papers
50 in total
  • [21] Analysis of genome-wide association study data using the protein knowledge base
    Sara Ballouz
    Jason Y Liu
    Martin Oti
    Bruno Gaeta
    Diane Fatkin
    Melanie Bahlo
    Merridee A Wouters
    BMC Genetics, 12
  • [22] Analysis of genome-wide association study data using the protein knowledge base
    Ballouz, Sara
    Liu, Jason Y.
    Oti, Martin
    Gaeta, Bruno
    Fatkin, Diane
    Bahlo, Melanie
    Wouters, Merridee A.
    BMC GENETICS, 2011, 12
  • [23] Analysis of East Asia Genetic Substructure Using Genome-Wide SNP Arrays
    Tian, Chao
    Kosoy, Roman
    Lee, Annette
    Ransom, Michael
    Belmont, John W.
    Gregersen, Peter K.
    Seldin, Michael F.
    PLOS ONE, 2008, 3 (12):
  • [24] A genome-wide epistasis analysis method based on multiple criteria fusion
    Chen, Min (9918428@qq.com), 2016, Hunan University (43):
  • [25] Clustering by genetic ancestry using genome-wide SNP data
    Nadia Solovieff
    Stephen W Hartley
    Clinton T Baldwin
    Thomas T Perls
    Martin H Steinberg
    Paola Sebastiani
    BMC Genetics, 11
  • [26] Clustering by genetic ancestry using genome-wide SNP data
    Solovieff, Nadia
    Hartley, Stephen W.
    Baldwin, Clinton T.
    Perls, Thomas T.
    Steinberg, Martin H.
    Sebastiani, Paola
    BMC GENETICS, 2010, 11
  • [27] Identifying in vivo pathways using genome-wide genetic networks
    Gray, J. V.
    Krause, S. A.
    BIOCHEMICAL SOCIETY TRANSACTIONS, 2007, 35 : 1538 - 1541
  • [28] Molecular genetic analysis of retinitis pigmentosa in Indonesia using genome-wide homozygosity mapping
    Siemiatkowska, Anna M.
    Arimadyo, Kentar
    Moruz, Luminita M.
    Astuti, Galuh D. N.
    de Castro-Miro, Marta
    Zonneveld, Marijke N.
    Strom, Tim M.
    de Wijs, Ilse J.
    Hoefsloot, Lies H.
    Faradz, Sultana M. H.
    Cremers, Frans P. M.
    den Hollander, Anneke I.
    Collin, Rob W. J.
    MOLECULAR VISION, 2011, 17 (325-26): 3013 - 3024
  • [29] A method for detecting epistasis in genome-wide studies using case-control multi-locus association analysis
    Gayan, Javier
    Gonzalez-Perez, Antonio
    Bermudo, Fernando
    Saez, Maria Eugenia
    Royo, Jose Luis
    Quintas, Antonio
    Galan, Jose Jorge
    Moron, Francisco Jesus
    Ramirez-Lorca, Reposo
    Real, Luis Miguel
    Ruiz, Agustin
    BMC GENOMICS, 2008, 9 (1)
  • [30] A method for detecting epistasis in genome-wide studies using case-control multi-locus association analysis
    Javier Gayán
    Antonio González-Pérez
    Fernando Bermudo
    María Eugenia Sáez
    Jose Luis Royo
    Antonio Quintas
    Jose Jorge Galan
    Francisco Jesús Morón
    Reposo Ramirez-Lorca
    Luis Miguel Real
    Agustín Ruiz
    BMC Genomics, 9