Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses

Cited by: 117
Authors
Singh-Blom, U. Martin [1, 2]
Natarajan, Nagarajan [3]
Tewari, Ambuj [4]
Woods, John O. [1]
Dhillon, Inderjit S. [3]
Marcotte, Edward M. [1, 5]
Affiliations
[1] Univ Texas Austin, Ctr Syst & Synthet Biol, Inst Cellular & Mol Biol, Austin, TX 78712 USA
[2] Karolinska Inst, Dept Med, Stockholm, Sweden
[3] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[4] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
[5] Univ Texas Austin, Dept Chem & Biochem, Austin, TX 78712 USA
Source
PLOS ONE | 2013 / Vol. 8 / Issue 05
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
GENOME; DATABASE; PRIORITIZATION; IDENTIFICATION; INTEGRATION; PHENOTYPE; RESOURCE; BIOLOGY; WALKING; MODELS;
DOI
10.1371/journal.pone.0058977
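Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract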
Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets, we can leverage statistical and machine learning methods to help achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction and is closely related to some recently proposed methods for gene-disease association inference. The second method, called CATAPULT (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine whose features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies on two distinct data sets, namely OMIM phenotypes and drug-target interactions. We show that even though both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas CATAPULT is better suited to correctly identifying gene-trait associations overall.
Acknowledgments: The authors want to thank Jon Laurent and Kris McGary for some of the data used, and Li and Patra for making their code available. Most of Ambuj Tewari's contribution to this work happened while he was a postdoctoral fellow at the University of Texas at Austin.
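The Katz measure named in the abstract scores a pair of nodes by summing the number of walks between them, with walks of length l down-weighted by a factor beta**l. The following is a minimal sketch of a truncated Katz score on a toy graph, not the authors' implementation: the graph, the value of `beta`, the walk-length cutoff `max_len`, and all function names are assumptions made for illustration, since this record does not give the parameters used in the paper.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def truncated_katz(adj, beta, max_len):
    """Return sum over l = 1..max_len of beta**l * adj**l (square adj)."""
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    # Start from the identity matrix, i.e. adj**0.
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for length in range(1, max_len + 1):
        # Entry (i, j) of adj**length counts walks of this length from i to j.
        power = mat_mul(power, adj)
        weight = beta ** length
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
    return scores


# Hypothetical toy network: nodes 0-2 are genes, node 3 is a trait.
# Edges: gene0-gene1, gene1-gene2, gene2-trait3.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# beta and max_len are assumed values chosen only for this example.
katz = truncated_katz(adjacency, beta=0.1, max_len=3)

# Rank genes by their Katz score with respect to the trait node.
gene_scores = {gene: katz[gene][3] for gene in range(3)}
ranking = sorted(gene_scores, key=gene_scores.get, reverse=True)
```

In this toy graph the gene directly linked to the trait ranks first, and genes further away receive geometrically smaller scores. A real heterogeneous gene-trait network would combine several edge types and would normally use sparse matrices rather than nested lists.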
Pages: 17
Related papers
50 in total
  • [1] Selection bias in meta-analyses of gene-disease associations
    Tang, JL
    PLOS MEDICINE, 2005, 2 (12): : 1226 - 1227
  • [2] Predicting gene-disease associations from the heterogeneous network using graph embedding
    Wang, Xiaochan
    Gong, Yuchong
    Yi, Jing
    Zhang, Wen
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 504 - 511
  • [3] Identifying gene-disease associations using centrality on a literature mined gene-interaction network
    Oezguer, Arzucan
    Vu, Thuy
    Erkan, Guenes
    Radev, Dragomir R.
    BIOINFORMATICS, 2008, 24 (13) : I277 - I285
  • [4] An empirical comparison of meta-analyses of published gene-disease associations versus consortium analyses
    Janssens, A. Cecile J. W.
    Ladd, Angela M. Gonzalez-Zuloeta
    Lopez-Leon, Sandra
    Ioannidis, John P. A.
    Oostra, Ben A.
    Khoury, Muin J.
    van Duijn, Cornelia M.
    GENETICS IN MEDICINE, 2009, 11 (03) : 153 - 162
  • [5] EFFECTIVE TESTING OF GENE-DISEASE ASSOCIATIONS
    SWIFT, M
    KUPPER, LL
    CHASE, CL
    AMERICAN JOURNAL OF HUMAN GENETICS, 1990, 47 (02) : 266 - 274
  • [6] Selection of SNPs for evaluating gene-disease associations using haplotypes
    Li, N
    Li, M
    GENETIC EPIDEMIOLOGY, 2005, 29 (03) : 263 - 263
  • [7] Finding directionality and gene-disease predictions in disease associations
    Garcia-Albornoz, Manuel
    Nielsen, Jens
    BMC SYSTEMS BIOLOGY, 2015, 9
  • [8] The Implicitome: A Resource for Rationalizing Gene-Disease Associations
    Hettne, Kristina M.
    Thompson, Mark
    van Haagen, Herman H. H. B. M.
    van der Horst, Eelke
    Kaliyaperumal, Rajaram
    Mina, Eleni
    Tatum, Zuotian
    Laros, Jeroen F. J.
    van Mulligen, Erik M.
    Schuemie, Martijn
    Aten, Emmelien
    Li, Tong Shu
    Bruskiewich, Richard
    Good, Benjamin M.
    Su, Andrew I.
    Kors, Jan A.
    den Dunnen, Johan
    van Ommen, Gert-Jan B.
    Roos, Marco
    't Hoen, Peter A. C.
    Mons, Barend
    Schultes, Erik A.
    PLOS ONE, 2016, 11 (02):
  • [9] Predicting Gene-Disease Associations with Manifold Learning
    Luo, Ping
    Tian, Li-Ping
    Chen, Bolin
    Xiao, Qianghua
    Wu, Fang-Xiang
    BIOINFORMATICS RESEARCH AND APPLICATIONS, ISBRA 2018, 2018, 10847 : 265 - 271
  • [10] Prediction of MicroRNA-Disease Associations Based on Social Network Analysis Methods
    Zou, Quan
    Li, Jinjin
    Hong, Qingqi
    Lin, Ziyu
    Wu, Yun
    Shi, Hua
    Ju, Ying
    BIOMED RESEARCH INTERNATIONAL, 2015, 2015