Identifying gene-disease associations using centrality on a literature mined gene-interaction network

Cited by: 233
Authors
Oezguer, Arzucan [1]
Vu, Thuy [1]
Erkan, Guenes [1]
Radev, Dragomir R. [1, 2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Sch Informat, Ann Arbor, MI 48109 USA
DOI
10.1093/bioinformatics/btn182
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network.

Results: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study.
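To make the ranking step in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of scoring genes in an undirected interaction network by degree, closeness and eigenvector centrality. The gene names (`G1` to `G5`), the edge list and all function names are hypothetical placeholders. Betweenness centrality is omitted for brevity, and the eigenvector scores are approximated by a simple shifted power iteration chosen here for illustration.

```python
from collections import deque

# Toy undirected interaction network. Gene names and edges are hypothetical
# placeholders, not interactions reported in the paper.
EDGES = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]


def build_adjacency(edges):
    """Build a symmetric adjacency map {gene: set of neighbouring genes}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Number of neighbours, normalised by the maximum possible (n - 1)."""
    n = len(adj)
    return {g: len(nbrs) / (n - 1) for g, nbrs in adj.items()}


def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[src] = (len(dist) - 1) / total if total else 0.0
    return scores


def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I); the shift leaves the eigenvectors of A
    unchanged but avoids oscillation on bipartite graphs."""
    scores = {g: 1.0 for g in adj}
    for _ in range(iterations):
        new = {g: scores[g] + sum(scores[v] for v in adj[g]) for g in adj}
        norm = sum(x * x for x in new.values()) ** 0.5
        scores = {g: x / norm for g, x in new.items()}
    return scores


def rank(scores):
    """Genes sorted by descending score; ties broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))


adjacency = build_adjacency(EDGES)
degree = degree_centrality(adjacency)
closeness = closeness_centrality(adjacency)
eigenvector = eigenvector_centrality(adjacency)

for name, scores in [("degree", degree), ("closeness", closeness),
                     ("eigenvector", eigenvector)]:
    print(name, rank(scores))
```

In this toy network the hub `G1` ranks first under all three metrics. On a real literature-mined network the metrics can disagree, which is consistent with the abstract's observation that betweenness and closeness surface different candidates than degree and eigenvector centrality.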
Pages: I277-I285
Page count: 9