Identifying gene-disease associations using centrality on a literature mined gene-interaction network

Cited by: 233
Authors
Oezguer, Arzucan [1]
Vu, Thuy [1]
Erkan, Guenes [1]
Radev, Dragomir R. [1, 2]
Affiliations
[1] Univ Michigan, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Sch Informat, Ann Arbor, MI 48109 USA
DOI
10.1093/bioinformatics/btn182
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Understanding the role of genetics in diseases is one of the most important aims of the biological sciences. The completion of the Human Genome Project has led to a rapid increase in the number of publications in this area. However, the coverage of curated databases that provide information manually extracted from the literature is limited. Another challenge is that determining disease-related genes requires laborious experiments. Therefore, predicting good candidate genes before experimental analysis will save time and effort. We introduce an automatic approach based on text mining and network analysis to predict gene-disease associations. We collected an initial set of known disease-related genes and built an interaction network by automatic literature mining based on dependency parsing and support vector machines. Our hypothesis is that the central genes in this disease-specific network are likely to be related to the disease. We used the degree, eigenvector, betweenness and closeness centrality metrics to rank the genes in the network.
Results: The proposed approach can be used to extract known and to infer unknown gene-disease associations. We evaluated the approach for prostate cancer. Eigenvector and degree centrality achieved high accuracy. A total of 95% of the top 20 genes ranked by these methods are confirmed to be related to prostate cancer. On the other hand, betweenness and closeness centrality predicted more genes whose relation to the disease is currently unknown and are candidates for experimental study.
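The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the gene names and edges are invented toy data, the function names are hypothetical, and only degree, closeness and eigenvector centrality are shown (betweenness is omitted for brevity). The eigenvector scores are approximated by power iteration on the adjacency matrix plus the identity, an assumption made here to keep the iteration stable.

```python
from collections import deque

# Toy undirected interaction network. Gene names and edges are invented
# for illustration only; they are not taken from the paper.
EDGES = [
    ("geneA", "geneB"),
    ("geneA", "geneC"),
    ("geneA", "geneD"),
    ("geneB", "geneC"),
    ("geneD", "geneE"),
]

def build_adjacency(edges):
    """Map each gene to the set of genes it interacts with."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Fraction of the other genes that each gene is directly linked to."""
    n = len(adj)
    return {g: len(nbrs) / (n - 1) for g, nbrs in adj.items()}

def closeness_centrality(adj):
    """Reachable genes divided by the sum of shortest-path distances (BFS)."""
    scores = {}
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores[src] = (len(dist) - 1) / total if total else 0.0
    return scores

def eigenvector_centrality(adj, iterations=100):
    """Power iteration on (A + I); the shift avoids oscillation on
    bipartite graphs and leaves the leading eigenvector unchanged."""
    x = {g: 1.0 for g in adj}
    for _ in range(iterations):
        nxt = {g: x[g] + sum(x[v] for v in adj[g]) for g in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {g: val / norm for g, val in nxt.items()}
    return x

def rank(scores):
    """Genes sorted by descending score, ties broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))

adjacency = build_adjacency(EDGES)
degree_rank = rank(degree_centrality(adjacency))
closeness_rank = rank(closeness_centrality(adjacency))
eigen_rank = rank(eigenvector_centrality(adjacency))
```

In this toy graph the hub gene ranks first under all three metrics, while the lower ranks differ between metrics; this is the kind of disagreement the abstract reports between degree/eigenvector and betweenness/closeness rankings.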
Pages: I277 - I285
Page count: 9
Related papers
50 in total
  • [21] Gene-Disease Interaction Retrieval from Multiple Sources: A Network Based Method
    Huang, Lan
    Wang, Ye
    Wang, Yan
    Bai, Tian
    BIOMED RESEARCH INTERNATIONAL, 2016, 2016
  • [22] A knowledge-based approach for predicting gene-disease associations
    Zhou, Hongyi
    Skolnick, Jeffrey
    BIOINFORMATICS, 2016, 32 (18) : 2831 - 2838
  • [23] Graph embedding and ensemble learning for predicting gene-disease associations
    Wang, Haorui
    Wang, Xiaochan
    Yu, Zhouxin
    Zhang, Wen
    INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2020, 23 (04) : 360 - 379
  • [25] Multilocus Bayesian Meta-analysis of Gene-disease Associations
    Newcombe, P. J.
    Verzilli, C.
    Pablo-Casas, J.
    Hingorani, Aroon
    Smeeth, L.
    Whittaker, J.
    GENETIC EPIDEMIOLOGY, 2009, 33 (08) : 828 - 828
  • [26] Investigations of Gene-Disease Associations: Costs and Benefits of Environmental Data
    Luo, Hao
    Burstyn, Igor
    Gustafson, Paul
    EPIDEMIOLOGY, 2013, 24 (04) : 562 - 568
  • [27] Identifying potential association on gene-disease network via dual hypergraph regularized least squares
    Yang, Hongpeng
    Ding, Yijie
    Tang, Jijun
    Guo, Fei
    BMC GENOMICS, 2021, 22 (01)
  • [28] Selection bias in meta-analyses of gene-disease associations
    Tang, JL
    PLOS MEDICINE, 2005, 2 (12) : 1226 - 1227
  • [29] Multilocus Bayesian Meta-Analysis of Gene-Disease Associations
    Newcombe, Paul J.
    Verzilli, Claudio
    Casas, Juan P.
    Hingorani, Aroon D.
    Smeeth, Liam
    Whittaker, John C.
    AMERICAN JOURNAL OF HUMAN GENETICS, 2009, 84 (05) : 567 - 580
  • [30] Reporting, appraising, and integrating data on genotype prevalence and gene-disease associations
    Little, J
    Bradley, L
    Bray, MS
    Clyne, M
    Dorman, J
    Ellsworth, DL
    Hanson, J
    Khoury, M
    Lau, J
    O'Brien, TR
    Rothman, N
    Stroup, D
    Taioli, E
    Thomas, D
    Vainio, H
    Wacholder, S
    Weinberg, C
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2002, 156 (04) : 300 - 310