Beegle: from literature mining to disease-gene discovery

Cited by: 19
Authors
ElShal, Sarah [1, 2]
Tranchevent, Leon-Charles [1, 2, 3, 4, 5]
Sifrim, Alejandro [1, 2, 6]
Ardeshirdavani, Amin [1, 2]
Davis, Jesse [7]
Moreau, Yves [1, 2]
Affiliations
[1] Katholieke Univ Leuven, Dept Elect Engn ESAT STADIUS Ctr Dynam Syst, Signal Proc & Data Analyt Dept, B-3001 Leuven, Belgium
[2] Katholieke Univ Leuven, iMinds Future Hlth Dept, B-3001 Leuven, Belgium
[3] Canc Res Ctr Lyon, INSERM, UMR S1052, CNRS,UMR5286, Lyon, France
[4] Univ Lyon 1, F-69622 Villeurbanne, France
[5] Ctr Leon Berard, F-69373 Lyon, France
[6] Wellcome Trust Sanger Inst, Wellcome Trust Genome Campus, Cambridge CB10 1SA, England
[7] Katholieke Univ Leuven, Dept Comp Sci DTAI, B-3001 Leuven, Belgium
Keywords
CANDIDATE GENES; PRIORITIZATION; ASSOCIATION; IDENTIFICATION
DOI
10.1093/nar/gkv905
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Disease-gene identification is a challenging process with multiple applications in functional genomics and personalized medicine. Typically, it involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires considerable time and money. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query. It then integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes, generating novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% of prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
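The recall figure quoted in the abstract can be read as recall at k: the fraction of genes known to be associated with a query that appear among the top k returned genes. Below is a minimal Python sketch of that metric, assuming a ranked gene list and a gold-standard set. The function name `recall_at_k` and the placeholder gene labels are illustrative assumptions, not taken from Beegle or its evaluation data.

```python
def recall_at_k(ranked_genes, known_genes, k):
    """Fraction of known genes that appear among the top-k ranked genes.

    ranked_genes: list of gene identifiers, best-ranked first (hypothetical input).
    known_genes: iterable of gold-standard genes for the query (hypothetical input).
    """
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    top_k = set(ranked_genes[:k])
    return len(top_k & known) / len(known)


# Toy data with placeholder labels; these are not real gene-disease associations.
ranked = ["G1", "G2", "G3", "G4", "G5"]
known = {"G1", "G3", "G9"}

print(recall_at_k(ranked, known, k=3))
```

Averaging this value over many queries would give an average recall at k comparable in form to the figure reported for the top 100 returned genes.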
Pages: 8
Related papers
50 in total
  • [41] From Gene discovery to Disease Mechanisms
    Alarcon-Riquelme, M.
    CLINICAL AND EXPERIMENTAL RHEUMATOLOGY, 2015, 33 (03) : S3 - S3
  • [42] Literature mining in support of drug discovery
    Agarwal, Pankaj
    Searls, David B.
    BRIEFINGS IN BIOINFORMATICS, 2008, 9 (06) : 479 - 492
  • [43] NTreceptorDB: a Database of Polymorphisms and Disease-Gene Associations in Behavioral Disorders
    Musa, Aliyu Kabir
    Varoglu, Ekrem
    Taneri, Bahar
    PROCEEDINGS IWBBIO 2013: INTERNATIONAL WORK-CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, 2013, : 69 - +
  • [44] INTEGRO: an algorithm for data-integration and disease-gene association
    Cinaglia, Pietro
    Guzzi, Pietro H.
    Veltri, Pierangelo
    PROCEEDINGS 2018 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2018, : 2076 - 2081
  • [45] Dupilumab induced ocular surface diseases: an analysis of FAERS database, literature review and disease-gene interaction networks
    Chen, Jiaojiao
    Li, Huixiang
    Zhang, Huiyuan
    Shentu, Qiaoqiao
    Wang, Shaoxia
    Zhao, Quan
    Wang, Yinglin
    Wang, Fei
    EXPERT OPINION ON DRUG SAFETY, 2025,
  • [46] PCAN: phenotype consensus analysis to support disease-gene association
    Godard, Patrice
    Page, Matthew
    BMC BIOINFORMATICS, 2016, 17
  • [47] Disease-gene prediction based on preserving structure network embedding
    Ma, Jinlong
    Qin, Tian
    Xiang, Ju
    FRONTIERS IN AGING NEUROSCIENCE, 2023, 15
  • [48] A comprehensive database of exosome molecular biomarkers and disease-gene associations
    Qi, Yue
    Xu, Rongji
    Song, Chengxin
    Hao, Ming
    Gao, Yue
    Xin, Mengyu
    Liu, Qian
    Chen, Hongyan
    Wu, Xiaoting
    Sun, Rui
    Zhang, Yuanfu
    He, Danni
    Dai, Yifan
    Kong, Congcong
    Ning, Shangwei
    Guo, Qiuyan
    Zhang, Guangmei
    Wang, Peng
    SCIENTIFIC DATA, 2024, 11 (01)