Mining impactful discoveries from the biomedical literature

Authors
Moreau, Erwan [1, 2]
Hardiman, Orla [3]
Heverin, Mark [3]
O'Sullivan, Declan [1, 2]
Affiliations
[1] Trinity Coll Dublin, Adapt Ctr, Dublin, Ireland
[2] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin, Ireland
[3] Trinity Coll Dublin, Sch Med, Dublin, Ireland
Source
BMC BIOINFORMATICS | 2024 / Vol. 25 / Issue 01
Keywords
Literature-based discovery; Evaluation; Benchmark dataset; Time-sliced method; KNOWLEDGE; MEDLINE; MODELS;
DOI
10.1186/s12859-024-05881-9
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Literature-based discovery (LBD) aims to help researchers identify relations between concepts that are worthy of further investigation by text-mining the biomedical literature. While the LBD literature is rich and the field is considered mature, standard practice in the evaluation of LBD methods is methodologically poor and has not progressed on par with the domain. The lack of a properly designed, decent-sized benchmark dataset hinders the progress of the field and its development into applications usable by biomedical experts.

Results: This work presents a method for mining past discoveries from the biomedical literature. It leverages the impact made by a discovery, using descriptive statistics to detect surges in the prevalence of a relation across time. The validity of the method is tested against a baseline representing the state-of-the-art "time-sliced" method.

Conclusions: This method allows the collection of a large number of time-stamped discoveries. These can be used for LBD evaluation, alleviating the long-standing issue of inadequate evaluation. It might also pave the way for more fine-grained LBD methods, which could exploit the diversity of these past discoveries to train supervised models. Finally, the dataset (or some future version of it inspired by our method) could be used as a methodological tool for systematic reviews. To this end, we provide an online exploration tool, available at https://brainmend.adaptcentre.ie/.
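
The abstract does not specify which descriptive statistics are used, so the following is only a minimal sketch of the general idea of detecting a surge in the prevalence of a relation over time. Everything in it is an assumption made for illustration: the function name `find_surge_year`, the definition of prevalence as yearly pair count divided by yearly document count, the trailing-window baseline, and the `window`, `ratio`, and `min_count` parameters. It should not be read as the authors' actual method.

```python
from statistics import mean


def find_surge_year(years, pair_counts, total_counts,
                    window=3, ratio=2.0, min_count=5):
    """Return the first year in which a relation's prevalence surges, or None.

    Hypothetical criterion (not taken from the paper): a year is a surge if
    the relation is mentioned at least `min_count` times and its prevalence
    is at least `ratio` times the mean prevalence of the previous `window`
    years.
    """
    # Prevalence: share of that year's documents mentioning the relation.
    prevalence = [c / t if t else 0.0
                  for c, t in zip(pair_counts, total_counts)]
    for i in range(window, len(years)):
        if pair_counts[i] < min_count:
            continue  # too few mentions to count as a meaningful surge
        baseline = mean(prevalence[i - window:i])
        if prevalence[i] >= ratio * baseline:
            return years[i]
    return None


# Made-up yearly counts for one concept pair.
years = list(range(2000, 2008))
pair_counts = [0, 1, 1, 1, 2, 9, 14, 20]
total_counts = [1000] * len(years)
surge = find_surge_year(years, pair_counts, total_counts)
```

A relation mentioned at a constant rate never satisfies the ratio test and yields `None`. The returned year is the kind of time stamp the abstract refers to when it speaks of time-stamped discoveries.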
Pages: 20