Biomedical text mining for research rigor and integrity: tasks, challenges, directions

Cited by: 34
Authors
Kilicoglu, Halil [1]
Affiliations
[1] US Natl Lib Med, Lister Hill Natl Ctr Biomed Commun, Bethesda, MD 20894 USA
Funding
US National Institutes of Health;
Keywords
biomedical research waste; biomedical text mining; natural language processing; research rigor; research integrity; reproducibility; AUTOMATIC RECOGNITION; PLAGIARISM; ARTICLES; CITATION; KNOWLEDGE; REPRODUCIBILITY; CLASSIFICATION; EXTRACTION; SENTENCES; MEDICINE;
DOI
10.1093/bib/bbx057
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
An estimated quarter of a trillion US dollars is invested in the biomedical research enterprise annually. There is growing alarm that a significant portion of this investment is wasted because of problems in reproducibility of research findings and in the rigor and integrity of research conduct and reporting. Recent years have seen a flurry of activities focusing on standardization and guideline development to enhance the reproducibility and rigor of biomedical research. Research activity is primarily communicated via textual artifacts, ranging from grant applications to journal publications. These artifacts can be both the source and the manifestation of practices leading to research waste. For example, an article may describe a poorly designed experiment, or the authors may reach conclusions not supported by the evidence presented. In this article, we pose the question of whether biomedical text mining techniques can assist the stakeholders in the biomedical research enterprise in doing their part toward enhancing research integrity and rigor. In particular, we identify four key areas in which text mining techniques can make a significant contribution: plagiarism/fraud detection, ensuring adherence to reporting guidelines, managing information overload and accurate citation/enhanced bibliometrics. We review the existing methods and tools for specific tasks, if they exist, or discuss relevant research that can provide guidance for future work. With the exponential increase in biomedical research output and the ability of text mining approaches to perform automatic tasks at large scale, we propose that such approaches can support tools that promote responsible research practices, providing significant benefits for the biomedical research enterprise.
Pages: 1400-1414
Number of pages: 15