Use of figures in literature mining for biomedical digital libraries

Cited by: 0
Authors
Chen, Nawei [1]
Shatkay, Hagit [1]
Blostein, Dorothea [1]
Affiliation
[1] Queen's University, School of Computing, Kingston, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The maintenance of biomedical digital libraries (including organism databases and protein databases) involves the analysis of a large number of documents. Much of this work is done manually: curators study large numbers of biomedical documents while updating and annotating organism databases such as MGI (Mouse Genome Informatics) and FlyBase (a database of the fruit-fly genome). We summarize the annotation process in organism databases, and describe some of the roles played by the Gene Ontology and by document databases such as PubMed. Efforts are ongoing to automate parts of the annotation process. Biomedical text mining contests, such as the TREC Genomics Track [6, 7], define annotation subtasks and provide training and test data. So far, these efforts have focused on analyzing the text content of documents. We are investigating the analysis of figures in biomedical documents; the information derived from figure analysis may later be combined with the information derived from text analysis. We present an algorithm for using figures in document triage, that is, in determining which documents are relevant to a given annotation task. In our triage algorithm, we segment figures into subfigures and classify each subfigure as Graphical, Gel, Fluorescence Microscopy, or Other Microscopy. A secondary classification into subcategories is performed by clustering, using clusters created from the subfigures in the labeled training data. The classifications of all subfigures in a document are combined to form a document descriptor. The document descriptor is then classified by a Naive Bayes classifier as either relevant or irrelevant to the given annotation task.
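The last two steps of the triage algorithm (building a document descriptor from subfigure classes, then classifying it with Naive Bayes) can be illustrated with a minimal sketch. Several things here are assumptions, not details taken from the abstract: the descriptor is modeled as a simple count of subfigures per coarse class, the classifier is a multinomial Naive Bayes with Laplace smoothing, and the function names and toy training documents are hypothetical. The clustering into subcategories is omitted, and subfigure labels are taken as given rather than produced by image analysis.

```python
import math
from collections import Counter

# The four coarse subfigure classes named in the abstract.
CLASSES = ["Graphical", "Gel", "Fluorescence Microscopy", "Other Microscopy"]


def document_descriptor(subfigure_labels):
    """Assumed descriptor form: number of subfigures of each coarse class."""
    counts = Counter(subfigure_labels)
    return [counts[c] for c in CLASSES]


def train_naive_bayes(descriptors, targets, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (alpha)."""
    model = {}
    for target in set(targets):
        rows = [d for d, t in zip(descriptors, targets) if t == target]
        totals = [sum(column) for column in zip(*rows)]
        denom = sum(totals) + alpha * len(CLASSES)
        model[target] = {
            "log_prior": math.log(len(rows) / len(targets)),
            "log_likelihood": [math.log((n + alpha) / denom) for n in totals],
        }
    return model


def classify(model, descriptor):
    """Return the label with the highest posterior log-score."""
    def score(target):
        params = model[target]
        return params["log_prior"] + sum(
            n * ll for n, ll in zip(descriptor, params["log_likelihood"])
        )
    return max(model, key=score)


# Made-up training documents for illustration only (not data from the paper).
train_docs = [
    (["Gel", "Gel", "Graphical"], "relevant"),
    (["Gel", "Fluorescence Microscopy"], "relevant"),
    (["Graphical", "Graphical"], "irrelevant"),
    (["Graphical", "Other Microscopy", "Graphical"], "irrelevant"),
]
model = train_naive_bayes(
    [document_descriptor(labels) for labels, _ in train_docs],
    [target for _, target in train_docs],
)

print(classify(model, document_descriptor(["Gel", "Gel"])))
```

With this toy data, a document whose subfigures are mostly gels scores as relevant and one dominated by graphs scores as irrelevant; a real system would learn from curated training documents for the specific annotation task.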
Pages: 180 / +
Page count: 4