Text mining neuroscience journal articles to populate neuroscience databases

Cited by: 0
Authors
Chiquito J. Crasto
Luis N. Marenco
Michele Migliore
Buqing Mao
Prakash M. Nadkarni
Perry Miller
Gordon M. Shepherd
Affiliations
[1] Yale University, Center for Medical Informatics
[2] Yale University, Department of Neurobiology
[3] Yale University, Department of Anesthesiology
[4] Yale University, Department of Molecular, Cellular, and Developmental Biology
[5] Institute of Biophysics
[6] National Research Council
Source
Neuroinformatics | 2003 / Vol. 1
Keywords
Text mining; natural language processing; neuroscience; databases; supervised and unsupervised learning
DOI
Not available
Abstract
We have developed a program, NeuroText, to populate the neuroscience databases in SenseLab (http://senselab.med.yale.edu/senselab) by mining the natural language text of neuroscience articles. NeuroText uses a two-step approach to identify relevant articles. The first step (pre-processing), aimed at 100% sensitivity, identifies abstracts containing database keywords. In the second step, potentially relevant abstracts identified in the first step are processed for specificity, as dictated by the database architecture and by neuroscience, lexical, and semantic contexts. NeuroText results were presented to the experts for validation using a dynamically generated interface that also allows expert-validated articles to be automatically deposited into the databases. Of the test set of 912 articles, 735 were rejected at the pre-processing step. For the remaining articles, the accuracy of predicting database-relevant articles was 85%. Twenty-two articles were erroneously identified. NeuroText deferred decisions on 29 articles to the expert. A comparison of NeuroText results with the experts’ analyses revealed that the program failed to correctly identify articles’ relevance because of concepts that did not yet exist in the knowledgebase or because of vaguely presented information in the abstracts. NeuroText uses two “evolution” techniques (supervised and unsupervised) that play an important role in the continual improvement of the retrieval results. Software that uses the NeuroText approach can facilitate the creation of curated, special-interest bibliography databases.
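The two-step approach described in the abstract can be illustrated with a minimal Python sketch. This is not NeuroText's actual implementation: the keyword set, the context terms, the word-overlap score, and the thresholds are all hypothetical stand-ins chosen for illustration, since the abstract does not specify them. The sketch only shows the general shape of a permissive keyword pre-filter followed by a stricter context check that can accept, reject, or defer to an expert.

```python
import re

# Hypothetical keyword list; the real SenseLab vocabulary is not given in the abstract.
DATABASE_KEYWORDS = {"dendrite", "olfactory receptor", "sodium channel"}

# Hypothetical context terms standing in for the "neuroscience, lexical and
# semantic contexts" mentioned in the abstract.
CONTEXT_TERMS = {"neuron", "current", "expressed", "membrane"}


def preprocess(abstract, keywords=DATABASE_KEYWORDS):
    """Step 1: keep any abstract that mentions at least one database keyword.

    Substring matching is deliberately permissive (e.g. "dendrites" matches
    "dendrite") because this step favors sensitivity over specificity.
    """
    text = abstract.lower()
    return any(keyword in text for keyword in keywords)


def context_score(abstract, context_terms=CONTEXT_TERMS):
    """Step 2: count the distinct context terms that occur as whole words."""
    words = set(re.findall(r"[a-z]+", abstract.lower()))
    return len(words & context_terms)


def classify(abstract, accept_at=2, reject_below=1):
    """Return 'rejected', 'relevant', 'not relevant', or 'deferred'.

    The thresholds are assumptions; scores between them are deferred,
    mirroring how undecided articles are passed to an expert.
    """
    if not preprocess(abstract):
        return "rejected"  # filtered out at the pre-processing step
    score = context_score(abstract)
    if score >= accept_at:
        return "relevant"
    if score < reject_below:
        return "not relevant"
    return "deferred"
```

For example, `classify("A sodium channel was expressed.")` passes the keyword filter but matches only one context term, so under these assumed thresholds the decision is deferred rather than forced.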
Pages: 215-237
Page count: 22