New challenges for text mining: mapping between text and manually curated pathways

Cited by: 31
Authors
Oda, Kanae [1]
Kim, Jin-Dong [1]
Ohta, Tomoko [1]
Okanohara, Daisuke [1]
Matsuzaki, Takuya [1]
Tateisi, Yuka [2]
Tsujii, Jun'ichi [1, 3, 4]
Affiliations
[1] Univ Tokyo, Dept Comp Sci, Grad Sch Informat Sci & Technol, Bunkyo Ku, Tokyo, Japan
[2] Kogakuin Univ, Fac Informat, Shinjuku Ku, Tokyo, Japan
[3] Univ Manchester, Sch Comp Sci, Manchester M13 9PL, Lancs, England
[4] Natl Ctr Text Min, Manchester M1 7DN, Lancs, England
Keywords
Text Mining; Named Entity Recognition; GENIA Corpus; Text Mining Tool; Textual Context
DOI
10.1186/1471-2105-9-S3-S5
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Associating literature with pathways poses new challenges to the text mining (TM) community. The task involves three main challenges: (1) identifying the mapping position of a specific entity or reaction in a given pathway, (2) recognizing the causal relationships among multiple reactions, and (3) formulating and implementing the required inferences based on biological domain knowledge.
Results: To address these challenges, we constructed new resources that link text with a model pathway: the GENIA pathway corpus with event annotation, and the NF-κB pathway. Through detailed analysis of these resources, we address an untapped resource, 'bio-inference', as well as the differences between text and pathway representations. We present precise comparisons of the two representations and the nine classes of 'bio-inference' schemes observed in the pathway corpus.
Conclusions: We believe that the creation of such rich resources and their detailed analysis is a significant first step toward accelerating research on the automatic construction of pathways from text.
Pages: 14
Related papers
50 in total
  • [1] New challenges for text mining: mapping between text and manually curated pathways
    Kanae Oda
    Jin-Dong Kim
    Tomoko Ohta
    Daisuke Okanohara
    Takuya Matsuzaki
    Yuka Tateisi
    Jun'ichi Tsujii
    [J]. BMC Bioinformatics, 9
  • [2] IBDDB: a manually curated and text-mining-enhanced database of genes involved in inflammatory bowel disease
    Khan, Farhat
    Radovanovic, Aleksandar
    Gojobori, Takashi
    Kaur, Mandeep
    [J]. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2021
  • [3] Manually structured digital abstracts: A scaffold for automatic text mining
    Seringhaus, Michael
    Gerstein, Mark
    [J]. FEBS LETTERS, 2008, 582 (08) : 1170 - 1170
  • [4] New Challenges for Biological Text-Mining in the Next Decade
    Yen-Ching Chang
    Richard Tzong-Han Tsai
    Wen-Lian Hsu
    [J]. Journal of Computer Science & Technology, 2010, 25 (01) : 169 - 179
  • [5] New Challenges for Biological Text-Mining in the Next Decade
    Dai, Hong-Jie
    Chang, Yen-Ching
    Tsai, Richard Tzong-Han
    Hsu, Wen-Lian
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (01) : 169 - +
  • [6] New challenges and roles of metadata in text/data mining in statistics
    Soltés, D
    [J]. Knowledge Mining, 2005, 185 : 191 - 199
  • [7] Granular Computing for Text Mining: New Research Challenges and Opportunities
    Jing, Liping
    Lau, Raymond Y. K.
    [J]. ROUGH SETS, FUZZY SETS, DATA MINING AND GRANULAR COMPUTING, PROCEEDINGS, 2009, 5908 : 478 - +
  • [8] New challenges for biological text-mining in the next decade
    Dai H.-J.
    Chang Y.-C.
    Tzong-Han Tsai R.
    Hsu W.-L.
    [J]. Journal of Computer Science and Technology, 2010, 25 (1) : 169 - 179
  • [9] Text Mining: Techniques, Applications, and Challenges
    Justicia de la Torre, C.
    Sanchez, D.
    Blanco, I
    Martin-Bautista, M. J.
    [J]. INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2018, 26 (04) : 553 - 582
  • [10] Text Mining: Challenges and Future Directions
    Akilan, A.
    [J]. 2015 2ND INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION SYSTEMS (ICECS), 2015 : 1679 - 1683