Text-mining: Application development challenges

Authors
Varadarajan, S [1]
Kasravi, K [1]
Feldman, R [1]
Affiliation
[1] Elect Data Syst Corp, Troy, MI 48098 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper reviews best practices and challenges for project managers and developers who implement text-mining applications. With a focus on rule-based information extraction, and with reference to actual cases, the authors share their experience of developing several text-mining applications in diverse industries. First, project management issues are discussed, including a process for capturing business requirements and mapping them into features and linguistic patterns, development of linguistic rules, rule development standards, performance metrics, and an evaluation methodology. Linguistic representations such as sub-syntactic, syntactic, semantic, and application-specific rules are identified. Special emphasis is placed on processing that follows information extraction, such as improving the relevance of the extracted information, summarization models, techniques for handling typographical errors, resolution of temporal information, and anaphora resolution, together with a discussion of shallow versus full parsing. Lastly, the paper discusses various utilities that help with the development of a text-mining application, such as feature analysis, visualization, source document pre-processing, and rule authoring tools.
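As an illustration of two ideas named in the abstract, rule-based information extraction and performance metrics, the following is a minimal sketch only. It is not the authors' method: the rule set `RULES`, the feature names `company` and `date`, the regular expressions, and the helper functions `extract` and `precision_recall` are all hypothetical and chosen for illustration. The sketch assumes that each rule maps a feature to a pattern and that evaluation compares extracted (feature, value) pairs against a hand-labelled gold set.

```python
import re

# Hypothetical rule set: each feature name maps to one regular expression.
# Real linguistic rules would be far richer; these are illustrative only.
RULES = {
    "company": re.compile(r"\b([A-Z][a-z]+ (?:Corp|Inc|Ltd))\b"),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}


def extract(text):
    """Apply every rule to the text and return a set of (feature, value) pairs."""
    found = set()
    for feature, pattern in RULES.items():
        for match in pattern.finditer(text):
            found.add((feature, match.group(1)))
    return found


def precision_recall(predicted, gold):
    """Compute precision and recall of predicted pairs against a gold set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = "Acme Corp signed with Globex Inc on 2001-03-15."
    # Hypothetical gold annotations for the sample sentence.
    gold = {
        ("company", "Acme Corp"),
        ("company", "Initech Ltd"),
        ("date", "2001-03-15"),
    }
    predicted = extract(sample)
    print(sorted(predicted))
    print(precision_recall(predicted, gold))
```

Keeping the rules in a plain dictionary separates rule authoring from the extraction loop, so rules can be added or revised and then re-scored against the same gold set.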
Pages: 247-260
Page count: 14