Text-mining: Application development challenges

Cited by: 0
Authors
Varadarajan, S [1]
Kasravi, K [1]
Feldman, R [1]
Affiliations
[1] Elect Data Syst Corp, Troy, MI 48098 USA
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper reviews best practices and challenges for project managers and developers implementing text-mining applications. With a focus on rule-based information extraction, and with reference to actual cases, the authors share their experience of developing several text-mining applications in diverse industries. First, project management issues are discussed, including a process for capturing business requirements and mapping them into features and linguistic patterns, development of linguistic rules, rule development standards, performance metrics, and an evaluation methodology. Linguistic representations such as sub-syntactic, syntactic, semantic, and application-specific rules are identified. Special emphasis is placed on processing that follows information extraction, such as improving the relevance of the extracted information, summarization models, techniques for handling typographical errors, resolution of temporal information, and anaphora resolution, along with a discussion of shallow vs. full parsing. Lastly, the paper discusses various utilities that help with the development of a text-mining application, such as feature analysis, visualization, source document pre-processing, and rule authoring tools.
Pages: 247-260
Page count: 14