Text-mining: Application development challenges

Cited by: 0

Authors:

Varadarajan, S [1]

Kasravi, K [1]

Feldman, R [1]

Affiliation:

[1] Elect Data Syst Corp, Troy, MI 48098 USA

Source:

APPLICATIONS AND INNOVATIONS IN INTELLIGENT SYSTEMS X | 2003

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper reviews the best practices and challenges for project managers and developers involved in implementing text-mining applications. With a focus on rule-based information extraction, and references to actual cases, the authors share their experiences from having developed several text-mining applications in diverse industries. First, project management issues are discussed, including a process for capturing business requirements and mapping them into features and linguistic patterns, development of linguistic rules, rule development standards, performance metrics, and an evaluation methodology. Linguistic representations such as sub-syntactic, syntactic, semantic, and application-specific rules are identified. Special emphasis is placed on post-extraction processing, such as improving the relevance of the extracted information, summarization models, techniques for handling typographical errors, resolution of temporal information, and anaphora resolution, along with a discussion of shallow vs. full parsing. Lastly, the paper discusses various utilities to help with the development of a text-mining application, such as feature analysis, visualization, source document pre-processing, and rule authoring tools.

Pages: 247-260

Page count: 14
