Text mining in different languages

Authors
Lebart, L [1]
Affiliations
[1] Ecole Natl Super Telecommun, CNRS, F-75013 Paris, France
Source
Keywords
text mining; text categorization; language-independent methods; discriminant analysis
DOI
Not available
Chinese Library Classification (CLC) codes
C93 [Management science]; O22 [Operations research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
The purpose of text mining is to describe and explore textual data, to uncover structural traits, and to make predictions. Its applications include information retrieval, the processing of responses to open-ended questions in sample surveys, and the processing of more general textual corpora. At the intersection of corpus linguistics and exploratory statistical analysis, a series of language-independent tools and methods can perform most of these tasks, including the assessment and validation of the results obtained, whether visualizations or categorizations. Multiple confusion matrices calculated on test samples characterize both the quality of the prediction and the structure of the prediction errors. In the case of multinational surveys and corpora, they make it possible to compare several countries despite the very heterogeneous character of the basic information (texts in different languages). Copyright (C) 1998 John Wiley & Sons, Ltd.
Pages: 323-334
Page count: 12