Text mining in different languages

Author
Lebart, L. [1]
Affiliation
[1] Ecole Natl Super Telecommun, CNRS, F-75013 Paris, France
Source
Keywords
text mining; text categorization; language-independent methods; discriminant analysis
DOI
Not available
Chinese Library Classification
C93 [Management]; O22 [Operations Research]
Discipline classification codes
070105; 12; 1201; 1202; 120202
Abstract
The purpose of Text Mining is to describe and explore textual data, to uncover structural traits, and proceed to predictions. The field of application concerns Information Retrieval, processing responses to open-ended questions in sample surveys as well as processing textual corpora of a more general nature. At the intersection of Corpora Linguistics and Exploratory Statistical Analysis, a series of language independent tools and methods can perform most of the previously mentioned tasks, including the assessment and validation of the obtained results, be it visualization or categorization. Multiple confusion matrices calculated on test-samples characterize the quality of the prediction as well as the structure of errors of prediction. In the case of multinational surveys and corpora, they allow us to proceed to comparisons among several countries, in spite of the very heterogeneous character of the basic information (texts in different languages). Copyright (C) 1998 John Wiley & Sons, Ltd.
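As an illustration only, and not taken from the paper, the confusion matrix mentioned in the abstract can be sketched as follows. The category names, the test-sample labels, and the helper functions `confusion_matrix` and `accuracy` are all hypothetical; the abstract does not say how the author computed or combined the matrices. Because the sketch works on category labels alone, it makes no assumption about the language of the underlying texts.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, categories):
    """Build a matrix whose rows are true categories and columns are predicted ones."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in categories] for t in categories]


def accuracy(matrix):
    """Share of test items on the diagonal, i.e. correctly categorized."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical test sample: only category labels are needed here.
categories = ["A", "B", "C"]
true_labels = ["A", "A", "B", "B", "C", "C"]
predicted_labels = ["A", "B", "B", "B", "C", "A"]

matrix = confusion_matrix(true_labels, predicted_labels, categories)
for category, row in zip(categories, matrix):
    print(category, row)
print("accuracy:", round(accuracy(matrix), 3))
```

The diagonal gives the quality of the prediction, while the off-diagonal cells show which categories are confused with which, the "structure of errors" the abstract refers to. Computing one such matrix per test sample or per country would allow the kind of comparison described above.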
Pages: 323-334
Page count: 12