Creation of Text Document Matrices and Visualization by Self-Organizing Map

Cited by: 8
Authors
Stefanovic, Pavel [1]
Kurasova, Olga [1]
Affiliation
[1] Vilnius Univ, Inst Math & Informat, LT-08663 Vilnius, Lithuania
Source
INFORMATION TECHNOLOGY AND CONTROL | 2014, Vol. 43, No. 1
Keywords
self-organizing map; text mining; text document matrix; document dictionary; quantization error; SOM quality measures; common word list
DOI
10.5755/j01.itc.43.1.4299
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper investigates text mining and visualization with a self-organizing map (SOM). Textual information must first be converted into numerical form, and the results of text mining and visualization depend on this conversion. The paper therefore investigates how certain control factors applied when the document dictionary is created (the common word list and the use of a stemming algorithm) influence text mining results. A self-organizing map is used for text clustering and graphical representation (visualization). A comparative analysis is carried out on a dataset of scientific papers about optimization based on Pareto, simplex, and genetic algorithms. Two new measures are also proposed to estimate SOM quality when classified data are analyzed: the distances between SOM cells corresponding to data items assigned to the same class, and the distance between the centers of SOM cells corresponding to different classes. The quantization error is also measured to estimate SOM quality.
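The conversion step and the quantization error mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `COMMON_WORDS` list, the function names, and the toy documents are assumptions made for illustration, stemming is omitted, and the codebook is supplied by hand rather than learned by training a SOM. The quantization error is computed here in its usual form, as the average Euclidean distance between each data vector and its nearest codebook vector.

```python
import math
import re
from collections import Counter

# Hypothetical common word list; the list used in the paper is not given here.
COMMON_WORDS = {"the", "a", "of", "and", "is", "in", "by", "for", "to"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens not in the common word list."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in COMMON_WORDS]


def build_matrix(documents, min_count=1):
    """Return (dictionary, matrix); matrix[i][j] counts dictionary word j in document i."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    totals = Counter()
    for c in counts:
        totals.update(c)
    # Keep only words that occur at least min_count times over all documents.
    dictionary = sorted(w for w, n in totals.items() if n >= min_count)
    matrix = [[c[w] for w in dictionary] for c in counts]
    return dictionary, matrix


def quantization_error(rows, codebook):
    """Average Euclidean distance from each row to its nearest codebook vector."""
    total = 0.0
    for row in rows:
        total += min(math.dist(row, unit) for unit in codebook)
    return total / len(rows)


# Toy documents (assumed, not taken from the paper's dataset).
docs = [
    "Pareto optimization of the simplex",
    "Genetic algorithms for optimization",
]
dictionary, matrix = build_matrix(docs)
# A hand-picked single-unit codebook stands in for a trained SOM.
error = quantization_error(matrix, [matrix[0]])
```

With these toy inputs the dictionary holds five words and each document becomes a row of word counts. Changing `COMMON_WORDS` or `min_count` changes the dictionary, and therefore the matrix, which is the kind of dependence on the conversion step that the abstract describes.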
Pages: 36-45
Number of pages: 10