A Sequence Based Dynamic SOM Model for Text Clustering

Authors
Gunasinghe, Upuli [1]
Matharage, Sumith [1]
Alahakoon, Damminda [1]
Affiliation
[1] Monash Univ, Fac IT, CCSL, Clayton, Vic 3800, Australia
Keywords
Text clustering; Sequence learning; Growing Self Organizing Map; Text feature selection; Semantics
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text clustering can be considered a four-step process consisting of feature extraction, text representation, document clustering, and cluster interpretation. Most text clustering models treat text as an unordered collection of words. However, the semantics of text would be better captured if word sequences were taken into account. In this paper we propose a sequence-based text clustering model that introduces four novel sequence-based components, one in each of the four steps of the text clustering process. Experiments conducted on the Reuters dataset and Sydney Morning Herald (SMH) news archives demonstrate the advantage of the proposed sequence-based model, in terms of capturing context with semantics, accuracy, and speed, compared with clustering of documents based on single words and with n-gram based models.
Pages: 8
Related papers
  • [1] Text Clustering Using PSO Based Dynamic Adaptive SOM for Detecting Emergent Trends
    Chandrakala, D.
    Sumathi, S.
    Kumar, Saran A.
    Sathish, J.
    [J]. INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2019, 15 (03) : 64 - 78
  • [2] Text clustering method based on genetic algorithm and SOM network
    Department of Information Technology, Guangxi Teachers Education University, Nanning 530001, China
    Unknown
    [J]. J. Comput. Inf. Syst., 2008, 3 (993-1000) : 993 - 1000
  • [3] A clustering algorithm for Chinese text based on SOM neural network and density
    Meng, ZQ
    Zhu, HC
    Zhu, YH
    Zhou, GG
    [J]. ADVANCES IN NEURAL NETWORKS - ISNN 2005, PT 2, PROCEEDINGS, 2005, 3497 : 251 - 256
  • [4] Research on Text Clustering Algorithm Based on K_means and SOM
    Li Xinwu
    [J]. 2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION WORKSHOP: IITA 2008 WORKSHOPS, PROCEEDINGS, 2008, : 341 - 344
  • [5] Research of fast SOM clustering for text information
    Liu, Yuan-chao
    Wu, Chong
    Liu, Ming
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (08) : 9325 - 9333
  • [6] Research on a hybrid patent clustering based on som model
    Gui, Jie
    Zhang, Zhaofeng
    Zhu, Xiaohua
    Li, Peng
    [J]. Gui, J. (guij@istic.ac.cn), 1839, ICIC Express Letters Office, Tokai University, Kumamoto Campus, 9-1-1, Toroku, Kumamoto, 862-8652, Japan (07) : 1839 - 1846
  • [7] A Feature Selection Methods Based on Concept Extraction and SOM Text Clustering Analysis
    Wang, Lin
    Jiang, Minghu
    Liao, Shasha
    Lu, Yinghua
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2006, 6 (1A) : 20 - 28
  • [8] Integrating contextual information to enhance SOM-based text document clustering
    Pullwitt, D
    [J]. NEURAL NETWORKS, 2002, 15 (8-9) : 1099 - 1106
  • [9] Sequence-based SOM: Visualizing transition of dynamic clusters
    Fukui, Ken-ichi
    Saito, Kazumi
    Kimura, Masahiro
    Numao, Masayuki
    [J]. 2008 IEEE 8TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2008, : 47 - +
  • [10] Intrusion Detection Classifier Based on Dynamic SOM and Swarm Intelligence Clustering
    Feng, Yong
    Zhong, Jiang
    Xiong, Zhong-yang
    Ye, Chun-xiao
    Wu, Kai-gui
    [J]. ADVANCES IN COGNITIVE NEURODYNAMICS, PROCEEDINGS, 2008, : 969 - +