A new text clustering method using hidden Markov model

Cited by: 0
Authors
Fu, Yan [1]
Yang, Dongqing [1]
Tang, Shiwei [2]
Wang, Tengjiao [1]
Gao, Aiqiang [1]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China
[2] Peking Univ, Natl Lab Machine Percept, Beijing 100871, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Because text data are high-dimensional and semantically related, text clustering remains an important topic in data mining. However, little work has investigated the attributes of the clustering process itself; previous studies have focused only on the characteristics of the text. Treating clustering as a dynamic, sequential process, we describe text clustering as state transitions of words or documents. Taking the K-means clustering method as an example, we parse the clustering process into several sequences. Building on research in sequential and temporal data clustering, we propose a new text clustering method using a hidden Markov model (HMM). Experiments on Reuters-21578 show that this approach provides an accurate clustering partition and achieves better performance than the K-means algorithm.
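The abstract does not specify how a K-means run is parsed into sequences, so the following is only a minimal sketch of one possible reading, not the authors' procedure. It assumes that each document's sequence is simply its cluster label at every K-means iteration, and that transitions between labels are then counted. The function names `kmeans_assignment_sequences` and `transition_counts`, the toy data, and all parameters are hypothetical and chosen for illustration.

```python
import numpy as np

def kmeans_assignment_sequences(X, k, n_iter=10, seed=0):
    """Run plain K-means and record each point's cluster label per iteration.

    Returns an integer array of shape (n_points, n_iter); row i is the
    label sequence of point i. (Assumed reading of "parsing the
    clustering process into sequences"; not taken from the paper.)
    """
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    history = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        history.append(labels)
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return np.stack(history, axis=1)

def transition_counts(sequences, k):
    """Count label-to-label transitions over all sequences (k x k matrix)."""
    counts = np.zeros((k, k), dtype=int)
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts

# Toy stand-in for document vectors: two well-separated groups of 20 points.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 5)), rng.normal(3, 0.5, (20, 5))])
seqs = kmeans_assignment_sequences(X, k=2, n_iter=5)
counts = transition_counts(seqs, k=2)
```

Sequences of this kind are the sort of input a sequence model such as an HMM could be trained on; that training step is not shown here because the abstract gives no details about it.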
Pages: 73-+
Number of pages: 3