A Text Document Clustering Method Based on Topical Concept

Authors
Ding, Yi [1]
Fu, Xian [1]
Affiliation
[1] Hubei Normal Univ, Coll Comp Sci & Technol, Huangshi, Peoples R China
Keywords
document clustering; clusters indexing; topical concept
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Document clustering is now widely used in text mining, information retrieval systems, and related applications. Conventional document clustering methods rely on the classical vector-space model, with keywords as features. However, these methods ignore the semantic relations among keywords and do not address two problems specific to document clustering: the high dimensionality of the data and the high computational complexity. To solve these problems, this paper proposes a method for Chinese document clustering based on topical concepts. We introduce a novel topical document clustering method called Document Features Indexing Clustering (DFIC), which can identify topics accurately and cluster documents according to these topics. In DFIC, "topic elements" are defined and extracted for indexing base clusters. Additionally, document features are investigated and exploited. Experimental results show that DFIC achieves higher precision (92.76%) than some widely used traditional clustering methods.
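The limitation of the keyword-based vector-space model mentioned in the abstract can be illustrated with a minimal sketch. This is not DFIC, whose details are not given in this record; it is only the conventional baseline, and the three toy documents and the helper names `tf_vector` and `cosine` are assumptions made for illustration.

```python
import math
from collections import Counter

def tf_vector(text):
    """Bag-of-words term-frequency vector: one dimension per distinct keyword."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical toy documents: the first two are about the same topic
# but share no keywords; the first and third share two keywords.
docs = [
    "car engine repair",
    "automobile motor maintenance",
    "car engine maintenance",
]
vectors = [tf_vector(d) for d in docs]

# Same topic, different words: the model sees no relation at all.
sim_synonyms = cosine(vectors[0], vectors[1])
# Shared keywords are the only signal the model can use.
sim_overlap = cosine(vectors[0], vectors[2])
# Every distinct keyword adds a dimension, so the vocabulary grows with the corpus.
vocabulary = set().union(*vectors)

print(sim_synonyms, sim_overlap, len(vocabulary))
```

In this sketch the two documents that describe the same topic with different words receive a similarity of zero, and three short documents already need six dimensions, which is the kind of semantic blindness and dimensionality growth the abstract refers to.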
Pages: 547-552
Number of pages: 6