Unsupervised document clustering based on keyword clusters

Authors
Chang, HC [1]
Hsu, CC [1]
Deng, YW [1]
Affiliations
[1] Hwa Hsia Coll Technol & Commerce, Dept Elect Engn, Taipei 235, Taiwan
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Due to the explosive growth of digital information, automatic document clustering or categorization has become an important research topic. Because documents are represented in a high-dimensional feature space, the number of representation features influences the efficiency and effectiveness of the clustering process and the precision of the clustering results. This paper presents an unsupervised document clustering method based on partitioning a weighted undirected graph. It first discovers a set of tightly relevant keyword clusters distributed throughout the feature space of the document collection, and then clusters the documents into document clusters using these keyword clusters. The experimental results show that the proposed approach can efficiently produce higher-quality document clusters than several well-known document clustering algorithms.
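The abstract does not specify the edge weights or the partitioning procedure, so the following is only a minimal sketch of the general idea under stated assumptions: edge weights are keyword co-occurrence counts, and the partitioning step is replaced by a plain stand-in (drop weak edges, then take connected components). The names `keyword_clusters`, `assign_documents`, and `min_weight` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import combinations


def keyword_clusters(docs, min_weight=2):
    """Group keywords via a thresholded co-occurrence graph (illustrative stand-in)."""
    # Assumed edge weight: number of documents in which both keywords appear.
    weight = defaultdict(int)
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            weight[(a, b)] += 1

    # Keep only strong edges of the weighted undirected graph.
    adj = defaultdict(set)
    for (a, b), w in weight.items():
        if w >= min_weight:
            adj[a].add(b)
            adj[b].add(a)

    # Connected components of the remaining graph serve as keyword clusters.
    clusters, seen = [], set()
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(adj[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def assign_documents(docs, clusters):
    """Label each document with the keyword cluster it overlaps most (None if no overlap)."""
    labels = []
    for doc in docs:
        overlaps = [len(set(doc) & cluster) for cluster in clusters]
        if not overlaps or max(overlaps) == 0:
            labels.append(None)
        else:
            labels.append(overlaps.index(max(overlaps)))
    return labels


# Toy collection: each document is already reduced to a list of keywords.
docs = [
    ["graph", "partition", "edge"],
    ["graph", "partition", "weight"],
    ["speech", "audio", "signal"],
    ["speech", "audio", "noise"],
]
clusters = keyword_clusters(docs)
labels = assign_documents(docs, clusters)
```

In this toy run, documents sharing the strongly co-occurring pair "graph"/"partition" receive one label and those sharing "speech"/"audio" receive another. A real implementation would need a proper graph-partitioning criterion and a weighting scheme, neither of which can be inferred from the abstract.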
Pages: 1198-1203
Page count: 6