Document clustering based on semantic smoothing approach

Cited by: 1
Authors
Liu, Yubao [1, 2]
Cai, Jiarong [1 ]
Yin, Jian [1 ]
Huang, Zhilan [1 ]
Affiliations
[1] Sun Yat Sen Univ, Dept Comp Sci, Guangzhou 510275, Peoples R China
[2] Sun Yat Sen Univ, Lab Informat, Guangzhou 510275, Peoples R China
Source
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1007/978-3-540-72575-6_35
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering of text documents is an important data mining problem with a wide range of applications. However, many clustering approaches fail to yield high clustering quality because of complex document semantics. Recently, semantic smoothing, which has been widely studied in the field of Information Retrieval, has been proposed as an efficient solution. However, the existing semantic smoothing methods are not effective for partitional clustering. In this paper, based on the principle of the TF*IDF scheme, we propose an improved semantic smoothing method that is suitable for both agglomerative and partitional clustering. The experimental results show that our method is more effective than the previous methods in terms of cluster quality.
Pages: 217 / +
Page count: 2
Related papers
50 in total
  • [1] Semantic smoothing for model-based document clustering
    Zhang, Xiaodan
    Zhou, Xiaohua
    Hu, Xiaohua
    [J]. ICDM 2006: SIXTH INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2006, : 1193 - +
  • [2] Semantic Smoothing of Document Models for Agglomerative Clustering
    Zhou, Xiaohua
    Zhang, Xiaodan
    Hu, Xiaohua
    [J]. 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 2922 - 2927
  • [3] An improved semantic smoothing model for model-based document clustering
    Cai, Jiarong
    Liu, Yubao
    Yin, Jian
    [J]. SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 3, PROCEEDINGS, 2007, : 670 - +
  • [4] WordNet and Semantic Similarity based Approach for Document Clustering
    Desai, Sneha S.
    Laxminarayana, J. A.
    [J]. 2016 INTERNATIONAL CONFERENCE ON COMPUTATION SYSTEM AND INFORMATION TECHNOLOGY FOR SUSTAINABLE SOLUTIONS (CSITSS), 2016, : 312 - 317
  • [5] A Survey of Document Clustering using Semantic Approach
    Saiyad, Nagma Y.
    Prajapati, Harshadkumar B.
    Dabhi, Vipul K.
    [J]. 2016 INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS, AND OPTIMIZATION TECHNIQUES (ICEEOT), 2016, : 2555 - 2562
  • [6] A Latent Semantic Indexing-based approach to multilingual document clustering
    Wei, Chih-Ping
    Yang, Christopher C.
    Lin, Chia-Min
    [J]. DECISION SUPPORT SYSTEMS, 2008, 45 (03) : 606 - 620
  • [7] A Statistics-Based Semantic Relation Analysis Approach for Document Clustering
    Cheng, Xin
    Miao, Duoqian
    Wang, Lei
    [J]. ROUGH SETS AND KNOWLEDGE TECHNOLOGY, RSKT 2014, 2014, 8818 : 332 - 342
  • [8] An Efficient Document Clustering Approach for Devising Semantic Clusters
    Jasila, E. K.
    Saleena, N.
    Abdul Nazeer, K. A.
    [J]. CYBERNETICS AND SYSTEMS, 2023,
  • [9] Semantic smoothing for text clustering
    Nasir, Jamal A.
    Varlamis, Iraklis
    Karim, Asim
    Tsatsaronis, George
    [J]. KNOWLEDGE-BASED SYSTEMS, 2013, 54 : 216 - 229
  • [10] Document clustering by semantic smoothing and dynamic growing cell structure (DynGCS) for biomedical literature
    Song, Min
    Hu, Xiaohua
    Yoo, Illhoi
    Koppel, Eric
    [J]. DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2008, 5182 : 217 - +