Adaptive encoding-based evolutionary approach for Chinese document clustering

Authors
Jun-Xian Chen
Yue-Jiao Gong
Wei-Neng Chen
Xiaolin Xiao
Affiliations
[1] South China University of Technology, School of Computer Science and Engineering
[2] South China Normal University, School of Computer Science
Keywords
Adaptive encoding; Document clustering; Evolutionary approach; Single step of K-means
Abstract
Document clustering has long been an important research direction in intelligent systems. Applying it to Chinese documents poses new challenges, because Chinese text cannot be split directly on whitespace characters. Moreover, many Chinese document clustering algorithms require prior knowledge of the number of clusters, which is rarely available in real-world applications. To address these problems, we propose a general Chinese document clustering framework in which the main clustering task is carried out by an adaptive encoding-based evolutionary approach. Specifically, an adaptive encoding scheme is proposed to learn the number of clusters automatically, and novel crossover and mutation operators are designed to fit this scheme. In addition, a single step of K-means is incorporated to conduct a joint global and local search, enhancing the overall exploitation ability. Experiments on benchmark datasets demonstrate the superiority of the proposed method in both efficiency and clustering precision.
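The idea of refining a variable-length individual with a single step of K-means can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoding, so the sketch assumes that an individual is simply a list of centroids whose length is the cluster number, and that clusters left empty after assignment are dropped. The names `single_kmeans_step`, `points`, and `individual` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def single_kmeans_step(points: List[Vector], centroids: List[Vector]) -> List[List[float]]:
    """Run one assignment-and-update pass of K-means on a centroid list.

    The centroid list plays the role of a variable-length individual
    (an assumption made for illustration, not the paper's encoding).
    """
    # Assignment: group each point under its nearest centroid.
    members: List[List[Vector]] = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: squared_distance(p, centroids[i]))
        members[nearest].append(p)

    # Update: replace each centroid with the mean of its members.
    # Assumption: empty clusters are dropped, so the cluster number can shrink.
    refined: List[List[float]] = []
    for group in members:
        if group:
            dim = len(group[0])
            refined.append([sum(p[d] for p in group) / len(group) for d in range(dim)])
    return refined


# Toy data: two obvious groups, and an individual that encodes three centroids.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
individual = [(1.0, 1.0), (9.0, 9.0), (100.0, 100.0)]
refined = single_kmeans_step(points, individual)
print(refined)
```

In this toy run the third centroid attracts no points and is discarded, so the refined individual encodes two clusters. In an evolutionary loop, a pass like this would be applied to offspring after crossover and mutation, which is one plausible reading of the joint global and local search described above.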
Pages: 3385-3398
Page count: 13