Adaptive encoding-based evolutionary approach for Chinese document clustering

Authors
Jun-Xian Chen
Yue-Jiao Gong
Wei-Neng Chen
Xiaolin Xiao
Affiliations
[1] South China University of Technology, School of Computer Science and Engineering
[2] South China Normal University, School of Computer Science
Source
Complex & Intelligent Systems, 2023, 9(3)
Keywords
Adaptive encoding; Document clustering; Evolutionary approach; Single step of K-means
DOI
Not available
Abstract
Document clustering has long been an important research direction in intelligent systems. Applying it to Chinese documents poses new challenges, because Chinese text cannot be split into words directly on whitespace characters. Moreover, many Chinese document clustering algorithms require prior knowledge of the number of clusters, which is rarely available in real-world applications. To address these problems, we propose a general Chinese document clustering framework in which the main clustering task is carried out by an adaptive encoding-based evolutionary approach. Specifically, an adaptive encoding scheme is proposed to learn the number of clusters automatically, and novel crossover and mutation operators are designed to fit this scheme. In addition, a single step of K-means is incorporated to conduct a joint global and local search, enhancing the overall exploitation ability. Experiments on benchmark datasets demonstrate the superiority of the proposed method in both efficiency and clustering precision.
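The "single step of K-means" mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `kmeans_single_step`, the representation of an individual as a variable-length list of centroids, and the choice to drop empty clusters are all assumptions made here for illustration. The sketch performs one assignment pass and one centroid update, which is the kind of local refinement that could be applied to a candidate solution produced by the evolutionary search.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def kmeans_single_step(points, centroids):
    """Run one assignment step and one update step of K-means.

    points and centroids are lists of equal-length tuples of floats.
    Returns (new_centroids, labels). Centroids that attract no points
    are dropped, so the cluster count may shrink. Dropping them is an
    assumption of this sketch, not something the abstract specifies.
    """
    # Assignment: each point goes to its nearest centroid.
    labels = [
        min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
        for p in points
    ]

    # Update: each non-empty cluster's centroid becomes the mean of its members.
    new_centroids = []
    remap = {}  # old cluster index -> new cluster index
    for k in range(len(centroids)):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            continue
        remap[k] = len(new_centroids)
        new_centroids.append(
            tuple(sum(coord) / len(members) for coord in zip(*members))
        )
    return new_centroids, [remap[lab] for lab in labels]


# A hypothetical individual: a variable-length list of candidate centroids.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
individual = [(1.0, 1.0), (9.0, 1.0), (100.0, 100.0)]
refined, labels = kmeans_single_step(points, individual)
```

In this toy run the third centroid attracts no points and is discarded, so the refined individual encodes two clusters. In a real pipeline the points would be document feature vectors obtained after Chinese word segmentation, a step this sketch does not cover.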
Pages: 3385-3398
Number of pages: 13
Citation
Chen, Jun-Xian; Gong, Yue-Jiao; Chen, Wei-Neng; Xiao, Xiaolin. Adaptive encoding-based evolutionary approach for Chinese document clustering. Complex & Intelligent Systems, 2023, 9(3): 3385-3398.