Adaptive encoding-based evolutionary approach for Chinese document clustering

Cited by: 0
Authors
Jun-Xian Chen
Yue-Jiao Gong
Wei-Neng Chen
Xiaolin Xiao
Affiliations
[1] South China University of Technology,School of Computer Science and Engineering
[2] South China Normal University,School of Computer Science
Keywords
Adaptive encoding; Document clustering; Evolutionary approach; Single step of K-means
DOI
Not available
Abstract
Document clustering has long been an important research direction in intelligent systems. Chinese documents pose additional challenges, since they cannot be split into words directly using the whitespace character. Moreover, many Chinese document clustering algorithms require prior knowledge of the number of clusters, which is rarely available in real-world applications. To address these problems, we propose a general Chinese document clustering framework in which the main clustering task is performed by an adaptive encoding-based evolutionary approach. Specifically, an adaptive encoding scheme is proposed to learn the number of clusters automatically, and novel crossover and mutation operators are designed to fit this scheme. In addition, a single step of K-means is incorporated to conduct a joint global and local search, enhancing the overall exploitation ability. Experiments on benchmark datasets demonstrate the superiority of the proposed method in both efficiency and clustering precision.
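The abstract does not spell out the encoding or the operators, so the following is only a minimal sketch of how a single K-means step might refine one individual, assuming the individual is a variable-length list of centroids. The function name `kmeans_single_step`, the toy vectors, and the rule of dropping empty clusters are illustrative assumptions, not the authors' method.

```python
import math

def kmeans_single_step(points, centroids):
    """One assignment + update pass of K-means over a variable-length centroid list.

    Assumption for illustration: centroids that attract no points are dropped,
    so the number of clusters encoded by the individual can shrink.
    """
    # Assignment: each point goes to its nearest centroid (Euclidean distance).
    members = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
        members[nearest].append(p)

    # Update: each non-empty cluster's centroid becomes the mean of its members.
    new_centroids = []
    for group in members:
        if group:
            dim = len(group[0])
            new_centroids.append(
                tuple(sum(p[d] for p in group) / len(group) for d in range(dim))
            )
    return new_centroids

# Toy 2-D stand-ins for document vectors, forming two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
# Hypothetical individual that encodes three candidate centroids.
individual = [(0.0, 0.5), (10.0, 10.5), (100.0, 100.0)]

refined = kmeans_single_step(points, individual)
print(refined)  # [(0.0, 0.5), (10.0, 10.5)]
```

In this sketch the third centroid attracts no points and is discarded, which illustrates one way a variable-length encoding could let the cluster number change during the search. In an evolutionary loop, such a step would presumably be applied to offspring after crossover and mutation.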
Pages: 3385-3398
Number of pages: 13