Adaptive encoding-based evolutionary approach for Chinese document clustering

Authors
Jun-Xian Chen
Yue-Jiao Gong
Wei-Neng Chen
Xiaolin Xiao
Affiliations
[1] South China University of Technology, School of Computer Science and Engineering
[2] South China Normal University, School of Computer Science
Keywords
Adaptive encoding; Document clustering; Evolutionary approach; Single step of K-means
DOI
Not available
Abstract
Document clustering has long been an important research direction in intelligent systems. Applying it to Chinese documents poses new challenges, because Chinese text cannot be split into words directly using the whitespace character. Moreover, many Chinese document clustering algorithms require prior knowledge of the number of clusters, which is rarely available in real-world applications. To address these problems, we propose a general Chinese document clustering framework in which the main clustering task is carried out by an adaptive encoding-based evolutionary approach. Specifically, an adaptive encoding scheme is proposed to learn the number of clusters automatically, and novel crossover and mutation operators are designed to fit this scheme. In addition, a single step of K-means is incorporated to conduct a joint global and local search, enhancing the overall exploitation ability. Experiments on benchmark datasets demonstrate the superiority of the proposed method in both efficiency and clustering precision.
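The abstract does not spell out the encoding or the operators, so the following is only a minimal sketch of what "a single step of K-means" applied to a variable-length set of centroids could look like. The function name `single_kmeans_step`, the list-of-centroids representation, and the choice to drop empty clusters (so that the cluster count can shrink) are illustrative assumptions, not the authors' method.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def single_kmeans_step(
    points: List[Vector], centroids: List[Vector]
) -> Tuple[List[List[float]], List[int]]:
    """Run one assignment pass and one update pass of K-means.

    Assumption (not from the abstract): a candidate solution is a
    variable-length list of centroids, and a centroid that attracts
    no points is dropped, so the number of clusters may decrease.
    """
    # Assignment: index of the nearest centroid for every point.
    labels = [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]

    new_centroids: List[List[float]] = []
    remap = {}  # old cluster index -> index after empty clusters are removed
    for k in range(len(centroids)):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            continue  # hypothetical rule: discard empty clusters
        remap[k] = len(new_centroids)
        dim = len(members[0])
        # Update: the new centroid is the mean of its assigned points.
        new_centroids.append(
            [sum(m[d] for m in members) / len(members) for d in range(dim)]
        )

    return new_centroids, [remap[lab] for lab in labels]


if __name__ == "__main__":
    docs = [[0, 0], [0, 2], [10, 10], [10, 12]]   # toy document vectors
    guess = [[1, 1], [9, 9], [100, 100]]          # three candidate centroids
    print(single_kmeans_step(docs, guess))
```

In an evolutionary loop, a step like this would typically be applied to each individual after crossover and mutation, so that the global search proposes cluster sets and the local step refines them; how the paper actually interleaves the two is not stated in the abstract.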
Pages: 3385-3398
Number of pages: 13