An Evolutionary Algorithm-Based Text Categorization Technique

Cited by: 1
Authors
Das, Ajit Kumar [1]
Das, Asit Kumar [1]
Sarkar, Apurba [1]
Affiliations
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Howrah 711103, W Bengal, India
Keywords
Text mining; Feature selection; Text clustering; Cluster validation; Multi-objective evolutionary algorithm;
DOI
10.1007/978-981-10-8055-5_75
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most organizations generate unstructured data, from which extracting meaningful information is difficult. Preprocessing unstructured data before mining improves the efficiency of mining algorithms. In this paper, text data is first preprocessed using tokenization, stop-word removal, and stemming, and a bag-of-words is identified to characterize the text dataset. Next, a genetic algorithm based on the improved strength Pareto evolutionary algorithm is applied to determine a more compact set of informative words for efficient clustering of the text documents. It is a bi-objective genetic algorithm that approximates the Pareto-optimal front by exploring the search space for optimal solutions. An external clustering index and the number of words used to describe the documents are the two objective functions of the algorithm. Chromosomes in the population are evaluated on these functions, and the best chromosome in the non-dominated Pareto front of the final population gives the optimal set of words sufficient for categorization of the text dataset.
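The non-dominated front mentioned in the abstract can be illustrated with a minimal sketch. It assumes each chromosome has already been scored as a pair (external clustering index, number of selected words), with the index maximized and the word count minimized. The candidate values and the names `dominates` and `pareto_front` are hypothetical. The sketch shows only plain dominance filtering and does not reproduce the strength-based fitness assignment of the algorithm named in the abstract.

```python
from typing import List, Tuple

# Hypothetical scored chromosome: (external_index, n_words).
# Assumption: higher external_index is better, fewer words is better.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up objective values, for illustration only.
candidates = [(0.80, 120), (0.78, 60), (0.70, 90), (0.80, 150), (0.60, 40)]
front = pareto_front(candidates)
print(front)
```

Here (0.70, 90) is dropped because (0.78, 60) scores higher with fewer words, and (0.80, 150) is dropped because (0.80, 120) matches its index with fewer words. Choosing a single "best" chromosome from the remaining front requires an additional rule that the abstract does not specify.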
Pages: 851 - 861
Page count: 11
Related papers
50 in total
  • [31] The Research of kNN Text Categorization Algorithm Based On Eager Learning
    Dong, Tao
    Cheng, Weinan
    Shang, Wenqian
    [J]. 2012 INTERNATIONAL CONFERENCE ON INDUSTRIAL CONTROL AND ELECTRONICS ENGINEERING (ICICEE), 2012, : 1120 - 1123
  • [32] Evolutionary Algorithm-based Feature Selection for an Intrusion Detection System
    Singh, Devendra Kumar
    Shrivastava, Manish
    [J]. ENGINEERING TECHNOLOGY & APPLIED SCIENCE RESEARCH, 2021, 11 (03) : 7130 - 7134
  • [33] Evolutionary Algorithm-Based Background Generation for Robust Object Detection
    Kim, Taekyung
    Lee, Seongwon
    Paik, Joonki
    [J]. INTELLIGENT COMPUTING, PART I: INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING, ICIC 2006, PART I, 2006, 4113 : 542 - 552
  • [34] Evolutionary Algorithm-based Space Diversity for Imperfect Channel Estimation
    Ghadiri, Zienab Pouladmast
    El-Saleh, Ayman A.
    Vetharatnam, Gobi
    [J]. KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2014, 8 (05) : 1588 - 1603
  • [35] Curbing Pandemic Through Evolutionary Algorithm-Based Priority Aware
    Zahran, Sherif R.
    Moscato, Stefano
    Fonte, Alessandro
    Oldoni, Matteo
    Traversa, Antonio A.
    Tresoldi, Dario
    Ferrari, Philippe
    Amendola, Giandomenico
    Boccia, Luigi
    [J]. IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES, 2023, 71 (06) : 2582 - 2593
  • [36] Evolutionary Algorithm-Based Error Parameterization Methods for Data Assimilation
    Bai, Yulong
    Li, Xin
    [J]. MONTHLY WEATHER REVIEW, 2011, 139 (08) : 2668 - 2685
  • [37] Quantum-inspired evolutionary algorithm-based face verification
    Jang, JS
    Han, KH
    Kim, JH
    [J]. GENETIC AND EVOLUTIONARY COMPUTATION - GECCO 2003, PT II, PROCEEDINGS, 2003, 2724 : 2147 - 2156
  • [38] Evolutionary Algorithm-based Parameter Identification for Nonlinear Dynamical Systems
    Banerjee, Amit
    Abu-Mahfouz, Issam
    [J]. 2011 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2011, : 1 - 5
  • [39] An Evolutionary Algorithm-Based PWM Strategy for a Hybrid Power Converter
    Rodriguez, Alma
    Alejo-Reyes, Avelina
    Cuevas, Erik
    Beltran-Carbajal, Francisco
    Rosas-Caro, Julio C.
    [J]. MATHEMATICS, 2020, 8 (08)
  • [40] Epigenetic Algorithm-Based Detection Technique for Network Attacks
    Ezzarii, Mehdi
    El Ghazi, Hamid
    El Ghazi, Hassan
    El Bouanani, Faissal
    [J]. IEEE ACCESS, 2020, 8 : 199482 - 199491