An Evolutionary Algorithm-Based Text Categorization Technique

Cited by: 1
Authors
Das, Ajit Kumar [1 ]
Das, Asit Kumar [1 ]
Sarkar, Apurba [1 ]
Affiliation
[1] Indian Inst Engn Sci & Technol, Dept Comp Sci & Technol, Howrah 711103, W Bengal, India
Keywords
Text mining; Feature selection; Text clustering; Cluster validation; Multi-objective evolutionary algorithm;
DOI
10.1007/978-981-10-8055-5_75
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Most organizations generate unstructured data, from which extracting meaningful information is difficult. Preprocessing unstructured data before mining helps improve the efficiency of the mining algorithms. In this paper, text data is first preprocessed using tokenization, stop word removal, and stemming, and a bag-of-words is identified to characterize the text dataset. Next, a genetic algorithm based on the improved strength Pareto evolutionary algorithm is applied to determine a more compact set of informative words for efficient clustering of the text documents. It is a bi-objective genetic algorithm that approximates the Pareto-optimal front by exploring the search space for optimal solutions. An external clustering index and the number of words used to describe the documents are the two objective functions of the algorithm. Chromosomes in the population are evaluated on these functions, and the best chromosome in the non-dominated Pareto front of the final population gives the optimal set of words sufficient for categorization of the text dataset.
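The non-dominated front mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each chromosome has already been scored as a pair (external clustering index, number of selected words), with the index maximized and the word count minimized. The names `dominates` and `non_dominated_front` and the sample scores are hypothetical. The abstract does not say how the single best chromosome is picked from the front, so the sketch stops at extracting the front.

```python
from typing import List, Tuple

# Hypothetical score pair for one chromosome:
# (external clustering index, number of selected words).
# The index is maximized, the word count is minimized.
Objectives = Tuple[float, int]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated_front(population: List[Objectives]) -> List[Objectives]:
    """Keep every candidate that no other candidate dominates."""
    return [
        p for p in population
        if not any(dominates(q, p) for q in population)
    ]


# Made-up scores for five chromosomes, for illustration only.
population = [(0.82, 120), (0.80, 90), (0.75, 150), (0.82, 140), (0.60, 40)]
front = non_dominated_front(population)
print(front)  # [(0.82, 120), (0.80, 90), (0.6, 40)]
```

In this example (0.75, 150) and (0.82, 140) are both dominated by (0.82, 120), which reaches at least the same index with fewer words. The three remaining candidates trade clustering quality against vocabulary size, so none of them dominates another.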
Pages: 851-861
Page count: 11