Hybrid Feature Selection Based on Principal Component Analysis and Grey Wolf Optimizer Algorithm for Arabic News Article Classification

Cited by: 4
Authors
Alomari, Osama Ahmad [1]
Elnagar, Ashraf [2]
Afyouni, Imad [2]
Shahin, Ismail [3]
Nassif, Ali Bou [4]
Hashem, Ibrahim Abaker [2]
Tubishat, Mohammad [5]
Affiliations
[1] Univ Sharjah, MLALP Res Grp, Sharjah, U Arab Emirates
[2] Univ Sharjah, Dept Comp Sci, Sharjah, U Arab Emirates
[3] Univ Sharjah, Dept Elect Engn, Sharjah, U Arab Emirates
[4] Univ Sharjah, Dept Comp Engn, Sharjah, U Arab Emirates
[5] Zayed Univ, Coll Technol Innovat, Abu Dhabi, U Arab Emirates
Keywords
Arabic text classification; feature selection; grey wolf optimizer; principal component analysis; logistic regression; TEXT CATEGORIZATION; GENETIC ALGORITHM; IDENTIFICATION; MODEL; SYSTEM; WORD;
DOI
10.1109/ACCESS.2022.3222516
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The expansion and development of internet technologies has led to rapid growth in the number of electronic documents. Text document classification is a key task in natural language processing: it converts unstructured data into a structured form and then extracts knowledge from it. This conversion produces high-dimensional data that requires further analysis with data mining techniques such as feature extraction, feature selection, and classification to derive meaningful insights. Feature selection reduces dimensionality by pruning the feature space, which lowers the computational cost and can improve classification accuracy. This work presents a hybrid filter-wrapper method (PCA-GWO) in which Principal Component Analysis (PCA) serves as the filter that selects an initial subset of informative features, and Grey Wolf Optimizer (GWO) serves as the wrapper that selects further informative features from that subset. Logistic Regression (LR) is used as the evaluator that measures the classification accuracy of the candidate feature subsets produced by GWO. Three Arabic datasets, namely Alkhaleej, Akhbarona, and Arabiya, are used to assess the proposed method. The experimental results show that PCA-GWO outperforms the baseline classifiers, with and without feature selection, as well as other feature selection approaches in terms of classification accuracy.
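The wrapper stage described in the abstract can be illustrated with a minimal sketch of a binary grey wolf search over feature masks. This is not the authors' implementation: the abstract does not say how GWO positions are binarized, so the sigmoid transfer function used here is an assumption borrowed from common binary GWO variants. The PCA filter stage and the LR evaluator are omitted; `toy_fitness` and `INFORMATIVE` are hypothetical stand-ins for "classification accuracy of an LR model trained on the selected features", and `binary_gwo` and its parameters are names chosen for illustration.

```python
import math
import random


def binary_gwo(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Search for a 0/1 feature mask that maximizes fitness(mask)."""
    if n_wolves < 3:
        raise ValueError("need at least 3 wolves for alpha, beta and delta")
    rng = random.Random(seed)
    # Each wolf is a random binary mask: 1 keeps a feature, 0 drops it.
    wolves = [[rng.randint(0, 1) for _ in range(n_features)]
              for _ in range(n_wolves)]
    best_mask, best_score = None, -math.inf

    for t in range(n_iters):
        # Rank the pack; the three best wolves lead the search.
        ranked = sorted(wolves, key=fitness, reverse=True)
        top_score = fitness(ranked[0])
        if top_score > best_score:
            best_mask, best_score = list(ranked[0]), top_score
        leaders = ranked[:3]  # alpha, beta, delta
        a = 2.0 * (1.0 - t / n_iters)  # shrinks linearly from 2 toward 0

        new_wolves = []
        for wolf in wolves:
            candidate = []
            for j in range(n_features):
                # Average the three leader-guided continuous positions.
                x = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[j] - wolf[j])
                    x += leader[j] - A * D
                x /= 3.0
                # Assumed binarization: sigmoid transfer, then a coin flip.
                prob = 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5)))
                candidate.append(1 if rng.random() < prob else 0)
            new_wolves.append(candidate)
        wolves = new_wolves

    # The last generation has not been scored yet, so check it here.
    for wolf in wolves:
        score = fitness(wolf)
        if score > best_score:
            best_mask, best_score = list(wolf), score
    return best_mask, best_score


# Hypothetical stand-in for the evaluator: features 0, 3 and 5 are treated
# as informative, and every selected feature costs a small penalty.
INFORMATIVE = {0, 3, 5}


def toy_fitness(mask):
    hits = sum(mask[j] for j in INFORMATIVE)
    return hits - 0.1 * sum(mask)


mask, score = binary_gwo(toy_fitness, n_features=8)
print(mask, score)
```

In a setting closer to the one the abstract describes, the mask would index the features retained by the PCA filter, and the fitness function would train and score a logistic regression classifier on the masked features.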
Pages: 121816-121830
Number of pages: 15