The Effect of Stemming on Arabic Text Classification: An Empirical Study

Cited by: 23
Authors
Wahbeh, Abdullah [1]
Al-Kabi, Mohammed [1]
Al-Radaideh, Qasem [1]
Al-Shawakfa, Emad [1]
Alsmadi, Izzat [1]
Affiliation
[1] Yarmouk Univ, Irbid, Jordan
Keywords
Arabic Text Classification; Decision Tree; Naive Bayes Classifier (NB); Natural Language Processing; Stemming; Support Vector Machine (SVM); Text Classification
DOI
10.4018/ijirr.2011070104
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The information world is rich in documents stored in different formats and applications, such as databases, digital libraries, and the Web. Text classification supports the search functionality offered by search engines and information retrieval systems, helping them cope with the large number of documents on the Web. Many text classification studies have been applied to English, Dutch, Chinese, and other languages, whereas fewer have been applied to Arabic. This paper addresses the automatic classification of Arabic text documents. It applies text classification to Arabic text documents using stemming as one of the preprocessing steps. The results show that, without stemming, the support vector machine (SVM) classifier achieved the highest classification accuracy under the two test modes, with 87.79% and 88.54%. Stemming, on the other hand, reduced accuracy: the SVM accuracy under the two test modes dropped to 84.49% and 86.35%.
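As a small worked check of the figures above, the following sketch computes how far the SVM accuracy fell once stemming was applied. The four accuracy values come from the abstract; the dictionary keys, the function name, and the labels "test mode 1" and "test mode 2" are illustrative assumptions, since the abstract does not name the two test modes.

```python
# SVM accuracies (%) as reported in the abstract, in the order given there.
# The abstract does not name the two test modes, so they are only numbered here.
svm_accuracy = {
    "without_stemming": [87.79, 88.54],
    "with_stemming": [84.49, 86.35],
}


def accuracy_drops(before, after):
    """Return the per-mode drop in percentage points, rounded to 2 decimals."""
    return [round(b - a, 2) for b, a in zip(before, after)]


drops = accuracy_drops(
    svm_accuracy["without_stemming"], svm_accuracy["with_stemming"]
)

for mode, drop in enumerate(drops, start=1):
    print(f"Test mode {mode}: {drop:.2f} percentage points lower with stemming")
```

Under these figures, stemming costs the SVM roughly 3.30 and 2.19 percentage points in the two test modes respectively.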
Pages: 54 - 70
Page count: 17
Related papers
50 in total
  • [1] The Effect of using Light Stemming for Arabic Text Classification
    Atwan, Jaffar
    Wedyan, Mohammad
    Bsoul, Qusay
    Hamadeen, Ahmad
    Alturki, Ryan
    Ikram, Mohammed
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (05) : 768 - 773
  • [2] The Use of Stemming in the Arabic Text and Its Impact on the Accuracy of Classification
    Atwan, Jaffar
    Wedyan, Mohammad
    Bsoul, Qusay
    Hammadeen, Ahmad
    Alturki, Ryan
    [J]. SCIENTIFIC PROGRAMMING, 2021, 2021
  • [3] Effect of Stemming on Hindi Text Classification
    Pimpalshende, Anjusha
    Singh, Preety
    Potnurwar, Archana
    [J]. INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2023, 14 (01) : 208 - 215
  • [4] Empirical evaluation and study of text stemming algorithms
    Jabbar, Abdul
    Iqbal, Sajid
    Tamimy, Manzoor Ilahi
    Hussain, Shafiq
    Akhunzada, Adnan
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2020, 53 (08) : 5559 - 5588
  • [5] Empirical evaluation and study of text stemming algorithms
    Abdul Jabbar
    Sajid Iqbal
    Manzoor Ilahi Tamimy
    Shafiq Hussain
    Adnan Akhunzada
    [J]. Artificial Intelligence Review, 2020, 53 : 5559 - 5588
  • [6] Effect of stemming on text similarity for Arabic language at sentence level
    Alhawarat, Mohammad O.
    Abdeljaber, Hikmat
    Hilal, Anwer
    [J]. PEERJ COMPUTER SCIENCE, 2021
  • [7] Effect of Stemming on Text Similarity for Arabic Language at Sentence Level
    Alhawarat, Mohammad O.
    Abdeljaber, Hikmat
    Hilal, Anwer
    [J]. PeerJ Computer Science, 2021, 7 : 1 - 18
  • [8] Impact of stemming on Arabic text summarization
    Alami, Nabil
    Meknassi, Mohammed
    Ouatik, Said Alaoui
    Ennahnahi, NourEddine
    [J]. 2016 4TH IEEE INTERNATIONAL COLLOQUIUM ON INFORMATION SCIENCE AND TECHNOLOGY (CIST), 2016 : 338 - 343
  • [9] A Study of the Effects of Stemming Strategies on Arabic Document Classification
    Alhaj, Yousif A.
    Xiang, Jianwen
    Zhao, Dongdong
    Al-Qaness, Mohammed A. A.
    Abd Elaziz, Mohamed
    Dahou, Abdelghani
    [J]. IEEE ACCESS, 2019, 7 : 32664 - 32671
  • [10] Arabic Text Stemming: Comparative Analysis.
    Mamoun, Rasha
    Ahmed, Mahmoud
    [J]. 2016 CONFERENCE OF BASIC SCIENCES AND ENGINEERING STUDIES (SCGAC), 2016 : 88 - 93