Arabic Text Stemming: Comparative Analysis.

Authors
Mamoun, Rasha [1]
Ahmed, Mahmoud [1]
Affiliations
[1] Univ Khartoum, Fac Math Sci, Khartoum, Sudan
Keywords
Arabic text; stemming; stemming algorithms; Light Stemming; Khoja stemming; data preprocess; Extended Stop words File (ESWF)
DOI
Not available
CLC classification number
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification code
0808; 0809
Abstract
Text classification is one of the most important research issues in the field of data mining. The main idea of using a stemming technique is to reduce the number of features that can be extracted from a document. Furthermore, stemming aims to enhance the accuracy of the classifier. This paper studies the effectiveness of stemming techniques, using two popular word-extraction approaches: the Khoja and Light stemmers. Their results are compared with the results of classification without any word-extraction technique. In the experiment, Sequential Minimal Optimization (SMO), Naive Bayesian (NB), J48, and K-nearest neighbors (KNN) were used to build the training models and test the data. The two word-extraction approaches were implemented and their accuracy was measured by precision, recall, and F-measure; the results show that the Light stemmer outperforms the Khoja stemmer. The results were also compared with those of classification without stemming.
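As a rough illustration of the two ideas in the abstract, the Python sketch below shows how affix stripping can shrink the feature set, and how precision, recall, and F-measure are computed from classification counts. It is not the Khoja or Light stemmer evaluated in the paper: the affix lists, the minimum-length rule, and the names `light_stem` and `precision_recall_f1` are assumptions made for this example only.

```python
# Toy light stemmer: a simplified illustration, NOT the stemmers used in the paper.
# The affix lists below are assumed for this sketch (common Arabic prefixes/suffixes).
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "لل", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]
MIN_STEM_LEN = 3  # assumed guard so short words are not over-stripped


def light_stem(word: str) -> str:
    """Strip at most one listed prefix and one listed suffix from a word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall, and F-measure from true/false positive counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Surface forms of two words ("book", "teachers") with different affixes.
    tokens = ["الكتاب", "والكتاب", "كتاب", "بالكتاب", "المعلمون", "معلمين"]
    raw_features = set(tokens)
    stemmed_features = {light_stem(t) for t in tokens}
    print("features without stemming:", len(raw_features))
    print("features with toy light stemming:", len(stemmed_features))

    # Hypothetical counts for one class, only to show the metric arithmetic.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

In a setup like the one the abstract describes, the stemmed tokens would feed a classifier such as SMO, NB, J48, or KNN, and the metrics would be computed per class on held-out data; those steps are omitted here.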
Pages: 88-93
Page count: 6