Arabic light-based stemming: a comparative study among Light10 stemmer, P-stemmer, and Conditional light stemmer

Authors
Hussien, Sabria Mohammed [1]
Aburagheef, Hazim J. [1]
Affiliations
[1] Univ Babylon, Dept Software, IT Coll, Karbala, Iraq
Keywords
Arabic language; natural language processing; text classification; Arabic stemming
DOI
10.1109/IT-ELA52201.2021.9773743
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Arabic stemming is a key preprocessing stage in natural language processing (NLP). It strips affixes from words, which improves text classification (TC) as well as information retrieval (IR). There are two types of stemming: light-based and root-based. When compared to root-based stemming, light-based stemming consumes more energy. Light-based stemming removes only prefixes and suffixes from words. Three well-known light stemming methods are the Light10 stemmer, the P-stemmer, and conditional light stemming (CondLight). The Light10 stemmer removes prefixes and suffixes under a few conditions. The P-stemmer removes only prefixes, while the CondLight stemmer is the same as the Light10 stemmer but with eight conditions. We evaluated the three stemmers to measure how much each improves Arabic TC. Three classifiers were used, Support Vector Machine (SVM), k-nearest neighbors (KNN), and Naive Bayes (NB), along with a statistical similarity measurement. The results indicate that stemming yields a small improvement of about 2 percent.
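The difference between prefix-only stemming and prefix-plus-suffix stemming described in the abstract can be illustrated with a minimal Python sketch. The affix lists, the `MIN_STEM_LEN` guard, and the function names below are assumptions made for illustration only; they are not the rule sets of Light10, P-stemmer, or CondLight as evaluated in the paper.

```python
# Illustrative affix lists (assumptions for this sketch, not taken from the paper).
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")
SUFFIXES = ("ها", "ان", "ات", "ون", "ين", "يه", "ية", "ه", "ة", "ي")

# Hypothetical guard: never strip an affix if fewer letters than this would remain.
MIN_STEM_LEN = 2


def strip_prefix(word):
    """Remove the longest matching prefix, if enough letters remain."""
    for prefix in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            return word[len(prefix):]
    return word


def strip_suffixes(word):
    """Repeatedly remove matching suffixes while enough letters remain."""
    changed = True
    while changed:
        changed = False
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
                word = word[:-len(suffix)]
                changed = True
                break
    return word


def light_stem(word, remove_suffixes=True):
    """Strip a prefix and, optionally, suffixes.

    remove_suffixes=False mimics a prefix-only strategy;
    remove_suffixes=True mimics a prefix-plus-suffix strategy.
    """
    stem = strip_prefix(word)
    if remove_suffixes:
        stem = strip_suffixes(stem)
    return stem


if __name__ == "__main__":
    for token in ("والكتاب", "المعلمون"):
        print(token, light_stem(token, remove_suffixes=False), light_stem(token))
```

A conditional stemmer in the style of CondLight would add further checks before each removal; those conditions are not listed in the abstract, so they are not modeled here.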
Pages: 131-135
Page count: 5