A Rule-Based Approach for Tagging Non-Vocalized Arabic Words

Citations: 0
Authors
Al-Taani, Ahmad [1 ]
Abu Al-Rub, Salah [1 ]
Affiliations
[1] Yarmouk Univ, Dept Comp Sci, Irbid, Jordan
Keywords
Part-of-speech tagging; lexical analyzer; morphological analyzer; Arabic language processing;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we present a tagging system that assigns part-of-speech tags to the words of a non-vocalized Arabic text. The proposed system analyzes the text at three levels. The first level is a lexical analyzer built on a lexicon containing all fixed words and particles, such as prepositions and pronouns. The second level is a morphological analyzer that relies on word structure, using patterns and affixes to determine the word class. The third level is a syntax analyzer, or grammatical tagger, which assigns grammatical tags to words based on their context or position in the sentence. The syntax analyzer consists of two stages: the first depends on specific keywords that indicate the tag of the following word; the second is a reversed parsing technique that scans the available grammars of the Arabic language to determine the class of a single ambiguous word in the sentence. We tested the proposed system on a corpus of 2355 words. Experimental results showed a success rate approaching 94% of the total number of words in the sample used in the study.
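The three-level flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the lexicon, the patterns, the keyword list, or the grammars, so the tiny `LEXICON`, the affix cues, the keyword sets, and the `GRAMMARS` tag sequences below are all hypothetical stand-ins, as are the function names and the coarse tag set (`NOUN`, `VERB`, `PARTICLE`, `PRONOUN`).

```python
from typing import List, Optional, Tuple

# Level 1 (hypothetical lexicon): fixed words and particles.
LEXICON = {
    "في": "PARTICLE", "من": "PARTICLE", "على": "PARTICLE", "إلى": "PARTICLE",
    "لم": "PARTICLE", "لن": "PARTICLE", "هو": "PRONOUN", "هي": "PRONOUN",
}

# Level 3, stage 1 (hypothetical keywords): words that inform the next word's tag.
NOUN_KEYWORDS = {"في", "من", "على", "إلى"}   # prepositions precede nouns
VERB_KEYWORDS = {"لم", "لن"}                 # negation particles precede verbs

# Level 3, stage 2 (hypothetical grammars): allowed tag sequences for a sentence.
GRAMMARS = [
    ("VERB", "NOUN", "PARTICLE", "NOUN"),
    ("NOUN", "NOUN"),
]


def lexical_tag(word: str) -> Optional[str]:
    """Level 1: look the word up in the fixed-word lexicon."""
    return LEXICON.get(word)


def morphological_tag(word: str) -> Optional[str]:
    """Level 2: guess the class from affixes (toy heuristics only)."""
    if len(word) > 3 and word.startswith("ال"):   # definite article
        return "NOUN"
    if len(word) > 3 and word.endswith("ات"):     # feminine plural ending
        return "NOUN"
    return None


def keyword_stage(words: List[str], tags: List[Optional[str]]) -> None:
    """Level 3, stage 1: tag an unknown word from the keyword before it."""
    for i in range(1, len(words)):
        if tags[i] is None:
            if words[i - 1] in NOUN_KEYWORDS:
                tags[i] = "NOUN"
            elif words[i - 1] in VERB_KEYWORDS:
                tags[i] = "VERB"


def grammar_stage(tags: List[Optional[str]]) -> None:
    """Level 3, stage 2: if exactly one word is still untagged, scan the
    grammars for sequences that agree on every other position."""
    unknown = [i for i, t in enumerate(tags) if t is None]
    if len(unknown) != 1:
        return
    pos = unknown[0]
    candidates = {
        g[pos]
        for g in GRAMMARS
        if len(g) == len(tags)
        and all(g[i] == t for i, t in enumerate(tags) if i != pos)
    }
    if len(candidates) == 1:   # assign only when the grammars agree
        tags[pos] = candidates.pop()


def tag_sentence(sentence: str) -> List[Tuple[str, Optional[str]]]:
    """Run the three levels in order; unresolved words keep the tag None."""
    words = sentence.split()
    tags = [lexical_tag(w) or morphological_tag(w) for w in words]
    keyword_stage(words, tags)
    grammar_stage(tags)
    return list(zip(words, tags))
```

In this sketch, each level only fills in words that earlier levels left untagged, so a sentence such as "ذهب الولد إلى المدرسة" is resolved by the lexicon and affix cues except for its first word, which the grammar scan then settles. A real system would need far richer resources at every level.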
Pages: 320-328
Page count: 9
Related papers
50 in total
  • [1] Tagging Icelandic text: A linguistic rule-based approach
    Loftsson, Hrafn
    [J]. NORDIC JOURNAL OF LINGUISTICS, 2008, 31 (01) : 47 - 72
  • [3] A Rule-based Approach for Arabic Temporal Expression Extraction
    Lhioui, Chahira
    Zouaghi, Anis
    Zrigui, Mounir
    [J]. 2017 INTERNATIONAL CONFERENCE ON ENGINEERING & MIS (ICEMIS), 2017,
  • [4] Moroccan Arabic vocabulary generation using a rule-based approach
    Tachicart, Ridouane
    Bouzoubaa, Karim
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (10) : 8538 - 8548
  • [5] A Rule-Based Approach to Identify Stop Words for Gujarati Language
    Rakholia, Rajnish M.
    Saini, Jatinderkumar R.
    [J]. PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON FRONTIERS IN INTELLIGENT COMPUTING: THEORY AND APPLICATIONS, FICTA 2016, VOL 1, 2017, 515 : 797 - 806
  • [6] Tagging medical texts: a rule-based experiment
    Ruch, P
    Bouillon, P
    Robert, G
    Baud, R
    Rassinoux, AM
    [J]. MEDICAL INFOBAHN FOR EUROPE, PROCEEDINGS, 2000, 77 : 448 - 455
  • [7] Rule Based Approach for Arabic Part of Speech Tagging and Name Entity Recognition
    Btoush, Mohammad Hjouj
    Alarabeyyat, Abdulsalam
    Olab, Isa
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2016, 7 (06) : 331 - 335
  • [8] A Rule-based Method for Arabic Question Classification
    Lahbari, Imane
    El Alaoui Ouatik, Said
    Alaoui Zidani, Khalid
    [J]. 2017 INTERNATIONAL CONFERENCE ON WIRELESS NETWORKS AND MOBILE COMMUNICATIONS (WINCOM), 2017, : 390 - 395
  • [9] A rule-based stemmer for Arabic Gulf dialect
    Abuata, Belal
    Al-Omari, Asma
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2015, 27 (02) : 104 - 112
  • [10] A Rule-Based Subject-Correlated Arabic Stemmer
    Mahmoud El-Defrawy
    Yasser El-Sonbaty
    Nahla A. Belal
    [J]. Arabian Journal for Science and Engineering, 2016, 41 : 2883 - 2891