Towards the automatic generation of Arabic Lexical Recognition Tests using orthographic and phonological similarity maps

Cited by: 0
Authors
Salah, Saeed [1]
Nassar, Mohammad [1]
Zaghal, Raid [1]
Hamed, Osama [2]
Affiliations
[1] Al Quds Univ, Dept Comp Sci, IL-20002 Jerusalem, Israel
[2] Palestine Tech Univ, Comp Syst Engn Dept, Tulkarm, Palestine, Israel
Keywords
NLP; LRT; N-gram; Dialects; MSA; Orthographic; Phonological; ENGLISH; CORPUS
DOI
10.1016/j.jksuci.2021.02.006
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Lexical Recognition Tests (LRTs) are widely used to measure language proficiency in some common languages such as English, German and Spanish. However, similar research for Arabic is still at an early stage of development, and existing proposals mainly rely on human-crafted methods. In this paper, a new methodology, based on a newly developed algorithm, is proposed with the aim of automatically constructing high-quality nonwords for a quick measurement of Arabic proficiency levels (Arabic LRT). The suggested algorithm automatically generates nonwords based on special characteristics of Arabic, namely orthography (spelling), phonology (pronunciation), n-grams and the word frequency map, which is an important factor in creating a multi-level test. The proposed algorithm was evaluated with the help of a large dataset of Arabic vocabulary. For this purpose, a Web-based application following the suggested methodology was designed and implemented to facilitate the process of collecting and analyzing learners' responses. The experimental results show that the LRT questions automatically generated by the proposed system confused the learners. This is clear from the confusion matrix, which showed that one third (1/3) of the generated nonwords were able to distract the learners (with an accuracy of 65%). Consequently, recall and precision have smaller values, 0.52 and 0.48, respectively.
© 2021 The Authors. Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
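The abstract describes nonword generation driven by orthographic similarity and n-grams but does not give the algorithm itself. The following is only a minimal sketch of that general idea, not the authors' method: it swaps one letter of a real word for a visually similar letter and keeps the candidate if it is not a known word and every character bigram in it is attested in the lexicon. The similarity map `SIMILAR`, the toy `LEXICON`, and the function names are all assumptions made for illustration.

```python
from typing import Dict, List, Set

# Hypothetical orthographic similarity map (illustrative only, not the
# paper's map): each letter maps to letters with a similar written shape.
SIMILAR: Dict[str, List[str]] = {
    "ب": ["ت", "ث", "ن"],
    "ت": ["ب", "ث"],
    "ح": ["ج", "خ"],
    "د": ["ذ"],
    "ر": ["ز"],
    "س": ["ش"],
}

# Toy lexicon standing in for a large Arabic vocabulary (assumption).
LEXICON: Set[str] = {"كتب", "كتاب", "بيت", "ثبت", "كثير"}


def char_bigrams(word: str) -> Set[str]:
    """Return the set of adjacent character pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def generate_nonwords(word: str, lexicon: Set[str]) -> List[str]:
    """Propose nonwords that differ from `word` by one similar-looking letter.

    A candidate is kept only if it is not in the lexicon and all of its
    character bigrams occur somewhere in the lexicon, so it still looks
    plausible as a word.
    """
    attested: Set[str] = set()
    for entry in lexicon:
        attested |= char_bigrams(entry)

    nonwords: List[str] = []
    for i, ch in enumerate(word):
        for sub in SIMILAR.get(ch, []):
            candidate = word[:i] + sub + word[i + 1:]
            if candidate in lexicon or candidate in nonwords:
                continue
            if char_bigrams(candidate) <= attested:
                nonwords.append(candidate)
    return nonwords


print(generate_nonwords("كتب", LEXICON))
```

A fuller system along the lines the abstract mentions would also need a phonological similarity map and word frequency information to grade difficulty; neither is modeled in this sketch.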
Pages: 8429-8439
Number of pages: 11
Related papers
50 in total
  • [1] Effect of orthographic and phonological similarity on false recognition of drug names
    Lambert, BL
    Chang, KY
    Lin, SJ
    SOCIAL SCIENCE & MEDICINE, 2001, 52 (12) : 1843 - 1857
  • [2] Lexical and Phonetic Modeling for Arabic Automatic Speech Recognition
    Nguyen, Long
    Ng, Tim
    Nguyen, Kham
    Zbib, Rabih
    Makhoul, John
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 708 - +
  • [3] The Role of Diacritics in Designing Lexical Recognition Tests for Arabic
    Hamed, Osama
    Zesch, Torsten
    ARABIC COMPUTATIONAL LINGUISTICS (ACLING 2017), 2017, 117 : 119 - 128
  • [4] Cognate facilitation in bilingual reading: The influence of orthographic and phonological similarity on lexical decisions and eye-movements
    Tiffin-Richards, Simon P.
    BILINGUALISM-LANGUAGE AND COGNITION, 2024
  • [5] Automatic semantic maps generation from lexical annotations
    Rangel, Jose Carlos
    Cazorla, Miguel
    Garcia-Varea, Ismael
    Romero-Gonzalez, Cristina
    Martinez-Gomez, Jesus
    AUTONOMOUS ROBOTS, 2019, 43 (03) : 697 - 712
  • [6] Automatic semantic maps generation from lexical annotations
    José Carlos Rangel
    Miguel Cazorla
    Ismael García-Varea
    Cristina Romero-González
    Jesus Martínez-Gómez
    Autonomous Robots, 2019, 43 : 697 - 712
  • [7] PHOR-in-One: A multilingual lexical database with PHonological, ORthographic and PHonographic word similarity estimates in four languages
    Ana Santos Costa
    Montserrat Comesaña
    Ana Paula Soares
    Behavior Research Methods, 2023, 55 : 3699 - 3725
  • [8] PHOR-in-One: A multilingual lexical database with PHonological, ORthographic and PHonographic word similarity estimates in four languages
    Costa, Ana Santos
    Comesana, Montserrat
    Soares, Ana Paula
    BEHAVIOR RESEARCH METHODS, 2023, 55 (07) : 3699 - 3725
  • [9] Using lexical similarity in handwritten word recognition
    Park, J
    Govindaraju, V
    IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, PROCEEDINGS, VOL II, 2000, : 290 - 295
  • [10] Cross-language effects of phonological and orthographic similarity in cognate word recognition The role of language dominance
    Carrasco-Ortiz, Haydee
    Amengual, Mark
    Gries, Stefan Th
    LINGUISTIC APPROACHES TO BILINGUALISM, 2021, 11 (03) : 389 - 417