Towards the automatic generation of Arabic Lexical Recognition Tests using orthographic and phonological similarity maps

Cited by: 0
Authors
Salah, Saeed [1 ]
Nassar, Mohammad [1 ]
Zaghal, Raid [1 ]
Hamed, Osama [2 ]
Affiliations
[1] Al Quds Univ, Dept Comp Sci, IL-20002 Jerusalem, Israel
[2] Palestine Tech Univ, Comp Syst Engn Dept, Tulkarm, Palestine, Israel
Keywords
NLP; LRT; N-gram; Dialects; MSA; Orthographic; Phonological; ENGLISH; CORPUS;
DOI
10.1016/j.jksuci.2021.02.006
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Lexical Recognition Tests (LRTs) are among the main methods widely used to measure language proficiency in common languages such as English, German and Spanish. However, similar research for Arabic is still at the development stage, and existing proposals mainly use human-crafted methods. In this paper, a new methodology, based on a newly developed algorithm, is proposed with the aim of automatically constructing high-quality nonwords for a quick measurement of Arabic proficiency levels (Arabic LRT). The suggested algorithm automatically generates nonwords based on special characteristics of Arabic, namely orthography (spelling), phonology (pronunciation), n-grams and the word frequency map, which is an important factor in creating a multi-level test. The proposed algorithm was tested with the help of a large dataset of Arabic vocabulary. For this purpose, a Web-based application following the suggested methodology was designed and implemented to facilitate the process of collecting and analyzing learners' responses. The experimental results show that the LRT questions automatically generated by the proposed system confused the learners; this is clear from the confusion matrix, which showed that one third (1/3) of the generated nonwords were able to distract the learners (with accuracy 65%). Consequently, recall and precision have smaller values, 0.52 and 0.48, respectively. (c) 2021 The Authors. Published by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Pages: 8429-8439
Page count: 11
Related papers
50 in total
  • [41] Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose
    El-Sakhawy, Dayna
    Abdennadher, Slim
    Hamed, Injy
    MULTIMODAL ANALYSES ENABLING ARTIFICIAL AGENTS IN HUMAN-MACHINE INTERACTION, 2015, 8757 : 99 - 108
  • [42] Automatic Note Recognition and Generation of MDL and MML using FFT
    Li, Hanchao
    You, Hongyu
    Fei, Xiang
    Yang, Ming
    Chao, Kuo-Ming
    He, Chaobo
    2018 IEEE 15TH INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE 2018), 2018, : 195 - 200
  • [43] Automatic recognition of abnormal cells in cytological tests using multispectral imaging
    Gertych, A.
    Galliano, G.
    Bose, S.
    Farkas, D. L.
    MEDICAL IMAGING 2010: COMPUTER-AIDED DIAGNOSIS, 2010, 7624
  • [44] Towards Automatic Poetry Generation using Constraint Handling Rules
    el Bolock, Alia
    Abdennadher, Slim
    30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II, 2015, : 1868 - 1873
  • [45] Towards Automatic StarCraft Strategy Generation Using Genetic Programming
    Garcia-Sanchez, Pablo
    Tonda, Alberto
    Mora, Antonio M.
    Squillero, Giovanni
    Merelo, J. J.
    2015 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND GAMES (CIG), 2015, : 284 - 291
  • [46] Towards Automatic Generation of Test Data using Branch Coverage
    Chen, Jifeng
    Yang, Luming
    ICCSSE 2009: PROCEEDINGS OF 2009 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION, 2009, : 921 - 925
  • [47] Text independent automatic speaker recognition using self-organizing maps
    Mafra, AT
    Simoes, MG
    CONFERENCE RECORD OF THE 2004 IEEE INDUSTRY APPLICATIONS CONFERENCE, VOLS 1-4: COVERING THEORY TO PRACTICE, 2004, : 1503 - 1510
  • [48] Automatic Chinese Multiple Choice Question Generation Using Mixed Similarity Strategy
    Liu, Ming
    Rus, Vasile
    Liu, Li
    IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, 2018, 11 (02): : 193 - 202
  • [49] Automatic Whitelist Generation for SQL Queries Using Web Application Tests
    Nomura, Komei
    Rikitake, Kenji
    Matsumoto, Ryosuke
    2019 IEEE 43RD ANNUAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE (COMPSAC), VOL 2, 2019, : 465 - 470
  • [50] Towards automatic content tagging - Enhanced web services in digital libraries using lexical chaining
    Waltinger, Uili
    Mehler, Alexander
    Heyer, Gerhard
    WEBIST 2008: PROCEEDINGS OF THE FOURTH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 2, 2008, : 231 - 236