Modern Standard Arabic Speech Corpora: A Systematic Review

Cited by: 2
Authors
Alqadasi, Ammar Mohammed Ali [1 ,2 ]
Abdulghafor, Rawad [1 ]
Sunar, Mohd Shahrizal [3 ,4 ,5 ]
Salam, Md. Sah Bin H. J. [4 ]
Affiliations
[1] Int Islamic Univ Malaysia, Fac Informat & Commun Technol, Natl Dept Comp Sci, Kuala Lumpur 53100, Malaysia
[2] Int Islamic Univ Malaysia, Fac Informat & Commun Technol, Natl Dept Comp Sci, Kuala Lumpur, Malaysia
[3] Arab Open Univ Oman, Fac Comp Studies FCS, Muscat 130, Oman
[4] Univ Teknol Malaysia, Fac Comp, Johor Baharu 81310, Malaysia
[5] Univ Teknol Malaysia, Inst Human Ctr Engn, Media & Game Innovat Ctr Excellence, Johor Baharu 81310, Malaysia
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Databases; Speech processing; Standards; Speech recognition; Market research; Distributed databases; Systematics; Speech corpus; speech database; modern standard Arabic; MSA corpora; speech recognition; Arabic recognition; RECOGNITION SYSTEM; FEATURE-EXTRACTION; CORPUS; TEXT; CLASSIFICATION; RHYTHM; TRANSCRIPTION; IMPACT;
DOI
10.1109/ACCESS.2023.3282259
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
Speech processing applications have become integral components across various domains of modern life. The design and preparation of a reliable recognition system rely heavily on the availability of suitable speech databases. While numerous speech databases exist for English and other languages, the availability of comprehensive resources for the Arabic language remains limited. In light of this, we conducted a systematic review aiming to identify, analyse, and classify existing Modern Standard Arabic speech databases. Through our review, we identified 27 publicly available databases and analysed an additional 80 subjective databases. These databases were thoroughly studied, classified based on their characteristics, and subjected to a detailed analysis of research trends in the field. This paper provides a comprehensive discussion of the diverse speech databases developed for various speech processing applications. It sheds light on the purposes and unique characteristics of Arabic speech databases, enabling researchers to easily access suitable resources for their specific applications. The findings of this review contribute to bridging the gap in available Arabic speech databases and serve as a valuable resource for researchers in the field.
Pages: 55771 - 55796
Page count: 26
Related papers
50 in total
  • [1] Modern Standard Arabic Based Multilingual Approach for Dialectal Arabic Speech Recognition
    Elmahdy, Mohamed
    Gruhn, Rainer
    Minker, Wolfgang
    Abdennadher, Slim
    [J]. 2009 EIGHTH INTERNATIONAL SYMPOSIUM ON NATURAL LANGUAGE PROCESSING, PROCEEDINGS, 2009, : 169 - +
  • [2] Modern Standard Arabic speech disorders corpus for digital speech processing applications
    Alqudah A.A.M.
    Alshraideh M.A.M.
    Abushariah M.A.M.
    Sharieh A.A.S.
    [J]. International Journal of Speech Technology, 2024, 27 (1) : 157 - 170
  • [3] Synthesis of the intonation of neutrally spoken Modern Standard Arabic speech
    El-Imam, Yousif A.
    [J]. SIGNAL PROCESSING, 2008, 88 (09) : 2206 - 2221
  • [4] Colloquialising Modern Standard Arabic Text for Improved Speech Recognition
    Al-Shareef, Sarah
    Hain, Thomas
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 1345 - 1349
  • [5] Multi Dialect Arabic Speech Parallel Corpora
    Almeman, Khalid
    Lee, Mark
    Almiman, Ali Abdulrahman
    [J]. 2013 FIRST INTERNATIONAL CONFERENCE ON COMMUNICATIONS SIGNAL PROCESSING, AND THEIR APPLICATIONS (ICCSPA'13), 2013,
  • [6] Arabic Automatic Speech Recognition: A Systematic Literature Review
    Dhouib, Amira
    Othman, Achraf
    El Ghoul, Oussama
    Khribi, Mohamed Koutheair
    Al Sinani, Aisha
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (17):
  • [7] Resources and benchmark corpora for hate speech detection: a systematic review
    Fabio Poletto
    Valerio Basile
    Manuela Sanguinetti
    Cristina Bosco
    Viviana Patti
    [J]. Language Resources and Evaluation, 2021, 55 : 477 - 523
  • [8] Resources and benchmark corpora for hate speech detection: a systematic review
    Poletto, Fabio
    Basile, Valerio
    Sanguinetti, Manuela
    Bosco, Cristina
    Patti, Viviana
    [J]. LANGUAGE RESOURCES AND EVALUATION, 2021, 55 (02) : 477 - 523
  • [9] Modern standard Arabic speech corpus for implementing and evaluating automatic continuous speech recognition systems
    Abushariah, Mohammad Abd-Alrahman Mahmoud
    Ainon, Raja Noor
    Zainuddin, Roziati
    Alqudah, Assal Ali Mustafa
    Ahmed, Moustafa Elshafei
    Khalifa, Othman Omran
    [J]. JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2012, 349 (07) : 2215 - 2242
  • [10] Towards including prosody in a text-to-speech system for modern standard Arabic
    Ramsay, Allan
    Mansour, Hanady
    [J]. COMPUTER SPEECH AND LANGUAGE, 2008, 22 (01) : 84 - 103