Modern Standard Arabic speech disorders corpus for digital speech processing applications

Cited by: 0
Authors
Alqudah A.A.M. [1 ,3 ]
Alshraideh M.A.M. [1 ]
Abushariah M.A.M. [2 ]
Sharieh A.A.S. [1 ]
Affiliations
[1] Department of Computer Science, King Abdullah II School of Information Technology, The University of Jordan, Amman
[2] Department of Computer Information Systems, King Abdullah II School of Information Technology, The University of Jordan, Amman
[3] Department of Computer Science, Faculty of Science and Information Technology, Al-Zaytoonah University of Jordan, Amman
Source
Int J Speech Technol | 2024 / Issue 1 / pp. 157-170
Keywords
Automatic speech recognition; CMU Pocketsphinx; HMM; LDA; MFCC; MLLT; Modern Standard Arabic; Speech corpus; Speech disorders
DOI
10.1007/s10772-024-10086-9
Abstract
Digital speech processing applications, including automatic speech recognition (ASR), speaker recognition, and speech translation, require large volumes of speech data for training and testing. Although speech corpora exist, speech data from speakers with speech disorders are scarcely available for many languages, including Arabic. Consequently, it is hard to develop digital speech processing applications that serve the entire society, because the available corpora lack sufficient speaker variation covering both healthy and disordered speech. This research presents our work towards developing a Modern Standard Arabic (MSA) speech corpus for speakers with distortion and substitution articulation disorders. The corpus was recorded by 40 Jordanian speakers (20 male and 20 female) who have distortion and/or substitution articulation disorders. It can be used for various applications, including ASR, speech and hearing, and others. Part of the corpus is used to develop and evaluate an ASR system for MSA using the Carnegie Mellon University (CMU) Pocketsphinx tools, based on Mel-Frequency Cepstral Coefficients (MFCC) and Hidden Markov Model (HMM) techniques. Furthermore, Linear Discriminant Analysis (LDA) and Maximum Likelihood Linear Transform (MLLT) optimization techniques were applied. Using three different testing data sets, this work obtained an average word recognition correctness rate (WRCR) of 98.38% and an average Word Error Rate (WER) of 1.76% for the speaker-dependent, text-independent setting. For the speaker-independent, text-dependent setting, it obtained an average WRCR of 99.37% and an average WER of 0.68%, whereas for the speaker-independent, text-independent setting it obtained an average WRCR of 96.53% and an average WER of 4.00%.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
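The abstract reports WRCR and WER side by side, and the two do not sum to 100% (for example, 98.38% and 1.76%). A minimal sketch of why, assuming the conventional definitions WER = (S + D + I) / N and WRCR = (N - S - D) / N, where S, D, and I are substitutions, deletions, and insertions against a reference of N words: insertions raise WER but leave WRCR untouched. The record does not state the paper's formulas, so these definitions, the function names, and the toy word sequences below are illustrative assumptions, not the authors' scoring code.

```python
def error_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) in a minimum-edit alignment.

    ref and hyp are lists of words. Assumed scoring scheme, not taken from the paper.
    """
    # dp[i][j] = (errors, subs, dels, ins) for aligning ref[:i] with hyp[:j]
    dp = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            e, s, d, n_ins = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (e, s, d, n_ins)          # match, no cost
            else:
                diagonal = (e + 1, s + 1, d, n_ins)  # substitution
            e, s, d, n_ins = dp[i - 1][j]
            deletion = (e + 1, s, d + 1, n_ins)
            e, s, d, n_ins = dp[i][j - 1]
            insertion = (e + 1, s, d, n_ins + 1)
            dp[i][j] = min(diagonal, deletion, insertion, key=lambda t: t[0])
    return dp[-1][-1][1:]


def wer_and_wrcr(ref_text, hyp_text):
    """Return (WER %, WRCR %) under the assumed definitions."""
    ref, hyp = ref_text.split(), hyp_text.split()
    subs, dels, ins = error_counts(ref, hyp)
    n = len(ref)
    wer = 100 * (subs + dels + ins) / n
    wrcr = 100 * (n - subs - dels) / n
    return wer, wrcr


# Hypothetical toy transcripts: one substitution (b -> x) and one insertion (e).
wer, wrcr = wer_and_wrcr("a b c d", "a x c d e")
print(f"WER = {wer:.2f}%, WRCR = {wrcr:.2f}%")
```

In the toy case the inserted word counts as an error but does not reduce the number of correctly recognized reference words, so WER plus WRCR exceeds 100%, the same pattern seen in the reported figures.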
Pages: 157-170
Page count: 13