Modern Standard Arabic speech disorders corpus for digital speech processing applications

Authors
Alqudah A.A.M. [1, 3]
Alshraideh M.A.M. [1]
Abushariah M.A.M. [2]
Sharieh A.A.S. [1]
Affiliations
[1] Department of Computer Science, King Abdullah II School of Information Technology, The University of Jordan, Amman
[2] Department of Computer Information Systems, King Abdullah II School of Information Technology, The University of Jordan, Amman
[3] Department of Computer Science, Faculty of Science and Information Technology, Al-Zaytoonah University of Jordan, Amman
Source
Int J Speech Technol | 2024 / Vol. 1 / pp. 157-170
Keywords
Automatic speech recognition; CMU Pocketsphinx; HMM; LDA; MFCC; MLLT; Modern Standard Arabic; Speech corpus; Speech disorders
DOI
10.1007/s10772-024-10086-9
Abstract
Digital speech processing applications, including automatic speech recognition (ASR), speaker recognition, and speech translation, require large volumes of speech data for training and testing. Although speech corpora exist, speech data from speakers with speech disorders are scarce for many languages, including Arabic. Consequently, developing digital speech processing applications that serve the entire society is difficult, because available corpora lack sufficient speaker variation covering both healthy and disordered speech. This research presents our work towards developing a Modern Standard Arabic (MSA) speech corpus for speakers with distortion and substitution articulation disorders. The corpus was recorded by 40 Jordanian speakers (20 male and 20 female) who have distortion and/or substitution articulation disorders. It can be used for various applications, including ASR and speech and hearing research. Part of the corpus is used to develop and evaluate an ASR system for MSA with the Carnegie Mellon University (CMU) Pocketsphinx tools, based on Mel-Frequency Cepstral Coefficients (MFCC) and Hidden Markov Model (HMM) techniques. Furthermore, Linear Discriminant Analysis (LDA) and Maximum Likelihood Linear Transform (MLLT) optimization techniques were applied. Using three different testing data sets, this work obtained an average word recognition correctness rate (WRCR) of 98.38% and an average Word Error Rate (WER) of 1.76% for the speaker-dependent, text-independent case. For the speaker-independent, text-dependent case, it obtained an average WRCR of 99.37% and an average WER of 0.68%, whereas for the speaker-independent, text-independent case it obtained an average WRCR of 96.53% and an average WER of 4.00%.
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
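The abstract reports WRCR and WER without defining them. The following is a minimal sketch, assuming the conventional definitions WRCR = (N - S - D) / N and WER = (S + D + I) / N, where N is the number of reference words and S, D, I are the substitutions, deletions, and insertions in a minimum-edit word alignment. These formulas, the function names `align_counts` and `wrcr_and_wer`, and the toy transcripts are assumptions made for illustration and are not taken from the paper. Under these definitions insertions count against WER but not against WRCR, so the two need not sum to 100%.

```python
def align_counts(ref, hyp):
    """Count (substitutions, deletions, insertions) in a minimum-edit
    alignment of two word lists. Hypothetical helper, for illustration."""
    n, m = len(ref), len(hyp)
    # dp[i][j] = (total_edits, S, D, I) for aligning ref[:i] with hyp[:j]
    dp = [[None] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        dp[i][0] = (i, 0, i, 0)          # delete every reference word
    for j in range(1, m + 1):
        dp[0][j] = (j, 0, 0, j)          # insert every hypothesis word
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, ins)            # exact match, no cost
            else:
                diag = (t + 1, s + 1, d, ins)    # substitution
            t, s, d, ins = dp[i - 1][j]
            dele = (t + 1, s, d + 1, ins)        # deletion
            t, s, d, ins = dp[i][j - 1]
            inse = (t + 1, s, d, ins + 1)        # insertion
            # Tuples compare on total edits first, so this picks a
            # minimum-edit path with a deterministic tie-break.
            dp[i][j] = min(diag, dele, inse)
    _, s, d, ins = dp[n][m]
    return s, d, ins


def wrcr_and_wer(ref_text, hyp_text):
    """Return (WRCR %, WER %) under the assumed definitions above."""
    ref, hyp = ref_text.split(), hyp_text.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    s, d, ins = align_counts(ref, hyp)
    n = len(ref)
    wrcr = 100.0 * (n - s - d) / n
    wer = 100.0 * (s + d + ins) / n
    return wrcr, wer


# Toy example: one substitution (b -> x) and one insertion (e).
wrcr, wer = wrcr_and_wer("a b c d", "a x c d e")
print(f"WRCR = {wrcr:.2f}%, WER = {wer:.2f}%")
```

In the toy example the two rates sum to more than 100% because of the inserted word. If the paper uses these definitions, that would also be consistent with each reported WRCR/WER pair summing to slightly over 100%.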
Pages: 157-170
Page count: 13