ALGERIAN ARABIC SPEECH DATABASE (ALGASD): CORPUS DESIGN AND AUTOMATIC SPEECH RECOGNITION APPLICATION

Authors
Droua-Hamdani, Ghania [1]
Selouani, Sid Ahmed [2]
Boudraa, Malika [3]
Affiliations
[1] CRSTDLA, Speech Proc Lab, Algiers, Algeria
[2] Univ Moncton, LARIHS Lab, Moncton, NB E1A 3E9, Canada
[3] USTHB Univ, Speech Commun Lab, Algiers, Algeria
Source
Keywords
speech corpus; Algerian speakers; modern standard Arabic; automatic speech recognition; hidden Markov models;
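DOI
Not available
Chinese Library Classification (CLC) codes
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract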
This paper presents the Algerian Arabic Speech Database (ALGASD), a Modern Standard Arabic (MSA) speech corpus composed of utterances pronounced by 300 Algerian native speakers selected from eleven regions of Algeria. One objective of the corpus design is to be representative of the regional accents of MSA spoken in Algeria. Useful information about the speakers, such as gender, age, and education level, is provided. This paper also reports the results of applying the corpus to Automatic Speech Recognition (ASR) and outlines an original global monophone recognition model designed to handle linguistic variability. The global phone recognition rate of this ASR reference system is satisfactory, and the system may serve as a useful baseline for ASR dedicated to MSA.
Pages: 157-166
Number of pages: 10