ALGERIAN ARABIC SPEECH DATABASE (ALGASD): CORPUS DESIGN AND AUTOMATIC SPEECH RECOGNITION APPLICATION

Cited by: 0
Authors
Droua-Hamdani, Ghania [1]
Selouani, Sid Ahmed [2]
Boudraa, Malika [3]
Affiliations
[1] CRSTDLA, Speech Proc Lab, Algiers, Algeria
[2] Univ Moncton, LARIHS Lab, Moncton, NB E1A 3E9, Canada
[3] USTHB Univ, Speech Commun Lab, Algiers, Algeria
Source
Keywords
speech corpus; Algerian speakers; Modern Standard Arabic; automatic speech recognition; hidden Markov models
DOI
Not available
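Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09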
Abstract
This paper presents the Algerian Arabic Speech Database (ALGASD), a Modern Standard Arabic (MSA) speech corpus composed of utterances pronounced by 300 Algerian native speakers selected from eleven regions of Algeria. One of the objectives of this corpus design is to be representative of the regional accents of MSA spoken in Algeria. Useful information related to the speakers, such as gender, age, and education level, is provided. This paper also reports the results of the Automatic Speech Recognition (ASR) application of the corpus and outlines an original global monophone recognition model designed to handle linguistic variability. The global phone recognition rate for this ASR reference system is satisfactory and may constitute a useful baseline ASR system dedicated to MSA.
Pages: 157-166
Page count: 10