Lexical modeling for the development of Amharic automatic speech recognition systems

Cited by: 0
Authors
Tachbelie, Martha Yifiru [1]
Abate, Solomon Teferra [1]
Affiliations
[1] Addis Ababa Univ, Sch Informat Sci, Addis Ababa, Ethiopia
Keywords
Amharic; Lexical model; Under-resourced language; Automatic speech recognition; LANGUAGE; ASR;
DOI
10.1007/s10579-023-09659-y
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Amharic is the second most spoken Semitic language after Arabic. It has its own syllabic writing system, in which each character represents a consonant and a vowel. Automatic speech recognition (ASR) research for Amharic has so far relied on grapheme-based pronunciation lexicons, taking advantage of the nature of this writing system. However, the epenthetic vowel and the glottal stop consonant represented in the writing system may not be pronounced in all of their occurrences. Moreover, the writing system does not differentiate between geminated and non-geminated forms of consonants. The grapheme-based pronunciation lexicons used so far are therefore limited with regard to these language features. To address these limitations, we have prepared word- and morpheme-based pronunciation lexicons using data-driven transcription and knowledge-driven expert transcription. The data-driven transcription has been used to prepare the training pronunciation lexicon, while the knowledge-driven transcription has been used to prepare the morpheme- and word-based pronunciation lexicons for decoding. With the morpheme-based knowledge-driven lexicons, ASR performance is better than that of the baseline ASR system that used a grapheme-based lexicon, although the number of phones (60) is much larger than in the grapheme-based lexicon (37).
Pages: 963-984
Number of pages: 22
Related papers
50 in total
  • [21] Prosody modeling for automatic speech recognition and understanding
    Shriberg, E
    Stolcke, A
    [J]. MATHEMATICAL FOUNDATIONS OF SPEECH AND LANGUAGE PROCESSING, 2004, 138 : 105 - 114
  • [22] FEDERATED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION
    Cui, Xiaodong
    Lu, Songtao
    Kingsbury, Brian
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 6748 - 6752
  • [23] Automatic Recognition of Spontaneous Emotions in Speech Using Acoustic and Lexical Features
    Truong, Khiet P.
    Raaijmakers, Stephan
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, PROCEEDINGS, 2008, 5237 : 161 - +
  • [24] Duration Modeling in Automatic Recited Speech Recognition
    Alotaibi, Yousef A.
    Yakoub, Mohammed Sidi
    Meftah, Ali
    Selouani, Sid-Ahmed
    [J]. 2016 39TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2016, : 323 - 326
  • [25] Sentence-Level Automatic Speech Segmentation for Amharic
    Tamiru, Rahel Mekonen
    Abate, Solomon Teferra
    [J]. PROCEEDINGS OF SIXTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY (ICICT 2021), VOL 2, 2022, 236 : 477 - 485
  • [26] PRELIMINARY CONSIDERATIONS FOR AUTOMATIC SPEECH RECOGNITION SYSTEMS
    UNGEHEUER, G
    [J]. PHONETICA, 1979, 36 (4-5) : 254 - 262
  • [27] Transfer Learning for Automatic Speech Recognition Systems
    Asefisaray, Behnam
    Haznedaroglu, Ali
    Erden, Mustafa
    Arslan, Levent M.
    [J]. 2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,
  • [28] Validation of Speech Data for Training Automatic Speech Recognition Systems
    Krizaj, Janes
    Gros, Jerneja Zganec
    Dobrisek, Simon
    [J]. 2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 1165 - 1169
  • [29] TEXT NORMALIZATION FOR AUTOMATIC SPEECH RECOGNITION SYSTEMS
    Vasile, Alin-Florentin
    Boros, Tiberiu
    [J]. PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE 'LINQUISTIC RESOURCES AND TOOLS FOR PROCESSING THE ROMANIAN LANGUAGE', 2016, : 121 - 128
  • [30] AUTOMATIC SPEECH RECOGNITION FOR REAL TIME SYSTEMS
    Singh, Ranjodh
    Yadav, Hemant
    Sharma, Mohit
    Gosain, Sandeep
    Shah, Rajiv Ratn
    [J]. 2019 IEEE FIFTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2019), 2019, : 189 - 198