Improved Arabic speech recognition system through the automatic generation of fine-grained phonetic transcriptions

Cited by: 12
Authors
Alsharhan, Eiman [1]
Ramsay, Allan [2]
Affiliations
[1] Kuwait Univ, Kuwait, Kuwait
[2] Univ Manchester, Manchester, Lancs, England
DOI
10.1016/j.ipm.2017.07.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper seeks the best way to exploit the phonological properties of Arabic in order to improve the performance of a speech recognition system. One of the main challenges in processing Arabic is the effect of local context, which changes the phonetic representation of a given text and thereby causes the recognition engine to misclassify it. The proposed solution is a set of language-dependent grapheme-to-allophone rules that predict such allophonic variation and so supply the automatic speech recognition system with a phonetic transcription that is sensitive to local context. The novel aspect of the method is that the pronunciation of each word is extracted directly from this context-sensitive phonetic transcription rather than from a predefined dictionary, which typically does not reflect how the word is actually pronounced. The paper also uses stress, one of the supra-segmental characteristics of speech, to enhance the acoustic modelling. The effectiveness of the proposed rules was tested by comparing a dictionary-based system against one that uses the automatically generated phonetic transcription. The research reports an average improvement of 9.3% in system performance from eliminating the fixed dictionary and using the generated phonetic transcription to learn the phone probabilities. Marking stressed vowels with separate stress markers yields a further improvement of 1.7%. © 2017 Elsevier Ltd. All rights reserved.
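To illustrate what a context-sensitive rewrite rule of the kind the abstract describes might look like, here is a minimal sketch. It is not the authors' rule set: it encodes only one well-known Arabic process, the assimilation of the definite article's /l/ to a following "sun letter" (for example, al-shams pronounced ash-shams). The romanized phone symbols, the `SUN_LETTERS` set, and the function name are assumptions made for this example, and the input is assumed to be already segmented so that a word-initial "a l" is the article.

```python
from typing import List

# "Sun letters" in a simplified romanization (an assumption of this sketch,
# not the paper's phone inventory). Capital letters stand for emphatics.
SUN_LETTERS = {"t", "th", "d", "dh", "r", "z", "s", "sh",
               "S", "D", "T", "Z", "l", "n"}


def assimilate_definite_article(phones: List[str]) -> List[str]:
    """Replace the /l/ of a word-initial article with a copy of a following sun letter.

    The rule looks only at the local context (the phone right after the /l/),
    which is the sense in which the transcription is context-sensitive.
    """
    starts_with_article = len(phones) >= 3 and phones[0] == "a" and phones[1] == "l"
    if starts_with_article and phones[2] in SUN_LETTERS:
        # Keep the vowel, double the sun letter, keep the rest of the word.
        return [phones[0], phones[2]] + phones[2:]
    return list(phones)


# Hypothetical inputs: "al-shams" (the sun) and "al-qamar" (the moon).
shams = assimilate_definite_article(["a", "l", "sh", "a", "m", "s"])
qamar = assimilate_definite_article(["a", "l", "q", "a", "m", "a", "r"])
```

In this sketch the first word changes because "sh" is a sun letter, while the second is returned unchanged because "q" is not. A fixed pronunciation dictionary that stored the article as "a l" would miss the first case.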
Pages: 343-353
Page count: 11