INTEGRATED PRONUNCIATION LEARNING FOR AUTOMATIC SPEECH RECOGNITION USING PROBABILISTIC LEXICAL MODELING

Cited by: 0
Authors
Rasipuram, Ramya [1]
Razavi, Marzieh [1, 2]
Magimai-Doss, Mathew [1]
Affiliations
[1] Idiap Res Inst, CH-1920 Martigny, Switzerland
[2] Ecole Polytech Fed Lausanne, CH-1015 Lausanne, Switzerland
Keywords
Probabilistic lexical modeling; pronunciation lexicon; grapheme subwords; phoneme subwords; grapheme-to-phoneme conversion
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Standard automatic speech recognition (ASR) systems use a phoneme-based pronunciation lexicon prepared by linguistic experts. When the hand-crafted pronunciations fail to cover the vocabulary of a new domain, a grapheme-to-phoneme (G2P) converter is used to extract pronunciations for new words, and then a phoneme-based ASR system is trained. G2P converters are typically trained only on the existing lexicons. In this paper, we propose a grapheme-based ASR approach in the framework of probabilistic lexical modeling that integrates pronunciation learning as a stage in ASR system training and exploits both acoustic and lexical resources (not necessarily from the domain or language of interest). The proposed approach is evaluated on four lexical-resource-constrained ASR tasks and compared with the conventional two-stage approach, in which G2P training is followed by ASR system development.
Pages: 5176-5180
Number of pages: 5
Related papers
50 in total
  • [21] Pronunciation modeling for spontaneous speech recognition using latent pronunciation analysis (LPA) and prior knowledge
    Lin, Che-Kuang
    Lee, Lin-Shan
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 673 - +
  • [22] Pronunciation change in conversational speech and its implications for automatic speech recognition
    Saraçlar, M
    Khudanpur, S
    [J]. COMPUTER SPEECH AND LANGUAGE, 2004, 18 (04): : 375 - 395
  • [23] Automatic Recognition of Spontaneous Emotions in Speech Using Acoustic and Lexical Features
    Truong, Khiet P.
    Raaijmakers, Stephan
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, PROCEEDINGS, 2008, 5237 : 161 - +
  • [24] English Speech Recognition and Evaluation of Pronunciation Quality Using Deep Learning
    Xu, Yushu
    [J]. MOBILE INFORMATION SYSTEMS, 2022, 2022
  • [25] Computer Aided Pronunciation Learning System Using Speech Recognition Techniques
    Abdou, Sherif Mahdy
    Hamid, Salah Eldeen
    Rashwan, Mohsen
    Samir, Abdurrahman
    Abd-Elhamid, Ossama
    Shahin, Mostafa
    Nazih, Waleed
    [J]. INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 849 - +
  • [26] Automatic Pronunciation Assessment using Self-Supervised Speech Representation Learning
    Kim, Eesung
    Jeon, Jae-Jin
    Seo, Hyeji
    Kim, Hoon
    [J]. INTERSPEECH 2022, 2022, : 1411 - 1415
  • [27] AUTOMATIC EVALUATION OF ENGLISH PRONUNCIATION BASED ON SPEECH RECOGNITION TECHNIQUES
    HAMADA, H
    MIKI, S
    NAKATSU, R
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1993, E76D (03) : 352 - 359
  • [28] Automatic Speech Segmentation Using Probabilistic Latent Component Modeling
    Ghosh, Sayan
    Sreenivas, T. V.
    [J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 2259 - 2262
  • [29] Automatic speech recognition: Reliability and pedagogical implications for teaching pronunciation
    Kim, IS
    [J]. EDUCATIONAL TECHNOLOGY & SOCIETY, 2006, 9 (01): : 322 - 334
  • [30] Improving English Pronunciation via Automatic Speech Recognition Technology
    Li, Meihui
    Han, Meiting
    Chen, Zejia
    Mo, Yiling
    Chen, Xiujuan
    Liu, Xiaobin
    [J]. 2017 INTERNATIONAL SYMPOSIUM ON EDUCATIONAL TECHNOLOGY (ISET 2017), 2017, : 224 - 228