UNSUPERVISED MODEL ADAPTATION FOR END-TO-END ASR

Cited by: 1
Authors
Sivaraman, Ganesh [1 ]
Casal, Ricardo [1 ]
Garland, Matt [1 ]
Khoury, Elie [1 ]
Affiliations
[1] Pindrop, Atlanta, GA 30308 USA
Keywords
End-to-end; speech recognition; unsupervised adaptation; confidence measure; call centers; telephony audio;
DOI
10.1109/ICASSP43922.2022.9746188
Chinese Library Classification (CLC)
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
End-to-end (E2E) Automatic Speech Recognition (ASR) systems are widely applied in various devices and communication domains. However, state-of-the-art ASR systems are known to underperform when there is a mismatch between the training and test domains. As a result, acoustic models deployed in production are often adapted to the target domain to improve accuracy. This paper proposes a method for unsupervised model adaptation of E2E ASR using first-pass transcriptions of the adaptation data produced by the baseline ASR model itself. The paper proposes two transcription confidence measures that can be used to select an optimal in-domain adaptation set. Experiments were performed using the QuartzNet ASR architecture on the HarperValleyBank corpus. Results show that the unsupervised adaptation technique with confidence-measure-based data selection yields an 8% absolute reduction in word error rate on the HarperValleyBank test set. The proposed method can be applied to any E2E ASR system and is suitable for model adaptation on call center audio with little to no manual transcription.
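The self-training loop the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: the decoding and fine-tuning steps are stubbed out, and the names (`Hypothesis`, `avg_token_confidence`, `select_adaptation_set`) and the specific confidence measure (mean per-token log-probability) are assumptions for the sake of the example; the paper proposes its own two confidence measures.

```python
# Hypothetical sketch of confidence-based data selection for unsupervised
# ASR adaptation. In the paper, first-pass transcriptions and their
# confidences come from the baseline QuartzNet model; here they are given
# as plain data so the selection logic itself is runnable.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypothesis:
    """A first-pass transcription of one adaptation utterance."""
    audio_id: str
    transcript: str
    token_log_probs: List[float] = field(default_factory=list)


def avg_token_confidence(hyp: Hypothesis) -> float:
    """One simple confidence measure: mean per-token log-probability.
    (Illustrative only; the paper defines its own two measures.)"""
    if not hyp.token_log_probs:
        return float("-inf")
    return sum(hyp.token_log_probs) / len(hyp.token_log_probs)


def select_adaptation_set(hyps: List[Hypothesis],
                          threshold: float) -> List[Hypothesis]:
    """Keep only first-pass transcriptions whose confidence clears the
    threshold; the surviving (audio, pseudo-label) pairs form the
    in-domain set used to fine-tune the baseline model."""
    return [h for h in hyps if avg_token_confidence(h) >= threshold]


if __name__ == "__main__":
    hyps = [
        Hypothesis("utt1", "thank you for calling", [-0.1, -0.2, -0.1, -0.1]),
        Hypothesis("utt2", "uh noisy guess", [-2.0, -3.0, -2.5]),
    ]
    selected = select_adaptation_set(hyps, threshold=-0.5)
    print([h.audio_id for h in selected])  # only the confident utterance
```

The selected pseudo-labeled pairs would then be fed back into standard supervised fine-tuning of the baseline model, which is what makes the procedure unsupervised end to end: no manual transcription enters the loop, only the model's own filtered first-pass output.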
Pages: 6987-6991
Page count: 5
Related papers (50 in total)
  • [1] UNSUPERVISED SPEAKER ADAPTATION USING ATTENTION-BASED SPEAKER MEMORY FOR END-TO-END ASR
    Sari, Leda
    Moritz, Niko
    Hori, Takaaki
    Le Roux, Jonathan
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7384 - 7388
  • [2] Auxiliary feature based adaptation of end-to-end ASR systems
    Delcroix, Marc
    Watanabe, Shinji
    Ogawa, Atsunori
    Karita, Shigeki
    Nakatani, Tomohiro
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2444 - 2448
  • [3] A BETTER AND FASTER END-TO-END MODEL FOR STREAMING ASR
    Li, Bo
    Gulati, Anmol
    Yu, Jiahui
    Sainath, Tara N.
    Chiu, Chung-Cheng
    Narayanan, Arun
    Chang, Shuo-Yiin
    Pang, Ruoming
    He, Yanzhang
    Qin, James
    Han, Wei
    Liang, Qiao
    Zhang, Yu
    Strohman, Trevor
    Wu, Yonghui
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5634 - 5638
  • [4] DOES SPEECH ENHANCEMENT WORK WITH END-TO-END ASR OBJECTIVES?: EXPERIMENTAL ANALYSIS OF MULTICHANNEL END-TO-END ASR
    Ochiai, Tsubasa
    Watanabe, Shinji
    Katagiri, Shigeru
    2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,
  • [5] Iterative Compression of End-to-End ASR Model using AutoML
    Mehrotra, Abhinav
    Dudziak, Lukasz
    Yeo, Jinsu
    Lee, Young-yoon
    Vipperla, Ravichander
    Abdelfattah, Mohamed S.
    Bhattacharya, Sourav
    Ishtiaq, Samin
    Ramos, Alberto Gil C. P.
    Lee, SangJeong
    Kim, Daehyun
    Lane, Nicholas D.
    INTERSPEECH 2020, 2020, : 3361 - 3365
  • [6] TWO-PASS END-TO-END ASR MODEL COMPRESSION
    Dawalatabad, Nauman
    Vatsal, Tushar
    Gupta, Ashutosh
    Kim, Sungsoo
    Singh, Shatrughan
    Gowda, Dhananjaya
    Kim, Chanwoo
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 403 - 410
  • [7] End-to-End Unsupervised Style and Resolution Transfer Adaptation Segmentation Model for Remote Sensing Images
    Li, Zhengwei
    Wang, Xili
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IV, 2024, 14428 : 380 - 393
  • [8] ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning
    Dudziak, Lukasz
    Abdelfattah, Mohamed S.
    Vipperla, Ravichander
    Laskaridis, Stefanos
    Lane, Nicholas D.
    INTERSPEECH 2019, 2019, : 2235 - 2239
  • [9] Speech Representation Learning for Emotion Recognition Using End-to-End ASR with Factorized Adaptation
    Yeh, Sung-Lin
    Lin, Yun-Shao
    Lee, Chi-Chun
    INTERSPEECH 2020, 2020, : 536 - 540
  • [10] Towards Lifelong Learning of End-to-end ASR
    Chang, Heng-Jui
    Lee, Hung-yi
    Lee, Lin-shan
    INTERSPEECH 2021, 2021, : 2551 - 2555