Deep Neural Network Calibration for E2E Speech Recognition System

Cited by: 1
Authors
Lee, Mun-Hak [1]
Chang, Joon-Hyuk [1]
Affiliation
[1] Hanyang Univ, Dept Elect Engn, Seoul, South Korea
Source
INTERSPEECH 2021
Keywords
E2E speech recognition; deep neural network calibration
DOI
10.21437/Interspeech.2021-176
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
Cross-entropy loss, which is commonly used to train deep neural network (DNN)-based classification models, induces models to assign a high probability to a single class. Networks trained in this fashion tend to be overconfident, which causes problems in the decoding process of a speech recognition system because decoding uses the combined probability distributions of multiple independently trained networks. Overconfidence in neural networks can be quantified as calibration error, the difference between a model's output probability and the actual likelihood that its answer is correct. We show that the deep-learning-based components of an end-to-end (E2E) speech recognition system contain calibration errors despite their high classification accuracy, and we quantify these errors using various calibration measures. We also show experimentally that a calibration function trained to minimize calibration error effectively mitigates the calibration errors of the speech recognition system and, as a result, can improve the performance of beam search during decoding.
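As an illustration of the calibration error described in the abstract, the sketch below computes the expected calibration error (ECE), one common calibration measure, and applies temperature scaling, one simple calibration function. Both choices are assumptions made for illustration; the abstract does not state which measures or which calibration function the authors use. The names `softmax`, `top1`, and `expected_calibration_error`, and the toy logits and labels, are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a temperature above 1 flattens the distribution."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top1(logits, temperature=1.0):
    """Return (predicted class index, confidence) for one example."""
    probs = softmax(logits, temperature)
    confidence = max(probs)
    return probs.index(confidence), confidence


def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[index].append((conf, hit))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Hypothetical toy data: four examples, three classes.
    logits_batch = [[4.0, 1.0, 0.0], [3.5, 3.0, 0.0], [0.5, 4.0, 1.0], [2.0, 2.5, 4.0]]
    labels = [0, 1, 1, 0]
    for temperature in (1.0, 3.0):
        results = [top1(logits, temperature) for logits in logits_batch]
        confidences = [conf for _, conf in results]
        correct = [pred == label for (pred, _), label in zip(results, labels)]
        print(temperature, expected_calibration_error(confidences, correct))
```

Dividing the logits by a positive temperature does not change which class is predicted, so accuracy is unaffected; only the confidence assigned to the prediction changes. In practice the temperature would be fitted on held-out data rather than chosen by hand as it is here.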
Pages: 4064-4068
Number of pages: 5