AN EXTENDED EXPERIMENTAL INVESTIGATION OF DNN UNCERTAINTY PROPAGATION FOR NOISE ROBUST ASR

Cited by: 0
Authors
Nathwani, Karan [1 ,2 ,3 ]
Morales-Cordovilla, Juan A. [4 ]
Sivasankaran, Sunit [1 ,2 ,3 ]
Illina, Irina [1 ,2 ,3 ]
Vincent, Emmanuel [1 ,2 ,3 ]
Affiliations
[1] Inria, F-54600 Villers Les Nancy, France
[2] Univ Lorraine, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
[3] CNRS, LORIA, UMR 7503, F-54506 Vandoeuvre Les Nancy, France
[4] Univ Granada, Dept TSTC, Granada, Spain
Keywords
Robust ASR; acoustic modeling; DNN; uncertainty estimation; uncertainty propagation; FEATURE ENHANCEMENT; SPEECH; COMPENSATION; RECOGNITION; MODEL;
DOI
Not available
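Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809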
Abstract
Automatic speech recognition (ASR) in noisy environments remains a challenging goal. Recently, the idea of estimating the uncertainty about the features obtained after speech enhancement and propagating it to dynamically adapt deep neural network (DNN) based acoustic models has raised some interest. However, the results in the literature were reported on simulated noisy datasets for a limited variety of uncertainty estimators. We found that they vary significantly in different conditions. Hence, the main contribution of this work is to assess DNN uncertainty decoding performance for different data conditions and different uncertainty estimation/propagation techniques. In addition, we propose a neural network based uncertainty estimator and compare it with other uncertainty estimators. We report detailed ASR results on the CHiME-2 and CHiME-3 datasets. We find that, on average, uncertainty propagation provides similar relative improvement on real and simulated data and that the proposed uncertainty estimator performs significantly better than the one in [1]. We also find that the improvement is consistent, but it depends on the signal-to-noise ratio (SNR) and the noise environment.
Pages: 26-30
Page count: 5
Related papers
50 in total
  • [1] DNN Uncertainty Propagation Using GMM-Derived Uncertainty Features for Noise Robust ASR
    Nathwani, Karan
    Vincent, Emmanuel
    Illina, Irina
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2018, 25 (03) : 338 - 342
  • [2] Nonparametric Uncertainty Estimation and Propagation for Noise Robust ASR
    Tran, Dung T.
    Vincent, Emmanuel
    Jouvet, Denis
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (11) : 1835 - 1846
  • [3] EXTENSION OF UNCERTAINTY PROPAGATION TO DYNAMIC MFCCS FOR NOISE ROBUST ASR
    Tran, Dung T.
    Vincent, Emmanuel
    Jouvet, Denis
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [4] CONSISTENT DNN UNCERTAINTY TRAINING AND DECODING FOR ROBUST ASR
    Nathwani, Karan
    Vincent, Emmanuel
    Illina, Irina
    [J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 185 - 192
  • [5] DISCRIMINATIVE UNCERTAINTY ESTIMATION FOR NOISE ROBUST ASR
    Tran, Dung T.
    Vincent, Emmanuel
    Jouvet, Denis
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5038 - 5042
  • [6] FUSION OF MULTIPLE UNCERTAINTY ESTIMATORS AND PROPAGATORS FOR NOISE ROBUST ASR
    Tran, Dung T.
    Vincent, Emmanuel
    Jouvet, Denis
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [7] Noise robust ASR
    Viikki, O
    [J]. SPEECH COMMUNICATION, 2001, 34 (1-2) : 1 - 2
  • [8] Stochastic DNN-HMM Training for Robust ASR
    Lee, Kang Hyun
    Kang, Woo Hyun
    Lee, Hyeonseung
    Kim, Nam Soo
    [J]. 2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 177 - 182
  • [9] Model-based feature enhancement with uncertainty decoding for noise robust ASR
    Stouten, Veronique
    Van hamme, Hugo
    Wambacq, Patrick
    [J]. SPEECH COMMUNICATION, 2006, 48 (11) : 1502 - 1514
  • [10] An Uncertainty Propagation Approach to Robust ASR Using the ETSI Advanced Front-End
    Astudillo, Ramon Fernandez
    Kolossa, Dorothea
    Mandelartz, Philipp
    Orglmeister, Reinhold
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2010, 4 (05) : 824 - 833