NOISE-ROBUST SPEECH RECOGNITION WITH EXEMPLAR-BASED SPARSE REPRESENTATIONS USING ALPHA-BETA DIVERGENCE

Cited by: 0
Authors
Yilmaz, Emre [1 ]
Gemmeke, Jort F. [1 ]
Van Hamme, Hugo [1 ]
Affiliations
[1] Katholieke Univ Leuven, Dept ESAT, Leuven, Belgium
Keywords
exemplar-based speech recognition; sparse representations; alpha-beta divergence; noise-robustness; nonnegative matrix factorization
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In this paper, we investigate the performance of a noise-robust sparse representations (SR)-based recognizer that uses the Alpha-Beta (AB) divergence to compare noisy speech segments and exemplars. The baseline recognizer, which approximates noisy speech segments as a linear combination of speech and noise exemplars of variable length, uses the generalized Kullback-Leibler divergence to quantify the approximation quality. Because the recognizer incorporates a reconstruction error-based back-end, its recognition performance depends strongly on how well the divergence measure matches the speech features used. With two tuning parameters, alpha and beta, the AB-divergence provides improved robustness against background noise and outliers. These parameters can be adjusted for better performance depending on the distribution of speech and noise exemplars in the high-dimensional feature space. Moreover, various well-known distance/divergence measures, such as the Euclidean distance, generalized Kullback-Leibler divergence, Itakura-Saito divergence and Hellinger distance, are special cases of the AB-divergence for different (alpha, beta) values. The goal of this work is to investigate the optimal divergence for mel-scaled magnitude spectral features by performing recognition experiments at several SNR levels using different (alpha, beta) pairs. The results demonstrate the effectiveness of the AB-divergence compared to the generalized Kullback-Leibler divergence, especially at the lower SNR levels.
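The special cases named in the abstract can be checked numerically. The sketch below is not taken from the paper: it assumes the commonly used definition of the AB-divergence for positive vectors, valid when alpha, beta and alpha + beta are all nonzero, and the function names `ab_divergence` and `generalized_kl` are hypothetical. Under that assumed scaling, (alpha, beta) = (1, 1) gives half the squared Euclidean distance, (0.5, 0.5) gives twice the sum of squared differences of square roots (a scaled squared Hellinger distance), and (1, beta) with beta close to 0 approaches the generalized Kullback-Leibler divergence. The limiting cases themselves, including Itakura-Saito at alpha + beta = 0, are deliberately left out.

```python
import math


def ab_divergence(p, q, alpha, beta):
    """AB-divergence between two positive vectors (general case only).

    Assumed definition, for alpha, beta and alpha + beta all nonzero:
      D = -1/(alpha*beta) * sum(p^alpha * q^beta
                                - alpha/(alpha+beta) * p^(alpha+beta)
                                - beta/(alpha+beta)  * q^(alpha+beta))
    The limiting cases are not handled in this sketch.
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        raise ValueError("limiting cases are not handled in this sketch")
    s = alpha + beta
    total = 0.0
    for pi, qi in zip(p, q):
        total += (pi ** alpha) * (qi ** beta)
        total -= alpha / s * pi ** s
        total -= beta / s * qi ** s
    return -total / (alpha * beta)


def generalized_kl(p, q):
    """Generalized Kullback-Leibler divergence for positive vectors."""
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))


# Toy positive "feature vectors" (illustrative values, not real speech data).
p = [1.0, 2.0, 3.0]
q = [1.5, 1.0, 2.5]

half_sq_euclidean = 0.5 * sum((a - b) ** 2 for a, b in zip(p, q))
scaled_hellinger = 2.0 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                             for a, b in zip(p, q))

print(ab_divergence(p, q, 1.0, 1.0), half_sq_euclidean)
print(ab_divergence(p, q, 0.5, 0.5), scaled_hellinger)
print(ab_divergence(p, q, 1.0, 1e-6), generalized_kl(p, q))
```

Each printed pair should agree closely, the last one only approximately because beta = 1e-6 stands in for the limit beta -> 0. Sweeping (alpha, beta) over a grid with a function like this is one way to explore the kind of parameter search the abstract describes.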
Pages: 5