NOISE-ROBUST SPEECH RECOGNITION WITH EXEMPLAR-BASED SPARSE REPRESENTATIONS USING ALPHA-BETA DIVERGENCE

Cited by: 0
Authors
Yilmaz, Emre [1 ]
Gemmeke, Jort F. [1 ]
Van Hamme, Hugo [1 ]
Affiliations
[1] Katholieke Univ Leuven, Dept ESAT, Leuven, Belgium
Keywords
exemplar-based speech recognition; sparse representations; alpha-beta divergence; noise-robustness; NONNEGATIVE MATRIX FACTORIZATION;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we investigate the performance of a noise-robust sparse representations (SR)-based recognizer using the Alpha-Beta (AB) divergence to compare the noisy speech segments and exemplars. The baseline recognizer, which approximates noisy speech segments as a linear combination of speech and noise exemplars of variable length, uses the generalized Kullback-Leibler divergence to quantify the approximation quality. Incorporating a reconstruction error-based back-end, the recognition performance highly depends on the congruence of the divergence measure and the speech features used. Having two tuning parameters, namely alpha and beta, the AB-divergence provides improved robustness against background noise and outliers. These parameters can be adjusted for better performance depending on the distribution of speech and noise exemplars in the high-dimensional feature space. Moreover, various well-known distance/divergence measures such as the Euclidean distance, generalized Kullback-Leibler divergence, Itakura-Saito divergence and Hellinger distance are special cases of the AB-divergence for different (alpha, beta) values. The goal of this work is to investigate the optimal divergence for mel-scaled magnitude spectral features by performing recognition experiments at several SNR levels using different (alpha, beta) pairs. The results demonstrate the effectiveness of the AB-divergence compared to the generalized Kullback-Leibler divergence, especially at the lower SNR levels.
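The abstract's claim that familiar measures are special cases of the AB-divergence can be checked numerically. The sketch below is illustrative only and is not the authors' implementation: it assumes the definition of the AB-divergence commonly given in the literature for alpha, beta and alpha + beta all nonzero, and the function names and the vectors `p` and `q` are hypothetical. Under that assumed definition, (alpha, beta) = (1, 1) gives half the squared Euclidean distance, (0.5, 0.5) gives a scaled squared Hellinger distance, and alpha = 1 with beta close to 0 approaches the generalized Kullback-Leibler divergence.

```python
import math


def ab_divergence(p, q, alpha, beta):
    """AB-divergence between strictly positive vectors p and q.

    Assumed definition (general case only):
    D = -1/(alpha*beta) * sum(p^alpha * q^beta
                              - alpha/(alpha+beta) * p^(alpha+beta)
                              - beta/(alpha+beta)  * q^(alpha+beta))
    """
    if alpha == 0 or beta == 0 or alpha + beta == 0:
        # The limiting cases need separate formulas and are left out here.
        raise ValueError("sketch covers only alpha, beta, alpha + beta != 0")
    s = alpha + beta
    total = 0.0
    for pi, qi in zip(p, q):
        total += (pi ** alpha) * (qi ** beta)
        total -= (alpha / s) * pi ** s
        total -= (beta / s) * qi ** s
    return -total / (alpha * beta)


def generalized_kl(p, q):
    """Generalized Kullback-Leibler divergence for positive vectors."""
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))


# Hypothetical nonnegative feature vectors, standing in for spectral features.
p = [0.2, 1.5, 3.0]
q = [0.4, 1.0, 2.5]

half_sq_euclidean = 0.5 * sum((a - b) ** 2 for a, b in zip(p, q))
scaled_hellinger = 2.0 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

print(ab_divergence(p, q, 1.0, 1.0), half_sq_euclidean)
print(ab_divergence(p, q, 0.5, 0.5), scaled_hellinger)
print(ab_divergence(p, q, 1.0, 1e-6), generalized_kl(p, q))
```

Each printed pair should agree closely, the last only approximately because beta is small rather than zero. The Itakura-Saito case (alpha = 1, beta = -1) falls under alpha + beta = 0 and is deliberately not covered by this sketch.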
Pages: 5
Related papers
50 in total
  • [31] Noise-robust automatic speech recognition using a discriminative echo state network
    Skowronski, Mark D.
    Harris, John G.
    [J]. 2007 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-11, 2007, : 1771 - 1774
  • [32] Noise-robust speech recognition using a new spectral estimation method "PHASOR"
    Aikawa, K
    Ishizuka, K
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002, : 397 - 400
  • [33] EXEMPLAR-BASED LARGE VOCABULARY SPEECH RECOGNITION USING K-NEAREST NEIGHBORS
    Xu, Yanbo
    Siohan, Olivier
    Simcha, David
    Kumar, Sanjiv
    Liao, Hank
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5167 - 5171
  • [34] A Novel Model Characteristics for Noise-Robust Automatic Speech Recognition Based on HMM
    Rafieee, M. Saadeq
    Khazaei, Ali Akbar
    [J]. 2010 IEEE INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND INFORMATION SECURITY (WCNIS), VOL 2, 2010, : 215 - 218
  • [35] Noise-robust speech recognition in mobile network based on convolution neural networks
    Bouchakour, Lallouani
    Debyeche, Mohamed
    [J]. INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2022, 25 (01) : 269 - 277
  • [36] Fusion Feature Extraction Based on Auditory and Energy for Noise-Robust Speech Recognition
    Shi, Yanyan
    Bai, Jing
    Xue, Peiyun
    Shi, Dianxi
    [J]. IEEE ACCESS, 2019, 7 : 81911 - 81922
  • [37] Cluster-Based Pairwise Contrastive Loss for Noise-Robust Speech Recognition
    Lee, Geon Woo
    Kim, Hong Kook
    [J]. SENSORS, 2024, 24 (08)
  • [38] Noise-robust speech recognition in mobile network based on convolution neural networks
    Lallouani Bouchakour
    Mohamed Debyeche
    [J]. International Journal of Speech Technology, 2022, 25 : 269 - 277
  • [39] Knowledge Distillation-Based Training of Speech Enhancement for Noise-Robust Automatic Speech Recognition
    Woo Lee, Geon
    Kook Kim, Hong
    Kong, Duk-Jo
    [J]. IEEE ACCESS, 2024, 12 : 72707 - 72720
  • [40] A Noise-Robust Continuous Speech Recognition System Using Block-Based Dynamic Range Adjustment
    Sun, Yiming
    Miyanaga, Yoshikazu
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2012, E95D (03): : 844 - 852