Spectrum enhancement with sparse coding for robust speech recognition

Cited by: 11
Authors
He, Yongjun [1]
Sun, Guanglu [1]
Han, Jiqing [2]
Affiliations
[1] Harbin Univ Sci & Technol, Harbin 150080, Peoples R China
[2] Harbin Inst Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Sparse coding; Speech denoising; Residual noise; Basis pursuit denoising; JOINT COMPENSATION; REPRESENTATION; NOISE; ADAPTATION; REGRESSION; EQUATIONS; FEATURES; SYSTEMS;
DOI
10.1016/j.dsp.2015.04.014
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
Recently, a trend in speech recognition has been to introduce sparse coding for noise robustness. Although several methods have been proposed, the performance of sparse coding in speech denoising has been limited. Sparse coding assumes that the representation of speech over the speech dictionary is sparse, while that of the noise is dense. This assumption does not hold in the speech denoising scenario, because many noises are also sparse over the speech dictionary. In that case, the representation of noisy speech still contains noise components, which degrades performance. To solve this problem, we first analyze the assumption of sparse coding and then propose a novel method to enhance the speech spectrum. This method first identifies the atoms that represent the noise sparsely, and then selectively ignores them when reconstructing the speech to reduce the residual noise. Speech features are then extracted from the enhanced spectrum for speech recognition. Experimental results show that the proposed method can substantially improve the noise robustness of a speech recognition system. (C) 2015 Elsevier Inc. All rights reserved.
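The selective-reconstruction idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the ISTA solver, the function names (`ista`, `noise_atoms`, `enhance`), the usage threshold, and the toy identity dictionary are all assumptions made for illustration. The sketch solves a basis pursuit denoising problem for each frame, flags atoms that are frequently active on noise-only frames, and zeroes their coefficients when reconstructing the noisy frame.

```python
import numpy as np

def ista(D, y, lam=0.1, n_iter=200):
    """Approximately solve min_x 0.5*||y - D x||^2 + lam*||x||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L      # gradient step on the quadratic term
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

def noise_atoms(D, noise_frames, lam=0.1, usage_threshold=0.5):
    """Flag atoms active in at least `usage_threshold` of the noise-only frames.

    `noise_frames` holds one spectral frame per column (an assumed layout).
    """
    codes = np.stack([ista(D, n, lam) for n in noise_frames.T], axis=1)
    usage = np.mean(np.abs(codes) > 1e-8, axis=1)
    return usage >= usage_threshold

def enhance(D, noisy_frame, noise_mask, lam=0.1):
    """Sparse-code a noisy frame, then reconstruct without the flagged atoms."""
    x = ista(D, noisy_frame, lam)
    x[noise_mask] = 0.0                    # ignore atoms that represent noise
    return D @ x

# Toy data (hypothetical): four unit-vector atoms, noise concentrated on atom 3.
D = np.eye(4)
noise_frames = np.array([[0.0, 0.0],
                         [0.0, 0.0],
                         [0.0, 0.0],
                         [1.0, 1.0]])
noisy_frame = np.array([1.0, 0.0, 0.0, 1.0])  # "speech" on atom 0 plus noise

mask = noise_atoms(D, noise_frames)
enhanced = enhance(D, noisy_frame, mask)
```

In this toy case the noise is perfectly sparse over the dictionary, which is exactly the situation the abstract describes: plain sparse coding would keep the noise component, while masking the flagged atom removes it. How the paper actually selects atoms and which solver it uses are not stated in this record.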
Pages: 59-70
Page count: 12
Related papers
50 in total
  • [31] Speech Emotion Recognition Based on Robust Discriminative Sparse Regression
    Song, Peng
    Zheng, Wenming
    Yu, Yanwei
    Ou, Shifeng
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2021, 13 (02) : 343 - 353
  • [32] Robust emotion recognition in noisy speech via sparse representation
    Zhao, Xiaoming
    Zhang, Shiqing
    Lei, Bicheng
    NEURAL COMPUTING & APPLICATIONS, 2014, 24 (7-8) : 1539 - 1553
  • [33] Performance Analysis of Speech Enhancement Algorithm for Robust Speech Recognition System
    Babu, C. Ganesh
    Vanathi, P. T.
    Ramachandran, R.
    Rajaa, M. Senthil
    RECENT ADVANCES IN NETWORKING, VLSI AND SIGNAL PROCESSING, 2010, : 197 - +
  • [34] Time-Domain Speech Enhancement for Robust Automatic Speech Recognition
    Yang, Yufeng
    Pandey, Ashutosh
    Wang, DeLiang
    INTERSPEECH 2023, 2023, : 4913 - 4917
  • [35] Comparative Evaluation of Speech Enhancement Methods for Robust Automatic Speech Recognition
    Paliwal, Kuldip K.
    Lyons, James G.
    So, Stephen
    Stark, Anthony P.
    Wojcicki, Kamil K.
    2010 4TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATION SYSTEMS (ICSPCS), 2010,
  • [36] Robust emotion recognition in noisy speech via sparse representation
    Xiaoming Zhao
    Shiqing Zhang
    Bicheng Lei
    Neural Computing and Applications, 2014, 24 : 1539 - 1553
  • [37] Combining speech enhancement and auditory feature extraction for robust speech recognition
    Kleinschmidt, M
    Tchorz, J
    Kollmeier, B
    SPEECH COMMUNICATION, 2001, 34 (1-2) : 75 - 91
  • [38] EXPLORING SPEECH ENHANCEMENT WITH GENERATIVE ADVERSARIAL NETWORKS FOR ROBUST SPEECH RECOGNITION
    Donahue, Chris
    Li, Bo
    Prabhavalkar, Rohit
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5024 - 5028
  • [39] Combined speech enhancement and auditory modelling for robust distributed speech recognition
    Flynn, Ronan
    Jones, Edward
    SPEECH COMMUNICATION, 2008, 50 (10) : 797 - 809
  • [40] Speech enhancement with a GSC-like structure employing sparse coding
    Yang, Li-chun
    Qian, Yun-tao
    JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS, 2014, 15 (12) : 1154 - 1163