A multi-channel speech enhancement framework for robust NMF-based speech recognition for speech-impaired users

Cited by: 0
Authors
Dekkers, Gert [1, 2, 4]
van Waterschoot, Toon [1, 2]
Vanrumste, Bart [1, 2, 4]
Van Den Broeck, Bert [1, 2, 4]
Gemmeke, Jort F. [3]
Van Hamme, Hugo [3]
Karsmakers, Peter [1, 2, 4]
Affiliations
[1] KU Leuven TC Geel, ESAT ETC AdvISe, Kleinhoefstr 4, B-2440 Geel, Belgium
[2] Katholieke Univ Leuven, ESAT STADIUS, B-3001 Leuven, Belgium
[3] Katholieke Univ Leuven, ESAT PSI, B-3001 Leuven, Belgium
[4] iMinds, Med IT, B-3001 Leuven, Belgium
Keywords
multi-channel speech enhancement; speech recognition; uncertainty of estimation; dysarthric speech; integration
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this paper, a multi-channel speech enhancement framework is proposed for distant speech acquisition in noisy and reverberant environments, targeting Non-negative Matrix Factorization (NMF)-based Automatic Speech Recognition (ASR). The system is evaluated for use in an assistive vocal interface for physically impaired and speech-impaired users. The framework uses the Spatially Pre-processed Speech Distortion Weighted Multi-channel Wiener Filter (SP-SDW-MWF) in combination with a postfilter to reduce noise and reverberation. Additionally, the estimation uncertainty of the speech enhancement framework is propagated through the Mel-Frequency Cepstral Coefficient (MFCC) feature extraction to allow for feature compensation at a later stage. Results indicate that (a) using a trade-off parameter between noise reduction and speech distortion improves recognition performance relative to the well-known GSC and MWF, and (b) adding a postfilter and feature compensation increases performance relative to several baselines, for both a non-pathological and a pathological speaker.
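The trade-off parameter mentioned in the abstract can be illustrated with a minimal sketch of the plain speech distortion weighted MWF for a single frequency bin. This is not the authors' implementation: it omits the spatial pre-processor, the postfilter, and the uncertainty propagation, and it assumes that the speech and noise covariance matrices are already known. The function names, the toy steering vector `a`, and the power values are hypothetical and chosen only for illustration. The filter minimises speech distortion plus `mu` times the residual noise power, which gives `w = (R_s + mu * R_n)^{-1} R_s e_ref`.

```python
import numpy as np

def sdw_mwf_weights(R_s, R_n, mu=1.0, ref=0):
    """Weights minimising (e_ref - w)^H R_s (e_ref - w) + mu * w^H R_n w.

    R_s, R_n: speech and noise spatial covariance matrices (assumed known here).
    mu: trade-off parameter; mu = 1 gives the standard MWF.
    ref: index of the reference microphone.
    """
    e_ref = np.zeros(R_s.shape[0], dtype=complex)
    e_ref[ref] = 1.0
    # Solve (R_s + mu * R_n) w = R_s e_ref without forming an explicit inverse.
    return np.linalg.solve(R_s + mu * R_n, R_s @ e_ref)

def residual_noise_power(w, R_n):
    """Noise power left at the filter output, w^H R_n w."""
    return float(np.real(w.conj() @ R_n @ w))

def speech_distortion_power(w, R_s, ref=0):
    """Power of the speech error at the reference microphone."""
    d = -w.copy()
    d[ref] += 1.0  # d = e_ref - w
    return float(np.real(d.conj() @ R_s @ d))

# Toy scenario (hypothetical values): one speech source, spatially white noise.
a = np.array([1.0, 0.8 + 0.2j, 0.5 - 0.4j])   # assumed steering vector
R_s = 2.0 * np.outer(a, a.conj())              # rank-1 speech covariance
R_n = 0.5 * np.eye(3)                          # white noise covariance

for mu in (0.5, 1.0, 4.0):
    w = sdw_mwf_weights(R_s, R_n, mu)
    print(mu, residual_noise_power(w, R_n), speech_distortion_power(w, R_s))
```

In this toy setting, increasing `mu` lowers the residual noise power and raises the speech distortion, which is the trade-off the abstract refers to.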
Pages: 746-750
Page count: 5