Blind source extraction for robust speech recognition in multisource noisy environments

Cited by: 15
Authors
Nesta, Francesco [1]
Matassoni, Marco [1]
Affiliations
[1] Fdn Bruno Kessler CIT Irst, I-38123 Trento, Italy
Source
COMPUTER SPEECH AND LANGUAGE | 2013, Vol. 27, No. 03
Keywords
SEPARATION;
DOI
10.1016/j.csl.2012.08.001
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes and describes a complete system for Blind Source Extraction (BSE). The goal is to extract a target signal source in order to recognize spoken commands uttered in reverberant and noisy environments, and acquired by a microphone array. The architecture of the BSE system is based on multiple stages: (a) TDOA estimation, (b) mixing system identification for the target source, (c) on-line semi-blind source separation and (d) source extraction. All the stages are effectively combined, allowing the estimation of the target signal with limited distortion. While a generalization of the BSE framework is described, here the proposed system is evaluated on the data provided for the CHiME Pascal 2011 competition, i.e. binaural recordings made in a real-world domestic environment. The CHiME mixtures are processed with the BSE and the recovered target signal is fed to a recognizer, which uses noise robust features based on Gammatone Frequency Cepstral Coefficients. Moreover, acoustic model adaptation is applied to further reduce the mismatch between training and testing data and improve the overall performance. A detailed comparison between different models and algorithmic settings is reported, showing that the approach is promising and the resulting system gives a significant reduction of the error rate. (c) 2012 Elsevier Ltd. All rights reserved.
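As a rough illustration of stage (a), TDOA estimation, the sketch below uses GCC-PHAT, a common generic technique for estimating the time difference of arrival between two channels. The abstract does not state which estimator the authors use, so this is an assumption and not their method; the function name `gcc_phat_tdoa`, its parameters, and the sampling rate in the usage note are hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(x_left, x_right, fs, max_tau=None):
    """Estimate the delay (in seconds) of x_right relative to x_left.

    Generic GCC-PHAT sketch; NOT taken from the paper. A positive result
    means the signal reaches the right channel later than the left one.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x_left) + len(x_right)
    spec_left = np.fft.rfft(x_left, n=n)
    spec_right = np.fft.rfft(x_right, n=n)

    # Cross-power spectrum with PHAT weighting (keep phase, drop magnitude).
    cross = spec_right * np.conj(spec_left)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index max_shift corresponds to lag 0.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs
```

For a binaural pair, one might call `gcc_phat_tdoa(left, right, fs=16000, max_tau=0.001)`, where the sampling rate and the lag bound are illustrative values rather than settings from the paper.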
Pages: 703 - 725
Page count: 23
Related papers
50 items in total
  • [21] Robust Arabic speech recognition in noisy environments using prosodic features and formant
    Amrous, Anissa
    Debyeche, Mohamed
    Amrouche, Abderrahman
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2011, 14 (04) : 351 - 359
  • [22] Auditory processing of speech signals for robust speech recognition in real-world noisy environments
    Kim, DS
    Lee, SY
    Kil, RM
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 1999, 7 (01) : 55 - 69
  • [23] An effective cluster-based model for robust speech detection and speech recognition in noisy environments
    Górriz, J.M.
    Ramírez, J.
    Segura, J.C.
    Puntonet, C.G.
    Journal of the Acoustical Society of America, 2006, 120 (01) : 470 - 481
  • [24] Speech enhancement applied to speech recognition in noisy environments
    Xu, Y.F., 2001, Press of Tsinghua University (41):
  • [25] A robust speech enhancement method in noisy environments
    Abajaddi, Nesrine
    Mounir, Badia
    Elfahm, Youssef
    Farchi, Abdelmajid
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2023, 14 (09) : 973 - 983
  • [26] Robust recognition of noisy speech using speech enhancement
    Xu, YF
    Zhang, JJ
    Yao, KS
    Cao, ZG
    Ma, ZX
    2000 5TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I-III, 2000 : 734 - 737
  • [27] Speech recognition in multisource reverberant environments with binaural inputs
    Roman, Nicoleta
    Srinivasan, Soundararajan
    Wang, DeLiang
    2006 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-13, 2006 : 309 - 312
  • [28] Speech Emotion Recognition in Noisy and Reverberant Environments
    Heracleous, Panikos
    Yasuda, Keiji
    Sugaya, Fumiaki
    Yoneyama, Akio
    Hashimoto, Masayuki
    2017 SEVENTH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2017 : 262 - 266
  • [29] Multisensory benefits for speech recognition in noisy environments
    Oh, Yonghee
    Schwalm, Meg
    Kalpin, Nicole
    FRONTIERS IN NEUROSCIENCE, 2022, 16
  • [30] Speech Recognition On Mobile Devices In Noisy Environments
    Yurtcan, Yaser
    Kilic, Banu Gunel
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,