Blind source extraction for robust speech recognition in multisource noisy environments

Cited by: 15
Authors
Nesta, Francesco [1]
Matassoni, Marco [1]
Affiliation
[1] Fdn Bruno Kessler CIT Irst, I-38123 Trento, Italy
Source
COMPUTER SPEECH AND LANGUAGE | 2013 / Vol. 27 / Issue 03
Keywords
SEPARATION;
DOI
10.1016/j.csl.2012.08.001
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes and describes a complete system for Blind Source Extraction (BSE). The goal is to extract a target signal source in order to recognize spoken commands uttered in reverberant and noisy environments, and acquired by a microphone array. The architecture of the BSE system is based on multiple stages: (a) TDOA estimation, (b) mixing system identification for the target source, (c) on-line semi-blind source separation and (d) source extraction. All the stages are effectively combined, allowing the estimation of the target signal with limited distortion. While a generalization of the BSE framework is described, here the proposed system is evaluated on the data provided for the CHiME Pascal 2011 competition, i.e. binaural recordings made in a real-world domestic environment. The CHiME mixtures are processed with the BSE and the recovered target signal is fed to a recognizer, which uses noise robust features based on Gammatone Frequency Cepstral Coefficients. Moreover, acoustic model adaptation is applied to further reduce the mismatch between training and testing data and improve the overall performance. A detailed comparison between different models and algorithmic settings is reported, showing that the approach is promising and the resulting system gives a significant reduction of the error rate. (c) 2012 Elsevier Ltd. All rights reserved.
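As an illustration of stage (a) only, the following is a minimal sketch of TDOA estimation between two channels using plain time-domain cross-correlation. It is not the estimator used in the paper, which the abstract does not specify; the function name `estimate_tdoa`, the `max_lag` parameter, and the synthetic white-noise signal are all assumptions made for this example.

```python
import random


def estimate_tdoa(left, right, sample_rate, max_lag):
    """Return the delay (in seconds) of `right` relative to `left`.

    Hypothetical helper: picks the lag that maximizes the time-domain
    cross-correlation. A positive result means `right` lags `left`.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        # Correlate left[i] with right[i + lag] over the overlapping samples.
        start = max(0, -lag)
        stop = min(n, len(right) - lag)
        score = sum(left[i] * right[i + lag] for i in range(start, stop))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


# Synthetic two-channel signal (an assumption, not CHiME data):
# white noise that reaches the second channel 5 samples later.
random.seed(0)
fs = 16000
delay = 5
source = [random.gauss(0.0, 1.0) for _ in range(2000)]
left = source
right = [0.0] * delay + source[:-delay]

tdoa = estimate_tdoa(left, right, fs, max_lag=20)
print(f"estimated TDOA: {tdoa * 1e6:.1f} microseconds")
```

In reverberant recordings a plain correlation peak is often ambiguous, so practical systems typically use a weighted or frequency-domain variant; this sketch only shows the basic idea of recovering an inter-channel delay.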
Pages: 703-725
Number of pages: 23
Related papers
50 in total
  • [41] Blind Source Separation of Noisy Mixed Speech Signals
    Li, Huiya
    Shi, Jianying
    Men, Jinxi
    SENSORS, MEASUREMENT AND INTELLIGENT MATERIALS II, PTS 1 AND 2, 2014, 475-476 : 291 - +
  • [42] Perceptual features for automatic speech recognition in noisy environments
    Haque, Serajul
    Togneri, Roberto
    Zaknich, Anthony
    SPEECH COMMUNICATION, 2009, 51 (01) : 58 - 75
  • [43] Multi-band speech recognition in noisy environments
    Okawa, S
    Bocchieri, E
    Potamianos, A
    PROCEEDINGS OF THE 1998 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-6, 1998, : 641 - 644
  • [44] Speech Emotion Recognition Based on EMD in Noisy Environments
    Chu, Yunyun
    Xiong, Weihua
    Chen, Wei
    ADVANCES IN CIVIL ENGINEERING AND BUILDING MATERIALS III, 2014, 831 : 460 - 464
  • [45] SPEECH RECOGNITION IN NOISY ENVIRONMENTS WITH THE AID OF MICROPHONE ARRAYS
    VANCOMPERNOLLE, D
    MA, W
    XIE, F
    VANDIEST, M
    SPEECH COMMUNICATION, 1990, 9 (5-6) : 433 - 442
  • [46] TDOA ESTIMATION OF SPEECH SOURCE IN NOISY REVERBERANT ENVIRONMENTS
    Bu, Suliang
    Zhao, Tuo
    Zhao, Yunxin
    2022 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP, SLT, 2022, : 1059 - 1066
  • [47] Speech enhancement strategy for speech recognition microcontroller under noisy environments
    Chan, Kit Yan
    Nordholm, Sven
    Yiu, Ka Fai Cedric
    Togneri, Roberto
    NEUROCOMPUTING, 2013, 118 : 279 - 288
  • [48] Flexible feature extraction and HMM design for a hybrid distributed speech recognition system in noisy environments
    Stadermann, J
    Rigoll, G
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 332 - 335
  • [49] A robust algorithm for formant frequency extraction of noisy speech
    Zhao, QF
    Shimamura, T
    Suzuki, J
    ISCAS '98 - PROCEEDINGS OF THE 1998 INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOLS 1-6, 1998, : D534 - D537
  • [50] A discriminative and robust training algorithm for noisy speech recognition
    Hong, WT
    2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 8 - 11