Blind source extraction for robust speech recognition in multisource noisy environments

Cited by: 15

Authors:

Nesta, Francesco [1]

Matassoni, Marco [1]

Affiliation:

[1] Fdn Bruno Kessler CIT Irst, I-38123 Trento, Italy

Source:

COMPUTER SPEECH AND LANGUAGE | 2013 / Vol. 27 / Issue 03

Keywords:

SEPARATION;

DOI:

10.1016/j.csl.2012.08.001

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper proposes and describes a complete system for Blind Source Extraction (BSE). The goal is to extract a target signal source in order to recognize spoken commands uttered in reverberant and noisy environments, and acquired by a microphone array. The architecture of the BSE system is based on multiple stages: (a) TDOA estimation, (b) mixing system identification for the target source, (c) on-line semi-blind source separation and (d) source extraction. All the stages are effectively combined, allowing the estimation of the target signal with limited distortion. While a generalization of the BSE framework is described, here the proposed system is evaluated on the data provided for the CHiME Pascal 2011 competition, i.e. binaural recordings made in a real-world domestic environment. The CHiME mixtures are processed with the BSE and the recovered target signal is fed to a recognizer, which uses noise robust features based on Gammatone Frequency Cepstral Coefficients. Moreover, acoustic model adaptation is applied to further reduce the mismatch between training and testing data and improve the overall performance. A detailed comparison between different models and algorithmic settings is reported, showing that the approach is promising and the resulting system gives a significant reduction of the error rate. (c) 2012 Elsevier Ltd. All rights reserved.
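The abstract lists TDOA estimation as stage (a) of the pipeline but does not say which estimator is used. Purely as an illustration, the sketch below uses GCC-PHAT, a widely used generic estimator that is an assumption here and not necessarily the authors' method. The function name `gcc_phat_tdoa`, the 16 kHz sampling rate, and the synthetic two-channel signal are all hypothetical.

```python
import numpy as np


def gcc_phat_tdoa(x_left, x_right, fs, max_tau=None):
    """Estimate the delay (in seconds) of x_right relative to x_left.

    Generic GCC-PHAT sketch; not taken from the paper.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x_left) + len(x_right)
    spec_left = np.fft.rfft(x_left, n=n)
    spec_right = np.fft.rfft(x_right, n=n)

    # Cross-power spectrum with PHAT weighting (keep phase, discard magnitude).
    cross = spec_right * np.conj(spec_left)
    cross /= np.abs(cross) + 1e-12
    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that lags run from -max_shift to +max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(np.abs(cc))) - max_shift
    return lag / fs


# Synthetic binaural pair: the right channel lags the left by 5 samples.
fs = 16000  # assumed sampling rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(4096)
left = np.concatenate((source, np.zeros(5)))
right = np.concatenate((np.zeros(5), source))

tau = gcc_phat_tdoa(left, right, fs, max_tau=0.001)
print(f"estimated TDOA: {tau * 1e6:.1f} microseconds")
```

A positive return value means the second channel receives the source later than the first. In a real reverberant recording the correlation peak is less sharp than in this noise-free example, which is one reason a full system would combine TDOA estimates with the later identification and separation stages.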

Pages: 703-725

Page count: 23
