Speaker and Noise Factorization for Robust Speech Recognition

Cited by: 36

Authors:

Wang, Yongqiang [1]

Gales, Mark J. F. [1]

Affiliations:

[1] Univ Cambridge, Dept Engn, Cambridge CB2 1PZ, England

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2012 / Vol. 20 / No. 07

Keywords:

Acoustic factorization; noise robustness; speaker adaptation; vector Taylor series (VTS); HIDDEN MARKOV-MODELS; COMPENSATION; ADAPTATION;

DOI:

10.1109/TASL.2012.2198059

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

Speech recognition systems need to operate in a wide range of conditions. Thus, they should be robust to extrinsic variability caused by various acoustic factors, for example speaker differences, transmission channel, and background noise. In many scenarios, multiple factors simultaneously impact the underlying "clean" speech signal. This paper examines techniques to handle both speaker and background noise differences. An acoustic factorization approach is adopted. Here, separate transforms are assigned to represent the speaker [maximum-likelihood linear regression (MLLR)], and noise and channel [model-based vector Taylor series (VTS)] factors. This is a highly flexible framework compared to the standard approaches of modeling the combined impact of both speaker and noise factors. For example, factorization allows the speaker characteristics obtained in one noise condition to be applied to a different environment. To obtain this factorization, modified versions of MLLR and VTS training and application are derived. The proposed scheme is evaluated for both adaptation and factorization on the AURORA4 data.


Pages: 2149-2158

Page count: 10

