Robust distant speaker recognition based on position-dependent CMN by combining speaker-specific GMM with speaker-adapted HMM

Cited by: 39

Authors:

Wang, Longbiao [1]

Kitaoka, Norihide [1]

Nakagawa, Seiichi [1]

Affiliation:

[1] Toyohashi Univ Technol, Dept Informat & Comp Sci, Toyohashi, Aichi 4418580, Japan

Source:

SPEECH COMMUNICATION | 2007 / Vol. 49 / No. 06

Keywords:

distant speaker recognition; GMM; HMM; position-dependent CMN; sound source estimation;

DOI:

10.1016/j.specom.2007.04.004

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

In this paper, we propose a robust speaker recognition method based on position-dependent Cepstral Mean Normalization (CMN) to compensate for channel distortion that depends on the speaker position. In the training stage, the system measures the transmission characteristics from a set of grid points in the room to the microphone and estimates the compensation parameters for each speaker position a priori. In the recognition stage, the system estimates the speaker position, selects the compensation parameters corresponding to the estimated position, applies the CMN to the speech, and performs speaker recognition. In a previous study, we proposed a text-independent speaker recognition method that combines speaker-specific Gaussian mixture models (GMMs) with syllable-based HMMs adapted to the speakers by MAP [Nakagawa, S., Zhang, W., Takahashi, M., 2004. Text-independent speaker recognition by combining speaker-specific GMM with speaker-adapted syllable-based HMM. Proc. ICASSP-2004 1, 81-84]. The robustness of that method to changes in speaking style in a close-talking environment was evaluated in (Nakagawa et al., 2004). In this paper, we extend this combination method to distant speaker recognition and integrate it with the proposed position-dependent CMN. Our experiments showed that the proposed method improved speaker recognition performance remarkably in a distant environment. (C) 2007 Elsevier B.V. All rights reserved.
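
The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the compensation parameter for each grid point is simply the mean cepstral vector of speech recorded from that point, and that the estimated speaker position is mapped to the nearest grid point by Euclidean distance. The function names (estimate_position_means, nearest_grid_point, position_dependent_cmn) and the data layout are hypothetical, and the position estimate itself is taken as given.

```python
import numpy as np


def estimate_position_means(training_cepstra):
    """Training stage (assumed form): one mean cepstral vector per grid point.

    training_cepstra maps a grid-point id to an array of shape (frames, dims).
    """
    return {pos: feats.mean(axis=0) for pos, feats in training_cepstra.items()}


def nearest_grid_point(estimated_xy, grid_points):
    """Map an estimated speaker position to the closest grid-point id.

    grid_points maps a grid-point id to its (x, y) coordinates.
    Nearest-neighbour selection is an assumption of this sketch.
    """
    ids = list(grid_points)
    coords = np.array([grid_points[i] for i in ids], dtype=float)
    dists = np.linalg.norm(coords - np.asarray(estimated_xy, dtype=float), axis=1)
    return ids[int(np.argmin(dists))]


def position_dependent_cmn(cepstra, estimated_xy, grid_points, position_means):
    """Recognition stage: subtract the pre-computed mean for the chosen grid point."""
    pos = nearest_grid_point(estimated_xy, grid_points)
    return cepstra - position_means[pos]
```

Because the means are computed before recognition, normalization in this sketch does not need to wait for the end of the utterance to estimate a mean, unlike conventional utterance-level CMN. The normalized features would then be passed to the speaker models.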

Pages: 501-513

Page count: 13

21 items in total

[1] Text-independent speaker recognition by combining speaker-specific GMM with speaker adapted syllable-based HMM
Nakagawa, S
Zhang, W
Takahashi, M
[J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004: 81 - 84
[2] Text-independent/text-prompted speaker recognition by combining speaker-specific GMM with speaker adapted syllable-based HMM
Nakagawa, S
Zhang, W
Takahashi, M
[J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2006, E89D (03): 1058 - 1065
[3] Robust distant speech recognition by combining position-dependent CMN with conventional CMN
Wang, Longbiao
Kitaoka, Norihide
Nakagawa, Seiichi
[J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007: 817 - +
[4] Speaker-Specific Articulatory Feature Extraction Based on Knowledge Distillation for Speaker Recognition
Hong, Qian-Bei
Wu, Chung-Hsien
Wang, Hsin-Min
[J]. APSIPA TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING, 2023, 12 (02)
[5] Speaker Dependent, Speaker Independent and Cross Language Emotion Recognition From Speech Using GMM and HMM
Bhaykar, Manav
Yadav, Jainath
Rao, K. Sreenivasa
[J]. 2013 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2013
[6] SPEAKER-DEPENDENT ISOLATED WORD RECOGNITION USING SPEAKER-INDEPENDENT VECTOR QUANTIZATION CODEBOOKS AUGMENTED WITH SPEAKER-SPECIFIC DATA
BURTON, DK
SHORE, JE
[J]. IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1985, 33 (02): 440 - 443
[7] Signal bias removal based GMM for robust speaker recognition
Kim, YJ
Chung, JH
[J]. 2002 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-IV, PROCEEDINGS, 2002: 4163 - 4163
[8] HMM-separation-based speech recognition for a distant moving speaker
Takiguchi, T
Nakamura, S
Shikano, K
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (02): 127 - 140
[9] Speech recognition for a distant moving speaker based on HMM composition and separation
Takiguchi, T
Nakamura, S
Shikano, K
[J]. 2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000: 1403 - 1406
[10] ON COMBINING DNN AND GMM WITH UNSUPERVISED SPEAKER ADAPTATION FOR ROBUST AUTOMATIC SPEECH RECOGNITION
Liu, Shilin
Sim, Khe Chai
[J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
