Speaker clustering for speech recognition using vocal tract parameters

Cited by: 10

Authors:

Naito, M

Deng, L

Sagisaka, Y

Affiliations:

[1] ATR, Interpreting Telephony Res Labs, Kyoto 6190288, Japan

[2] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada

Source:

SPEECH COMMUNICATION | 2002 / Vol. 36 / Issue 3-4

Keywords:

vocal tract parameters; speaker-clustering; speech recognition;

DOI:

10.1016/S0167-6393(00)00089-3

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

We propose speaker clustering methods for speech recognition based on vocal tract (VT) size related articulatory parameters associated with individual speakers. Two parameters characterizing gross VT dimensions are first derived from the formant frequencies of two vowels and are then used to cluster speakers. The resulting speaker clusters are significantly different from speaker clusters obtained by conventional acoustic criteria. Then phoneme recognition experiments are carried out by using speaker-clustered HMMs (SC-HMMs) trained for each cluster. The proposed method requires a small amount of speech data for speaker clustering and for selecting the most suitable SC-HMM for a target speaker, but gives higher recognition rates than conventional speaker clustering methods based on acoustic criteria. © 2002 Elsevier Science B.V. All rights reserved.
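
As a rough illustration of the idea in the abstract (not the authors' implementation), the sketch below clusters speakers on a two-dimensional VT parameter vector and then picks the nearest cluster for a target speaker. Everything specific here is an assumption: the abstract does not name the clustering algorithm, so plain k-means is swapped in; the parameter values are invented; the function names `kmeans` and `select_cluster` are hypothetical; and the step that derives the two parameters from vowel formants is omitted entirely.

```python
from math import dist

def kmeans(points, k, iters=20):
    """Plain k-means on 2-D points (assumed stand-in for the paper's clustering).

    Initial centroids are simply the first k points, so the result is deterministic.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each speaker joins the nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels

def select_cluster(vt_params, centroids):
    """Return the index of the cluster (and hence the SC-HMM) nearest to a target speaker."""
    return min(range(len(centroids)), key=lambda c: dist(vt_params, centroids[c]))

# Hypothetical per-speaker VT parameter pairs; these values are NOT from the paper.
speakers = [(8.0, 8.5), (7.2, 7.9), (8.1, 8.6), (7.0, 8.0), (8.3, 8.4), (7.1, 7.8)]
centroids, labels = kmeans(speakers, k=2)

# A new speaker's two parameters decide which speaker-clustered model to use.
target_cluster = select_cluster((7.3, 7.9), centroids)
```

Because cluster selection needs only the two parameters for the target speaker, a sketch like this is consistent with the abstract's point that little speech data is required for selecting a model.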

Pages: 305-315

Number of pages: 11

