Speaker clustering for speech recognition using vocal tract parameters

Cited by: 10
Authors
Naito, M
Deng, L
Sagisaka, Y
Affiliations
[1] ATR, Interpreting Telephony Res Labs, Kyoto 6190288, Japan
[2] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Keywords
vocal tract parameters; speaker clustering; speech recognition
DOI
10.1016/S0167-6393(00)00089-3
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We propose speaker clustering methods for speech recognition based on articulatory parameters, associated with individual speakers, that are related to vocal tract (VT) size. Two parameters characterizing gross VT dimensions are first derived from the formant frequencies of two vowels and are then used to cluster speakers. The resulting speaker clusters differ significantly from those obtained by conventional acoustic criteria. Phoneme recognition experiments are then carried out using speaker-clustered HMMs (SC-HMMs), one trained for each cluster. The proposed method requires only a small amount of speech data for speaker clustering and for selecting the most suitable SC-HMM for a target speaker, yet gives higher recognition rates than conventional speaker clustering methods based on acoustic criteria. (C) 2002 Elsevier Science B.V. All rights reserved.
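The clustering and cluster-selection steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the two VT parameters are computed from the formants, nor which clustering algorithm the authors use, so everything here is an assumption: the per-speaker parameter pairs are made-up illustrative values, plain k-means stands in for the clustering step, and the names `kmeans`, `nearest`, and `speakers` are hypothetical.

```python
import math
import random

def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on 2-D tuples; a stand-in, not the paper's method."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each speaker goes to the nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

# Hypothetical (parameter 1, parameter 2) pairs per speaker; NOT data from the paper.
speakers = {
    "spk01": (9.1, 7.8), "spk02": (9.3, 8.0), "spk03": (8.9, 7.9),
    "spk04": (7.6, 6.9), "spk05": (7.4, 7.1), "spk06": (7.7, 6.8),
}
points = list(speakers.values())
centroids, labels = kmeans(points, k=2)

# Selecting a cluster (and hence an SC-HMM) for a new target speaker:
# pick the cluster whose centroid is closest to the speaker's two parameters.
target_cluster = nearest((9.0, 7.9), centroids)
print(dict(zip(speakers, labels)), "target ->", target_cluster)
```

Under these assumptions, choosing a model for a new speaker needs only the two parameter values, which is consistent with the abstract's remark that little speech data is required for clustering and model selection.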
Pages: 305-315
Number of pages: 11