Speaker clustering for speech recognition using vocal tract parameters

Cited by: 10
Authors
Naito, M
Deng, L
Sagisaka, Y
Affiliations
[1] ATR, Interpreting Telephony Res Labs, Kyoto 6190288, Japan
[2] Univ Waterloo, Dept Elect & Comp Engn, Waterloo, ON N2L 3G1, Canada
Keywords
vocal tract parameters; speaker clustering; speech recognition
DOI
10.1016/S0167-6393(00)00089-3
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
We propose speaker clustering methods for speech recognition based on articulatory parameters, related to vocal tract (VT) size, that are associated with individual speakers. Two parameters characterizing gross VT dimensions are first derived from the formant frequencies of two vowels and are then used to cluster speakers. The resulting speaker clusters differ significantly from those obtained by conventional acoustic criteria. Phoneme recognition experiments are then carried out using speaker-clustered HMMs (SC-HMMs), one trained for each cluster. The proposed method requires only a small amount of speech data for speaker clustering and for selecting the most suitable SC-HMM for a target speaker, yet gives higher recognition rates than conventional speaker clustering methods based on acoustic criteria. (C) 2002 Elsevier Science B.V. All rights reserved.
Pages: 305-315
Page count: 11