Speaker indexing and adaptation using speaker clustering based on statistical model selection

Cited by: 0
Authors
Nishida, M [1]
Kawahara, T [1]
Affiliations
[1] Chiba Univ, Grad Sch Sci & Technol, Inage Ku, Chiba 2638522, Japan
Keywords
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
This paper addresses unsupervised speaker indexing and automatic speech recognition of discussions. Speaker indexing has two cases: the number of speakers is either unknown or known beforehand. When the number is unknown, indexing is difficult to apply to varied data because several parameters, such as a threshold, must be determined. In addition, applying a uniform speaker model causes serious problems because utterance durations vary widely across speakers. We therefore propose a method that performs speaker indexing robustly in both cases, using a flexible framework in which an optimal speaker model (GMM or VQ) is selected based on the Bayesian information criterion (BIC). Moreover, we propose combining the indexing method with speaker adaptation based on speaker selection. On real discussion archives, indexing performance was higher than that of conventional methods in both cases, and the combined method improved speech recognition performance.
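As a rough illustration of BIC-based model selection, the sketch below picks between two candidate models of different complexity. It assumes the standard textbook form of the criterion, BIC = k ln(n) - 2 ln(L), which may differ from the formulation used in the paper. The function names, the candidate labels "simple" and "complex", and all numbers are hypothetical and are not taken from the paper.

```python
import math


def bic(log_likelihood, num_params, num_samples):
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower is better.

    This is the textbook definition, assumed here for illustration.
    """
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def select_model(candidates, num_samples):
    """Return the name of the candidate with the lowest BIC.

    `candidates` maps a (hypothetical) model name to a tuple of
    (log_likelihood, num_params) obtained on the same data.
    """
    return min(candidates, key=lambda name: bic(*candidates[name], num_samples))


# Hypothetical values for illustration only: a short and a long utterance,
# each scored by a low-parameter and a high-parameter model.
short_utterance = {"simple": (-520.0, 24), "complex": (-505.0, 96)}
long_utterance = {"simple": (-52000.0, 24), "complex": (-50500.0, 96)}

choice_short = select_model(short_utterance, num_samples=100)
choice_long = select_model(long_utterance, num_samples=10000)
```

With these made-up numbers, the parameter penalty dominates when data are scarce, so the lower-parameter model is chosen for the short utterance, while the likelihood gain dominates for the long one. This is the general trade-off that motivates choosing model complexity per speaker when utterance durations vary widely.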
Pages: 353-356
Number of pages: 4