Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion

Authors
Nishida, M [1]
Kawahara, T [1]
Affiliation
[1] JST, PRESTO, Sakyo Ku, Kyoto 6068501, Japan
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206; 082403;
Abstract
This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and vary widely in duration, which causes significant problems when applying conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive with a total duration of 10 hours, the proposed method is demonstrated to achieve higher indexing performance than conventional methods.
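The duration-dependent choice described in the abstract can be illustrated with a minimal sketch of BIC-based model selection. This is not the authors' implementation: the parameter counts (a diagonal-covariance GMM versus a means-only VQ codebook), the feature dimension, the model sizes, and the per-frame log-likelihood values are all assumptions chosen for illustration, and the BIC is written in the common "higher is better" form, log-likelihood minus half the parameter count times the log of the number of frames.

```python
import math

def bic(log_likelihood, num_params, num_frames, penalty_weight=1.0):
    """BIC in 'higher is better' form: fit term minus complexity penalty."""
    return log_likelihood - 0.5 * penalty_weight * num_params * math.log(num_frames)

def gmm_param_count(num_components, dim):
    # Assumption: diagonal covariances, so each component has dim means
    # and dim variances, plus (num_components - 1) free mixture weights.
    return num_components * 2 * dim + (num_components - 1)

def vq_param_count(codebook_size, dim):
    # Assumption: a VQ codebook is counted as its centroids only.
    return codebook_size * dim

# Hypothetical setup: 12-dimensional features, 8 components / 8 codewords.
DIM, SIZE = 12, 8
PARAMS = {"VQ": vq_param_count(SIZE, DIM), "GMM": gmm_param_count(SIZE, DIM)}

# Hypothetical average log-likelihood per frame: the richer GMM fits
# slightly better, but it has about twice as many parameters.
AVG_LL = {"VQ": -20.0, "GMM": -19.5}

def select_model(num_frames, avg_ll=AVG_LL, params=PARAMS):
    """Return the model name with the highest BIC, plus all scores."""
    scores = {
        name: bic(avg_ll[name] * num_frames, params[name], num_frames)
        for name in avg_ll
    }
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    for n in (100, 1000):  # e.g. a short and a long segment, in frames
        best, scores = select_model(n)
        print(n, best, {k: round(v, 1) for k, v in scores.items()})
```

With these illustrative numbers, the complexity penalty dominates for the short segment, so the simpler VQ model is selected, while for the long segment the better fit of the GMM outweighs its larger penalty. In practice the log-likelihoods would come from models actually trained on each segment.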
Pages: 172-175
Number of pages: 4