Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion

Cited by: 0
Authors
Nishida, M [1]
Kawahara, T [1]
Affiliations
[1] JST, PRESTO, Sakyo Ku, Kyoto 6068501, Japan
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
This paper addresses unsupervised speaker indexing for discussion audio archives. In discussions, the speaker changes frequently, so utterances are very short and their duration varies widely, which causes significant problems in applying conventional methods such as model adaptation and Variance-BIC (Bayesian Information Criterion) methods. We propose a flexible framework that selects an optimal speaker model (GMM or VQ) based on the BIC according to the duration of utterances. When the speech segment is short, the simple and robust VQ-based method is expected to be chosen, while a GMM can be reliably trained for long segments. For a discussion archive with a total duration of 10 hours, it is demonstrated that the proposed method achieves higher indexing performance than conventional methods.
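As a rough illustration of the selection step described in the abstract, the sketch below scores candidate speaker models with a common form of the BIC (log-likelihood minus a complexity penalty that grows with the number of free parameters and the logarithm of the number of frames) and keeps the highest-scoring one. The abstract does not state the exact BIC formulation, the parameter counts, or how a likelihood is obtained for the VQ model, so the function names, the penalty weight, and all numbers here are hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate speaker model for one speech segment (hypothetical container)."""
    name: str
    log_likelihood: float  # total log-likelihood of the segment under the model
    num_params: int        # number of free parameters in the model


def bic(log_likelihood, num_params, num_frames, penalty_weight=1.0):
    """BIC score: log-likelihood minus a penalty of (weight / 2) * k * ln(N)."""
    return log_likelihood - 0.5 * penalty_weight * num_params * math.log(num_frames)


def select_model(candidates, num_frames, penalty_weight=1.0):
    """Return the candidate with the highest BIC score for this segment."""
    return max(
        candidates,
        key=lambda c: bic(c.log_likelihood, c.num_params, num_frames, penalty_weight),
    )


# Hypothetical short segment (100 frames): the GMM fits slightly better,
# but its larger parameter count is penalized more heavily.
short_choice = select_model(
    [Candidate("VQ", -5000.0, 400), Candidate("GMM", -4900.0, 800)],
    num_frames=100,
)

# Hypothetical long segment (10000 frames): the likelihood gain of the GMM
# outweighs the penalty.
long_choice = select_model(
    [Candidate("VQ", -500000.0, 400), Candidate("GMM", -480000.0, 800)],
    num_frames=10000,
)

print(short_choice.name, long_choice.name)
```

With these made-up figures the simpler model wins on the short segment and the richer model wins on the long one, which mirrors the behavior the abstract expects from duration-dependent selection.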
Pages: 172-175
Number of pages: 4