Robust bootstrapping of speaker models for unsupervised speaker indexing

Cited by: 0
Author
Fu, ZhongHua [1]
Affiliation
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China
Keywords
unsupervised speaker indexing; speaker model; eigenvoices
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Conventional approaches to bootstrapping speaker models in unsupervised speaker indexing are very sensitive to the duration of the bootstrapping segment. When that duration is too short to build a speaker model, as in telephone conversations and meetings, serious problems arise. We therefore propose a robust bootstrapping framework that employs a Multi-EigenSpace modeling technique based on Regression Class (RC-MES) to build speaker models from sparse data, together with short-segment clustering to prevent overly short segments from influencing bootstrapping. On a real discussion archive with a total duration of 8 hours, the proposed method proves robust: it improves speaker change detection performance and outperforms conventional bootstrapping methods, even when the average bootstrapping segment duration is less than 5 seconds.
Pages: 122 / +
Page count: 2