Robust bootstrapping of speaker models for unsupervised speaker indexing

Cited by: 0

Authors:

Fu, ZhongHua [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 710072, Peoples R China

Source:

MULTIMEDIA CONTENT ANALYSIS AND MINING, PROCEEDINGS | 2007 / Vol. 4577

Keywords:

unsupervised speaker indexing; speaker model; eigenvoices;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Conventional approaches to bootstrapping speaker models in unsupervised speaker indexing are very sensitive to the duration of the bootstrapping segment. If the segment is too short to build a speaker model, as in telephone conversations and meeting scenarios, serious problems arise. We therefore propose a robust bootstrapping framework that employs a Multi-EigenSpace modeling technique based on Regression Class (RC-MES) to build speaker models from sparse data, together with short-segment clustering to prevent overly short segments from influencing the bootstrapping. On a real discussion archive with a total duration of 8 hours, we demonstrate the significant robustness of the proposed method, which not only improves speaker change detection performance but also outperforms conventional bootstrapping methods, even when the average bootstrapping segment duration is less than 5 seconds.

Citation

Pages: 122 / +

Page count: 2

50 records in total

[1] Unsupervised speaker indexing using generic models
Kwon, S
Narayanan, S
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (05) : 1004 - 1013
[2] A study of generic models for unsupervised on-line speaker indexing
Kwon, S
Narayanan, S
[J]. ASRU'03: 2003 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING ASRU '03, 2003, : 423 - 428
[3] A simple approach to unsupervised speaker indexing
Ofoegbu, Uchechukwu O.
Iyer, Ananth N.
Yantorno, Robert E.
Smolenski, Brett Y.
[J]. 2006 INTERNATIONAL SYMPOSIUM ON INTELLIGENT SIGNAL PROCESSING AND COMMUNICATIONS, VOLS 1 AND 2, 2006, : 315 - 318
[4] Iterative unsupervised GMM training for speaker indexing
Paralic, Martin
Jarina, Roman
[J]. RADIOENGINEERING, 2007, 16 (03) : 138 - 144
[5] An unsupervised scheme for speaker indexing of audio databases
Chen, Yanxiang
Liu, Ming
[J]. 2009 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND INTELLIGENT SYSTEMS, PROCEEDINGS, VOL 3, 2009, : 90 - +
[6] Unsupervised indexing of conversations with short speaker utterances
Ofoegbu, Uchechukwu O.
Iyer, Ananth N.
Yantorno, Robert E.
Wenndt, Stanley J.
[J]. 2007 IEEE AEROSPACE CONFERENCE, VOLS 1-9, 2007, : 1555 - 1565
[7] ROBUST UNSUPERVISED SPEAKER TURN DETECTION
Teshome, Assefa Kassa
Ramalingam, C. S.
[J]. IMCIC'11: THE 2ND INTERNATIONAL MULTI-CONFERENCE ON COMPLEXITY, INFORMATICS AND CYBERNETICS, VOL II, 2011, : 200 - 203
[8] An approach to robust unsupervised speaker adaptation
Kim, NS
Seo, DJ
Lim, W
[J]. IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (06) : 469 - 472
[9] Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion
Nishida, M
Kawahara, T
[J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I, 2003, : 172 - 175
[10] Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing
Nishida, M
Kawahara, T
[J]. IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2005, 13 (04) : 583 - 592
