INFORMATION BOTTLENECK BASED SPEAKER DIARIZATION OF MEETINGS USING NON-SPEECH AS SIDE INFORMATION

Cited by: 0

Authors:

Yella, Sree Harsha [1]

Bourlard, Herve [1]

Affiliations:

[1] Idiap Res Inst, CH-1920 Martigny, Switzerland

Source:

2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014

Keywords:

speaker diarization; spontaneous meeting recordings; information bottleneck; clustering; side information

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Background noise and errors in speech/non-speech detection significantly degrade the output of a speaker diarization system. In a typical speaker diarization system, non-speech segments are excluded prior to unsupervised clustering. In the current study, we exploit the information present in the non-speech segments of a recording to improve the output of a speaker diarization system based on the information bottleneck framework. This is achieved by providing information from non-speech segments as side (irrelevant) information to information bottleneck based clustering. Experiments on meeting recordings from the RT 06, 07, and 09 evaluation sets show that the proposed method decreases the diarization error rate by around 18% relative to the baseline speaker diarization system based on the information bottleneck framework. Comparison with a state-of-the-art system based on the HMM/GMM framework shows that the proposed method significantly narrows the performance gap between the information bottleneck system and the HMM/GMM system.
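For orientation, the idea of clustering with irrelevant side information can be written as an objective in the general form used in the information bottleneck literature. This is only a sketch under stated assumptions: the abstract does not give the formula, and the symbols X (speech segments), C (clusters), Y^{+} (relevance variables estimated from speech), Y^{-} (irrelevance variables estimated from non-speech), and the trade-off weights \beta and \gamma are notation assumed here, not taken from the record.

```latex
% Assumed notation, not from the abstract:
%   X      : speech segments to be clustered
%   C      : cluster (speaker) assignments
%   Y^{+}  : relevance variables, estimated from speech
%   Y^{-}  : irrelevance variables, estimated from non-speech (side information)
%   \beta, \gamma : trade-off weights
\max_{p(c \mid x)} \;
  I(C; Y^{+}) \;-\; \gamma \, I(C; Y^{-}) \;-\; \frac{1}{\beta} \, I(X; C)
```

Read this way, the clustering is rewarded for preserving information about the speech-derived variables, penalized for preserving information about the non-speech-derived variables, and compressed through the last term. Setting \gamma = 0 recovers the plain information bottleneck objective without side information.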

Pages: 5
