INFORMATION BOTTLENECK BASED SPEAKER DIARIZATION OF MEETINGS USING NON-SPEECH AS SIDE INFORMATION

Authors
Yella, Sree Harsha [1]
Bourlard, Herve [1]
Affiliations
[1] Idiap Res Inst, CH-1920 Martigny, Switzerland
Keywords
speaker diarization; spontaneous meeting recordings; information bottleneck; clustering; side information
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Background noise and errors in speech/non-speech detection significantly degrade the output of a speaker diarization system. A typical speaker diarization system excludes non-speech segments before unsupervised clustering. In this study, we exploit the information present in the non-speech segments of a recording to improve the output of a speaker diarization system based on the information bottleneck framework. This is achieved by supplying information from non-speech segments as side (irrelevant) information to the information bottleneck based clustering. Experiments on meeting recordings from the RT 06, RT 07, and RT 09 evaluation sets show that the proposed method decreases the diarization error rate by around 18% relative to the baseline speaker diarization system based on the information bottleneck framework. Comparison with a state-of-the-art system based on the HMM/GMM framework shows that the proposed method significantly narrows the performance gap between the information bottleneck system and the HMM/GMM system.
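For orientation, the clustering criterion the abstract alludes to can be sketched in the general form commonly used for information bottleneck clustering with side information. This is a hedged illustration, not the paper's stated objective: the symbols X (speech segments to be clustered), C (cluster assignments), Y^+ (relevance variables estimated from speech), Y^- (irrelevance variables estimated from non-speech), and the trade-off weights beta and gamma are assumptions introduced here and do not appear in the abstract.

```latex
% Assumed notation (not taken from the abstract):
%   X   : speech segments to cluster      C   : cluster (speaker) assignments
%   Y^+ : relevant variables (speech)     Y^- : irrelevant variables (non-speech)
%   \beta, \gamma > 0 : trade-off weights
%
% Standard information bottleneck: keep information about Y^+ while compressing X.
\max_{p(c \mid x)} \; I(C; Y^{+}) - \frac{1}{\beta}\, I(C; X)
%
% With side information: additionally penalize information about Y^-.
\max_{p(c \mid x)} \; I(C; Y^{+}) - \gamma\, I(C; Y^{-}) - \frac{1}{\beta}\, I(C; X)
```

Under this reading, setting gamma to zero recovers the baseline criterion, while a positive gamma discourages clusters that mainly capture structure also found in non-speech, such as background noise.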
Pages: 5