Purity algorithms for speaker diarization of meetings data

被引:0
|
作者
Anguera, Xavier [1 ]
Wooters, Chuck [1 ]
Hernando, Javier [1 ]
机构
[1] ICSI, Berkeley, CA 94704 USA
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
When performing speaker diarization, it is common to use an agglomerative clustering approach where the acoustic data is first split in small pieces and then pairs are merged until reaching a stopping point. When using a purely agglomerative clustering technique, one cluster cannot be split into two. Therefore, errors caused by multiple speakers being assigned to one cluster can be common. Furthermore, clusters often contain non-speech frames, creating problems when deciding which two clusters to merge and when to stop the clustering. In this paper, we present two algorithms that aim to purify the clusters. The first assigns conflicting speech segments to a new cluster, and the second detects and eliminates non-speech frames when comparing two clusters. We show improvements of over 18% relative using three datasets from the most current Rich Transcription (RT) evaluations.
引用
收藏
页码:1025 / 1028
页数:4
相关论文
共 50 条
  • [21] Automatic weighting for the combination of TDOA and acoustic features in speaker diarization for meetings
    Anguera, Xavier
    Wooters, Chuck
    Pardo, Jose M.
    Hernando, Javier
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 241 - +
  • [22] Multistream speaker diarization of meetings recordings beyond MFCC and TDOA features
    Vijayasenan, Deepu
    Valente, Fabio
    Bourlard, Herve
    [J]. SPEECH COMMUNICATION, 2012, 54 (01) : 55 - 67
  • [23] Novel Architectures for Unsupervised Information Bottleneck Based Speaker Diarization of Meetings
    Dawalatabad, Nauman
    Madikeri, Srikanth
    Sekhar, C. Chandra
    Murthy, Hema A.
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 14 - 27
  • [24] Estimating Dominance in Multi-Party Meetings Using Speaker Diarization
    Hung, Hayley
    Huang, Yan
    Friedland, Gerald
    Gatica-Perez, Daniel
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2011, 19 (04): : 847 - 860
  • [25] Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings
    Yella, Sree Harsha
    Valente, Fabio
    [J]. 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 960 - 963
  • [26] Robust speaker diarization for meetings: ICSI RT06S meetings evaluation system
    Anguera, Xavier
    Wooters, Chuck
    Pardo, Jose M.
    [J]. MACHINE LEARNING FOR MULTIMODAL INTERACTION, 2006, 4299 : 346 - +
  • [27] Investigating Various Diarization Algorithms for Speaker in the Wild (SITW) Speaker Recognition Challenge
    Liu, Yi
    Tian, Yao
    He, Liang
    Liu, Jia
    [J]. 17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 853 - 857
  • [28] ADAPTIVE AND ONLINE SPEAKER DIARIZATION FOR MEETING DATA
    Soldi, Giovanni
    Beaugeant, Christophe
    Evans, Nicholas
    [J]. 2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 2112 - 2116
  • [29] The IBM RT07 evaluation systems for speaker diarization on lecture meetings
    Huang, Jing
    Marcheret, Etienne
    Visweswariah, Karthik
    Potamianos, Gerasimos
    [J]. MULTIMODAL TECHNOLOGIES FOR PERCEPTION OF HUMANS, 2008, 4625 : 497 - 508
  • [30] Filter bank design for speaker diarization based on genetic algorithms
    Charbuillet, C.
    Gas, B.
    Chetouani, M.
    Zarader, J. L.
    [J]. 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, Vols 1-13, 2006, : 673 - 676