An Iterative Speaker Re-Diarization Scheme for Improving Speaker-Based Entity Extraction in Multimedia Archives

Cited by: 0
Authors
Ghaemmaghami, Houman [1]
Dean, David [1]
Sridharan, Sridha [1]
Affiliation
[1] Queensland University of Technology, Speech and Audio Research Laboratory, Brisbane, QLD, Australia
Funding
Australian Research Council
Keywords
speaker re-diarization; diarization; speaker linking; complete-linkage clustering; cross-likelihood ratio
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we present a novel scheme for improving speaker diarization by making use of repeating speakers across multiple recordings within a large corpus. We call this technique speaker re-diarization and demonstrate that it is possible to reuse the initial speaker-linked diarization outputs to boost diarization accuracy within individual recordings. We first propose and evaluate two novel re-diarization techniques. We demonstrate their complementary characteristics and fuse the two techniques to successfully conduct speaker re-diarization across the SAIVT-BNEWS corpus of Australian broadcast data. This corpus contains recurring speakers in various independent recordings that need to be linked across the dataset. We show that our speaker re-diarization approach can provide a relative improvement of 23% in diarization error rate (DER), over the original diarization results, as well as improve the estimated number of speakers and the cluster purity and coverage metrics.
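The 23% figure above is a relative improvement, meaning the reduction in DER expressed as a fraction of the original DER rather than a drop in absolute percentage points. A minimal sketch of that arithmetic follows. The function name `relative_improvement` and both DER values are hypothetical, chosen only to illustrate the calculation; they are not taken from the paper.

```python
def relative_improvement(baseline_der: float, new_der: float) -> float:
    """Return the relative reduction in DER as a fraction of the baseline."""
    if baseline_der <= 0:
        raise ValueError("baseline DER must be positive")
    return (baseline_der - new_der) / baseline_der


if __name__ == "__main__":
    # Hypothetical DER values (assumptions for illustration, not reported results).
    baseline = 0.20    # DER of the original diarization
    rediarized = 0.154  # DER after re-diarization
    gain = relative_improvement(baseline, rediarized)
    print(f"relative DER improvement: {gain:.0%}")
```

With these illustrative numbers the absolute drop is 4.6 percentage points, while the relative improvement is 0.046 / 0.20, which is the kind of ratio the abstract reports.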
Pages: 577-581
Page count: 5