TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge

Cited by: 2
Authors
Pang, Bowen [1 ]
Zhao, Huan [1 ]
Zhang, Gaosheng [2 ]
Yang, Xiaoyue [2 ]
Sun, Yang [2 ]
Zhang, Li [1 ]
Wang, Qing [1 ]
Xie, Lei [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Grp ASLP NPU, Xian, Peoples R China
[2] Shenzhen Transs Holding Ltd, Shenzhen, Peoples R China
Keywords
speaker diarization; spectral clustering; TS-VAD; EEND;
DOI
10.1109/ISCSLP57327.2022.10037846
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes the TSUP team's submission to the ISCSLP 2022 conversational short-phrase speaker diarization (CSSD) challenge, which focuses on short-phrase conversations and introduces a new evaluation metric, the conversational diarization error rate (CDER). In this challenge, we explore three typical kinds of speaker diarization systems: spectral clustering (SC) based diarization, target-speaker voice activity detection (TS-VAD), and end-to-end neural diarization (EEND). Our major findings are as follows. First, the SC approach is favored over the other two approaches under the new CDER metric. Second, hyperparameter tuning is essential to CDER for all three types of speaker diarization systems; specifically, CDER decreases as the sub-segment length is set longer. Finally, multi-system fusion through DOVER-LAP worsens CDER on the challenge data. Our submitted SC system ranked third in the challenge.
Pages: 502-506
Number of pages: 5