The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022

Citations: 0
Authors
Liu, Tao [1]
Xiang, Xu [2]
Chen, Zhengyang [1]
Han, Bing [1]
Yu, Kai [1]
Qian, Yanmin [1]
Affiliations
[1] Shanghai Jiao Tong Univ, AI Inst, MoE Key Lab Artificial Intelligence, X LANCE Lab, Shanghai, Peoples R China
[2] AISpeech Ltd, Suzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
speaker diarization; conversational; short-phrase;
DOI
10.1109/ISCSLP57327.2022.10037955
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper describes the X-Lance speaker diarization system submitted to the Conversational Short-phrase Speaker Diarization Challenge. The system outputs the ensemble results of four modules: a self-attentive VAD, uniform segmentation, an ECAPA-TDNN-based embedding extractor, and spectral clustering. We evaluated our system on the Conversational Short-phrase Speaker Diarization (CSSD) dataset, which is based on MagicData-RAMC and contains many short conversational phrases. Unlike other diarization challenges, this challenge proposes a metric called Conversational Diarization Error Rate (CDER), which focuses on evaluating short segments. In this paper, we analyze this metric and conduct extensive experiments. Our system achieves a CDER of 13.2% and 8.0% on the CSSD_dev and unseen CSSD eval sets, respectively.
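The uniform segmentation module named in the abstract can be illustrated with a minimal sketch. It assumes the VAD stage yields a list of (start, end) speech regions in seconds and cuts each region into fixed-length overlapping windows, one per embedding to be extracted. The function name `uniform_segments` and the values `window=1.5`, `shift=0.75`, and `min_len=0.2` are hypothetical choices for illustration, not settings reported in the paper.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def uniform_segments(
    speech_regions: List[Segment],
    window: float = 1.5,   # assumed window length, not from the paper
    shift: float = 0.75,   # assumed hop between windows, not from the paper
    min_len: float = 0.2,  # assumed minimum region length to keep
) -> List[Segment]:
    """Cut each VAD speech region into overlapping fixed-length windows."""
    segments: List[Segment] = []
    for start, end in speech_regions:
        # Skip regions too short to produce a useful embedding.
        if end - start < min_len:
            continue
        t = start
        while True:
            # The last window of a region is clipped to the region end.
            seg_end = min(t + window, end)
            segments.append((round(t, 3), round(seg_end, 3)))
            if seg_end >= end:
                break
            t += shift
    return segments


if __name__ == "__main__":
    regions = [(0.0, 3.0), (5.0, 5.1), (10.0, 10.5)]
    for seg in uniform_segments(regions):
        print(seg)
```

In a pipeline of this kind, each resulting window would be passed to the embedding extractor and the embeddings then clustered; shorter windows follow short phrases more closely but give noisier embeddings, which is the trade-off a short-segment metric such as CDER makes visible.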
Pages: 498-501
Page count: 4