LEAP Submission for the Third DIHARD Diarization Challenge

Cited by: 0
Authors
Singh, Prachi [1]
Varma, Rajat [1]
Krishnamohan, Venkat [1]
Chetupalli, Srikanth Raj [1]
Ganapathy, Sriram [1]
Affiliations
[1] Indian Inst Sci, Learning & Extract Acoust Patterns LEAP Lab, Elect Engn, Bangalore, Karnataka, India
Source
Keywords
speaker diarization; end-to-end system; x-vectors; path integral clustering
DOI
10.21437/Interspeech.2021-728
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
This paper describes the LEAP submission to the DIHARD-III challenge. The proposed system consists of a speech bandwidth classifier and diarization systems fine-tuned separately for narrowband and wideband speech. We use an end-to-end speaker diarization system for the narrowband conversational telephone speech recordings. For the wideband multi-speaker recordings, we use a neural embedding based clustering approach similar to the baseline system: the embeddings (x-vectors) are extracted from a time-delay neural network and then clustered with the graph-based path integral clustering (PIC) approach. The LEAP system showed 24% and 18% relative improvements over the baseline system provided by the organizers for Track-1 and Track-2, respectively. We present the challenge submission, the post-evaluation analysis, and the improvements observed on the DIHARD-III dataset.
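As a reading aid for the relative improvements quoted in the abstract, the short sketch below shows how such a figure is typically computed. It assumes the underlying metric is an error rate where lower is better (for example, diarization error rate); the abstract does not name the metric. The function name and the numeric values are hypothetical and chosen only to illustrate the arithmetic; they are not results from the paper.

```python
def relative_improvement(baseline_error: float, system_error: float) -> float:
    """Return the relative improvement (in percent) of a system over a baseline.

    Assumes an error-type metric where lower is better.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - system_error) / baseline_error * 100.0


# Hypothetical error rates, NOT taken from the paper: a baseline at 20.0%
# and a system at 15.2% correspond to a relative improvement of about 24%.
example = relative_improvement(20.0, 15.2)
print(f"{example:.1f}% relative improvement")
```

Note that a relative improvement is measured against the baseline's own error, so the same percentage can correspond to very different absolute gains on different tracks.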
Pages: 3545-3549
Number of pages: 5
Related papers
50 in total
  • [11] Speaker Diarization with Deep Speaker Embeddings for DIHARD Challenge II
    Novoselov, Sergey
    Gusev, Aleksei
    Ivanov, Artem
    Pekhovsky, Timur
    Shulipa, Andrey
    Avdeeva, Anastasia
    Gorlanov, Artem
    Kozlov, Alexandr
    [J]. INTERSPEECH 2019, 2019, : 1003 - 1007
  • [12] Estimation of the Number of Speakers with Variational Bayesian PLDA in the DIHARD Diarization Challenge
    Vinals, Ignacio
    Gimeno, Pablo
    Ortega, Alfonso
    Miguel, Antonio
    Lleida, Eduardo
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2803 - 2807
  • [13] Scenario-Dependent Speaker Diarization for DIHARD-III Challenge
    Wang, Yu-Xuan
    Du, Jun
    He, Mao-Kui
    Niu, Shu-Tong
    Sun, Lei
    Lee, Chin-Hui
    [J]. INTERSPEECH 2021, 2021, : 3106 - 3110
  • [14] ZCU-NTIS Speaker Diarization System for the DIHARD 2018 Challenge
    Zajic, Zbynek
    Kunesova, Marie
    Zelinka, Jan
    Hruz, Marek
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2788 - 2792
  • [15] UWB-NTIS Speaker Diarization System for the DIHARD II 2019 Challenge
    Zajic, Zbynek
    Kunesova, Marie
    Hruz, Marek
    Vanek, Jan
    [J]. INTERSPEECH 2019, 2019, : 993 - 997
  • [16] Diarization is Hard: Some Experiences and Lessons Learned for the JHU Team in the Inaugural DIHARD Challenge
    Sell, Gregory
    Snyder, David
    McCree, Alan
    Garcia-Romero, Daniel
    Villalba, Jesus
    Maciejewski, Matthew
    Manohar, Vimal
    Dehak, Najim
    Povey, Daniel
    Watanabe, Shinji
    Khudanpur, Sanjeev
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2808 - 2812
  • [17] Joint Discriminative Embedding Learning, Speech Activity and Overlap Detection for the DIHARD Speaker Diarization Challenge
    Miasato Filho, Valter A.
    Silva, Diego A.
    Cuozzo, Luis Gustavo D.
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2818 - 2822
  • [18] OPTIMIZING BAYESIAN HMM BASED X-VECTOR CLUSTERING FOR THE SECOND DIHARD SPEECH DIARIZATION CHALLENGE
    Diez, Mireia
    Burget, Lukas
    Landini, Federico
    Wang, Shuai
    Cernocky, Honza
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6519 - 6523
  • [19] DEEP LEARNING METHODS FOR UNSUPERVISED ACOUSTIC MODELING - LEAP SUBMISSION TO ZEROSPEECH CHALLENGE 2017
    Ansari, T. K.
    Kumar, Rajath
    Singh, Sonali
    Ganapathy, Sriram
    [J]. 2017 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2017, : 754 - 761
  • [20] ANALYSIS OF THE BUT DIARIZATION SYSTEM FOR VOXCONVERSE CHALLENGE
    Landini, Federico
    Glembek, Ondrej
    Matejka, Pavel
    Rohdin, Johan
    Burget, Lukas
    Diez, Mireia
    Silnova, Anna
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5819 - 5823