LEAP Submission for the Third DIHARD Diarization Challenge

Authors
Singh, Prachi [1]
Varma, Rajat [1]
Krishnamohan, Venkat [1]
Chetupalli, Srikanth Raj [1]
Ganapathy, Sriram [1]
Affiliations
[1] Indian Institute of Science, Learning and Extraction of Acoustic Patterns (LEAP) Lab, Electrical Engineering, Bangalore, Karnataka, India
Source
INTERSPEECH 2021
Keywords
speaker diarization; end-to-end system; x-vectors; path integral clustering
DOI
10.21437/Interspeech.2021-728
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
This paper describes the LEAP submission to the DIHARD-III challenge. The proposed system consists of a speech bandwidth classifier and two diarization systems, fine-tuned separately for narrowband and wideband speech. For the narrowband conversational telephone speech recordings, we use an end-to-end speaker diarization system. For the wideband multi-speaker recordings, we use a neural-embedding-based clustering approach similar to the baseline system: the embeddings (called x-vectors) are extracted from a time-delay neural network and then clustered with the graph-based path integral clustering (PIC) approach. The LEAP system showed relative improvements of 24% on Track-1 and 18% on Track-2 over the baseline system provided by the organizers. The paper covers the challenge submission, the post-evaluation analysis, and the improvements observed on the DIHARD-III dataset.
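The routing idea in the abstract (classify the bandwidth first, then hand the recording to the matching diarizer) can be sketched roughly as below. This is only an illustration under stated assumptions, not the authors' implementation: `BandwidthRouter`, the `classify` callable, the `"narrowband"`/`"wideband"` labels, and the segment tuple format are all hypothetical names introduced here. The helper `relative_improvement` shows how a relative gain over a baseline is usually computed, assuming a lower-is-better error metric; the abstract does not name the metric.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# (start_sec, end_sec, speaker_label); a hypothetical output format.
Segment = Tuple[float, float, str]
Diarizer = Callable[[str], List[Segment]]


@dataclass
class BandwidthRouter:
    """Send each recording to the diarizer tuned for its bandwidth."""

    # Hypothetical classifier: returns "narrowband" or "wideband".
    classify: Callable[[str], str]
    # One diarizer per bandwidth class, e.g. an end-to-end model for
    # narrowband and an x-vector + clustering pipeline for wideband.
    systems: Dict[str, Diarizer]

    def diarize(self, recording: str) -> List[Segment]:
        band = self.classify(recording)
        if band not in self.systems:
            raise ValueError(f"no diarizer registered for bandwidth {band!r}")
        return self.systems[band](recording)


def relative_improvement(baseline_err: float, system_err: float) -> float:
    """Relative error reduction over the baseline, in percent.

    Assumes a lower-is-better error metric.
    """
    if baseline_err <= 0:
        raise ValueError("baseline error must be positive")
    return 100.0 * (baseline_err - system_err) / baseline_err
```

Passing the classifier and the two diarizers in as callables keeps the sketch independent of any particular model, so either branch can be swapped out without touching the routing logic.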
Pages: 3545-3549
Page count: 5