The Third DIHARD Diarization Challenge

Cited by: 18
Authors
Ryant, Neville [1]
Singh, Prachi [2]
Krishnamohan, Venkat [2]
Varma, Rajat [2]
Church, Kenneth [3]
Cieri, Christopher [1]
Du, Jun [4]
Ganapathy, Sriram [2]
Liberman, Mark [1]
Affiliations
[1] Univ Penn, Linguist Data Consortium, Philadelphia, PA 19104 USA
[2] Indian Inst Sci, LEAP Lab, Elect Engn, Bangalore, Karnataka, India
[3] Baidu Res, Sunnyvale, CA USA
[4] Univ Sci & Technol China, Hefei, Peoples R China
Source
Keywords
speaker diarization; speaker recognition; robust ASR; noise; conversational speech; DIHARD challenge; SYSTEM; INDEX;
DOI
10.21437/Interspeech.2021-1208
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain. Speaker diarization was evaluated under two speech activity conditions (diarization from a reference speech activity vs. diarization from scratch) and 11 diverse domains. The domains span a range of recording conditions and interaction types, including read audio-books, meeting speech, clinical interviews, web videos, and, for the first time, conversational telephone speech. A total of 30 organizations (forming 21 teams) from industry and academia submitted 499 valid system outputs. The evaluation results indicate that speaker diarization has improved markedly since DIHARD I, particularly for two-party interactions, but that for many domains (e.g., web video) the problem remains far from solved.
Pages: 3570-3574
Number of pages: 5