The Third DIHARD Diarization Challenge

Cited by: 18

Authors:

Ryant, Neville [1]

Singh, Prachi [2]

Krishnamohan, Venkat [2]

Varma, Rajat [2]

Church, Kenneth [3]

Cieri, Christopher [1]

Du, Jun [4]

Ganapathy, Sriram [2]

Liberman, Mark [1]

Affiliations:

[1] Univ Penn, Linguist Data Consortium, Philadelphia, PA 19104 USA

[2] Indian Inst Sci, LEAP Lab, Elect Engn, Bangalore, Karnataka, India

[3] Baidu Res, Sunnyvale, CA USA

[4] Univ Sci & Technol China, Hefei, Peoples R China

Source:

INTERSPEECH 2021 | 2021

Keywords:

speaker diarization; speaker recognition; robust ASR; noise; conversational speech; DIHARD challenge; SYSTEM; INDEX;

DOI:

10.21437/Interspeech.2021-1208

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

DIHARD III was the third in a series of speaker diarization challenges intended to improve the robustness of diarization systems to variability in recording equipment, noise conditions, and conversational domain. Speaker diarization was evaluated under two speech activity conditions (diarization from a reference speech activity vs. diarization from scratch) and 11 diverse domains. The domains span a range of recording conditions and interaction types, including read audio-books, meeting speech, clinical interviews, web videos, and, for the first time, conversational telephone speech. A total of 30 organizations (forming 21 teams) from industry and academia submitted 499 valid system outputs. The evaluation results indicate that speaker diarization has improved markedly since DIHARD I, particularly for two-party interactions, but that for many domains (e.g., web video) the problem remains far from solved.

Pages: 3570-3574

Page count: 5

