LEAP Diarization System for the Second DIHARD Challenge

Cited by: 9
Authors
Singh, Prachi [1 ]
Vardhan, Harsha M. A. [1 ]
Ganapathy, Sriram [1 ]
Kanagasundaram, Ahilan [2 ]
Affiliations
[1] Indian Inst Sci, Learning & Extract Acoust Patterns LEAP Lab, Bangalore, Karnataka, India
[2] Univ Jaffna, Jaffna, Sri Lanka
Source
Keywords
Speaker Diarization; i-vector; x-vector; HMM-VB; PLDA
DOI
10.21437/Interspeech.2019-2716
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
This paper presents the LEAP system, developed for the Second DIHARD Diarization Challenge. The evaluation data in the challenge is composed of multi-talker speech in restaurants, doctor-patient conversations, child language acquisition recordings in home environments, and audio extracted from YouTube videos. The LEAP system is developed using two types of embeddings, one based on i-vector representations and the other on x-vector representations. The initial diarization output, obtained using agglomerative hierarchical clustering (AHC) on the probabilistic linear discriminant analysis (PLDA) scores, is refined using the Variational-Bayes hidden Markov model (VB-HMM). We propose a modified VB-HMM with posterior scaling, which provides significant improvements in the final diarization error rate (DER). We also use domain compensation on the i-vector features to reduce the mismatch between training and evaluation conditions. Using the proposed approaches, we obtain relative improvements in DER of about 7.1% for the best individual system over the DIHARD baseline system and about 13.7% for the final system combination on the evaluation set. An analysis performed using the proposed posterior scaling method shows that scaling results in improved discrimination among the HMM states in the VB-HMM.
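As an illustration of the initial clustering step mentioned in the abstract, the following is a minimal sketch of agglomerative hierarchical clustering over a pairwise score matrix. It is not the authors' implementation: the average-linkage rule, the stopping threshold, the function name `ahc`, and the toy `scores` matrix (a stand-in for PLDA scores between segment embeddings) are all assumptions chosen for the example.

```python
def ahc(scores, threshold):
    """Average-linkage AHC on a symmetric similarity matrix (assumed linkage rule).

    scores[i][j] is the similarity between segments i and j (higher = more alike).
    Merging stops once the best available merge scores below `threshold`.
    Returns one cluster label per segment.
    """
    clusters = [[i] for i in range(len(scores))]
    while len(clusters) > 1:
        best, pair = None, None
        # Find the pair of clusters with the highest average cross-similarity.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [scores[i][j] for i in clusters[a] for j in clusters[b]]
                avg = sum(sims) / len(sims)
                if best is None or avg > best:
                    best, pair = avg, (a, b)
        if best < threshold:
            break
        a, b = pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(scores)
    for k, members in enumerate(clusters):
        for i in members:
            labels[i] = k
    return labels


# Toy scores (hypothetical): segments 0 and 1 match, segments 2 and 3 match.
scores = [
    [0.0, 5.0, -3.0, -3.0],
    [5.0, 0.0, -3.0, -3.0],
    [-3.0, -3.0, 0.0, 4.0],
    [-3.0, -3.0, 4.0, 0.0],
]
print(ahc(scores, threshold=0.0))  # [0, 0, 1, 1]
```

In a pipeline like the one described, labels of this kind would serve only as an initialization; the abstract states that the refinement is done by the VB-HMM, which this sketch does not cover.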
Pages: 983-987
Page count: 5
Related papers
50 in total
  • [1] BUT SYSTEM FOR THE SECOND DIHARD SPEECH DIARIZATION CHALLENGE
    Landini, Federico
    Wang, Shuai
    Diez, Mireia
    Burget, Lukas
    Matejka, Pavel
    Zmolikova, Katerina
    Mosner, Ladislav
    Silnova, Anna
    Plchot, Oldrich
    Novotny, Ondrej
    Zeinali, Hossein
    Rohdin, Johan
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6529 - 6533
  • [2] LEAP Submission for the Third DIHARD Diarization Challenge
    Singh, Prachi
    Varma, Rajat
    Krishnamohan, Venkat
    Chetupalli, Srikanth Raj
    Ganapathy, Sriram
    [J]. INTERSPEECH 2021, 2021, : 3545 - 3549
  • [3] BUT system for DIHARD Speech Diarization Challenge 2018
    Diez, Mireia
    Landini, Federico
    Burget, Lukas
    Rohdin, Johan
    Silnova, Anna
    Zmolikova, Katerina
    Novotny, Ondrej
    Vesely, Karel
    Glembek, Ondrej
    Plchot, Oldrich
    Mosner, Ladislav
    Matejka, Pavel
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2798 - 2802
  • [4] The Second DIHARD Diarization Challenge: Dataset, task, and baselines
    Ryant, Neville
    Church, Kenneth
    Cieri, Christopher
    Cristia, Alejandrina
    Du, Jun
    Ganapathy, Sriram
    Liberman, Mark
    [J]. INTERSPEECH 2019, 2019, : 978 - 982
  • [5] ViVoLAB Speaker Diarization System for the DIHARD 2019 Challenge
    Vinals, Ignacio
    Gimeno, Pablo
    Ortega, Alfonso
    Miguel, Antonio
    Lleida, Eduardo
    [J]. INTERSPEECH 2019, 2019, : 988 - 992
  • [6] The Third DIHARD Diarization Challenge
    Ryant, Neville
    Singh, Prachi
    Krishnamohan, Venkat
    Varma, Rajat
    Church, Kenneth
    Cieri, Christopher
    Du, Jun
    Ganapathy, Sriram
    Liberman, Mark
    [J]. INTERSPEECH 2021, 2021, : 3570 - 3574
  • [7] ZCU-NTIS Speaker Diarization System for the DIHARD 2018 Challenge
    Zajic, Zbynek
    Kunesova, Marie
    Zelinka, Jan
    Hruz, Marek
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2788 - 2792
  • [8] Speaker Diarization with Enhancing Speech for the First DIHARD Challenge
    Sun, Lei
    Du, Jun
    Jiang, Chao
    Zhang, Xueyang
    He, Shan
    Yin, Bing
    Lee, Chin-Hui
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2793 - 2797
  • [9] UWB-NTIS Speaker Diarization System for the DIHARD II 2019 Challenge
    Zajic, Zbynek
    Kunesova, Marie
    Hruz, Marek
    Vanek, Jan
    [J]. INTERSPEECH 2019, 2019, : 993 - 997
  • [10] INVESTIGATING DEEP NEURAL NETWORKS FOR SPEAKER DIARIZATION IN THE DIHARD CHALLENGE
    Himawan, Ivan
    Rahman, Md Hafizur
    Sridharan, Sridha
    Fookes, Clinton
    Kanagasundaram, Ahilan
    [J]. 2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, : 1029 - 1035