LEAP Diarization System for the Second DIHARD Challenge

Cited by: 9
Authors
Singh, Prachi [1]
Vardhan, Harsha M. A. [1]
Ganapathy, Sriram [1]
Kanagasundaram, Ahilan [2]
Affiliations
[1] Indian Inst Sci, Learning & Extract Acoust Patterns LEAP Lab, Bangalore, Karnataka, India
[2] Univ Jaffna, Jaffna, Sri Lanka
Keywords
Speaker Diarization; i-vector; x-vector; HMM-VB; PLDA
DOI
10.21437/Interspeech.2019-2716
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
This paper presents the LEAP System, developed for the Second DIHARD diarization Challenge. The evaluation data in the challenge is composed of multi-talker speech in restaurants, doctor-patient conversations, child language acquisition recordings in home environments, and audio extracted from YouTube videos. The LEAP system is developed using two types of embeddings, one based on i-vector representations and the other based on x-vector representations. The initial diarization output, obtained using agglomerative hierarchical clustering (AHC) on the probabilistic linear discriminant analysis (PLDA) scores, is refined using the Variational-Bayes hidden Markov model (VB-HMM). We propose a modified VB-HMM with posterior scaling, which provides significant improvements in the final diarization error rate (DER). We also use domain compensation on the i-vector features to reduce the mismatch between training and evaluation conditions. Using the proposed approaches, we obtain relative improvements in DER of about 7.1% for the best individual system over the DIHARD baseline system and about 13.7% for the final system combination on the evaluation set. An analysis performed using the proposed posterior scaling method shows that scaling results in improved discrimination among the HMM states in the VB-HMM.
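As an illustration of the initial clustering stage mentioned in the abstract, the following is a minimal sketch of agglomerative hierarchical clustering over a pairwise similarity matrix. It is not the authors' implementation: the average-linkage rule, the stopping threshold of 0.0, the function name `ahc`, and the toy score matrix (standing in for PLDA scores between speech segments) are all assumptions made for the example.

```python
from typing import List


def ahc(scores: List[List[float]], threshold: float) -> List[int]:
    """Average-linkage AHC over a symmetric similarity matrix (hypothetical helper).

    Repeatedly merges the most similar pair of clusters until no pair
    scores above `threshold`, then returns one cluster label per segment.
    """
    clusters = [[i] for i in range(len(scores))]

    def avg_score(a: List[int], b: List[int]) -> float:
        # Mean pairwise similarity between the members of two clusters.
        return sum(scores[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best, best_pair = None, None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_score(clusters[x], clusters[y])
                if best is None or s > best:
                    best, best_pair = s, (x, y)
        if best <= threshold:
            break  # no remaining pair is similar enough to merge
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    labels = [0] * len(scores)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy similarity scores (assumed values, not real PLDA output):
# segments 0 and 1 score high together, as do segments 2 and 3.
toy_scores = [
    [0.0, 5.0, -4.0, -3.0],
    [5.0, 0.0, -2.0, -5.0],
    [-4.0, -2.0, 0.0, 6.0],
    [-3.0, -5.0, 6.0, 0.0],
]
labels = ahc(toy_scores, threshold=0.0)
print(labels)
```

In a pipeline like the one the abstract describes, labels of this kind would serve only as an initialization that a later resegmentation stage refines; the threshold controls how many speakers the clustering ends up with.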
Pages: 983-987
Page count: 5