THE LEAP SPEAKER RECOGNITION SYSTEM FOR NIST SRE 2018 CHALLENGE

Citations: 0
Authors
Ramoji, Shreyas [1]
Mohan, Anand [1]
Mysore, Bhargavram [2]
Bhatia, Anmol [3]
Singh, Prachi [1]
Vardhan, Harsha [1]
Ganapathy, Sriram [1]
Affiliations
[1] Indian Inst Sci, Elect Engn, Learning & Extract Acoust Patterns LEAP Lab, Bengaluru, India
[2] North Carolina State Univ, Raleigh, NC USA
[3] Birla Inst Technol & Sci BITS Pilani, Pilani, Rajasthan, India
Keywords
x-vectors; Speaker Diarization; PLDA scoring; Gaussian back-end; Dimensionality Reduction; Speaker Verification; SUPPORT VECTOR MACHINES; VERIFICATION; END;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The NIST Speaker Recognition Evaluation (SRE) 2018 challenge comprises an open evaluation of the text-independent speaker verification task. This paper summarizes the LEAP speaker verification systems submitted to NIST SRE 2018. For all the speaker verification approaches, the front-end feature extraction uses neural embeddings from a time delay neural network (TDNN) trained on a speaker discrimination task. These features, called x-vectors, are used in multiple ways for the speaker verification task. In the first approach, the x-vectors, after pre-processing and dimensionality reduction, are scored using probabilistic linear discriminant analysis (PLDA). The second approach applies a speaker diarization scheme to the test segments containing multiple talkers before PLDA-based speaker verification scoring. The third system uses a local pairwise LDA model to pre-process the x-vectors, which are then scored using a Gaussian back-end. With experiments on the SRE 2018 database, we show that most of the systems achieved noticeable improvements over the NIST baseline in terms of the primary cost metric. Using a system fusion of the various approaches, we obtain significant improvements over the NIST official baseline (average relative improvements of 19.7% and 20.1% for the development and evaluation sets, respectively).
Pages: 5771-5775
Number of pages: 5