THE LEAP SPEAKER RECOGNITION SYSTEM FOR NIST SRE 2018 CHALLENGE

Cited by: 0

Authors:

Ramoji, Shreyas [1]

Mohan, Anand [1]

Mysore, Bhargavram [2]

Bhatia, Anmol [3]

Singh, Prachi [1]

Vardhan, Harsha [1]

Ganapathy, Sriram [1]

Affiliations:

[1] Indian Inst Sci, Elect Engn, Learning & Extract Acoust Patterns LEAP Lab, Bengaluru, India

[2] North Carolina State Univ, Raleigh, NC USA

[3] Birla Inst Technol & Sci BITS Pilani, Pilani, Rajasthan, India

Source:

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2019

Keywords:

x-vectors; Speaker Diarization; PLDA scoring; Gaussian back-end; Dimensionality Reduction; Speaker Verification; SUPPORT VECTOR MACHINES; VERIFICATION; END;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

The NIST Speaker Recognition Evaluation (SRE) 2018 challenge comprises an open evaluation of the text-independent speaker verification task. This paper summarizes the LEAP speaker verification systems submitted to the NIST SRE 2018. For all the speaker verification approaches, the front-end feature extraction involved the use of neural embeddings from a time delay neural network (TDNN) trained on a speaker discrimination task. These features, called x-vectors, are used in multiple ways for the speaker verification task. In the first approach, the x-vectors, after pre-processing and dimensionality reduction, are used with probabilistic linear discriminant analysis (PLDA) scoring. The second approach applies a speaker diarization scheme to the test segments containing multiple talkers before speaker verification scoring based on PLDA. The third system uses a local pairwise LDA model for pre-processing the x-vectors, which are then scored using a Gaussian back-end. With experiments on the SRE 2018 database, we show that most of the systems achieved noticeable improvements over the NIST baseline in terms of the primary cost metric. Using a system fusion of the various approaches, we obtain significant improvements over the NIST official baseline (average relative improvements of 19.7% and 20.1% for the development and evaluation sets, respectively).
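
The abstract mentions a fusion of several systems and reports relative improvements over a baseline cost. The sketch below illustrates those two ideas in their simplest form, under stated assumptions: the fusion is shown as a plain weighted sum of per-trial scores (the abstract does not say how the fusion was actually performed), and the function names, weights, scores, and cost values are hypothetical placeholders, not figures from the paper.

```python
from typing import List, Sequence


def fuse_scores(system_scores: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Weighted sum of per-trial scores from several systems.

    Assumption: a linear score-level fusion. The abstract does not
    specify the fusion method that was actually used.
    """
    if len(system_scores) != len(weights):
        raise ValueError("one weight per system is required")
    n_trials = len(system_scores[0])
    if any(len(scores) != n_trials for scores in system_scores):
        raise ValueError("all systems must score the same trials")
    return [
        sum(w * scores[t] for w, scores in zip(weights, system_scores))
        for t in range(n_trials)
    ]


def relative_improvement(baseline_cost: float, system_cost: float) -> float:
    """Percentage reduction in cost relative to a baseline (lower is better)."""
    if baseline_cost <= 0:
        raise ValueError("baseline cost must be positive")
    return 100.0 * (baseline_cost - system_cost) / baseline_cost


if __name__ == "__main__":
    # Hypothetical scores from two systems on the same two trials.
    fused = fuse_scores([[1.0, -1.0], [3.0, 1.0]], weights=[0.5, 0.5])
    print(fused)
    # Hypothetical cost values, chosen only to illustrate the formula.
    print(round(relative_improvement(0.5, 0.4), 1))
```

With equal weights the fused score is just the mean of the system scores; in practice the weights would typically be tuned on a development set.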

Pages: 5771-5775

Page count: 5


