The Relevance of NIST Speaker Recognition Evaluations

Cited by: 0

Authors:

Asha, T. [1]

Murthy, Hema A. [1]

Affiliation:

[1] Indian Inst Technol, Madras 600036, Tamil Nadu, India

Source:

2014 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM) | 2014

Keywords:

DOI:

Not available

Chinese Library Classification:

TP301 [Theory and methods];

Discipline classification code:

081202 ;

Abstract:

Feature extraction and building the Universal Background Model (UBM) are crucial steps in building speaker verification/identification systems in the total variability subspace (TVS) framework. The motivation of this study is to analyze the significance of the various parameters involved in front-end processing for different databases. Several parameters are studied: the energy threshold for voice activity detection, the number of filters, the warping of the frequency scale, the number of cepstral coefficients, and the shape of the filter. Three databases, namely NIST 2003, NIST 2010 and NTIMIT, are studied. The optimal front-end obtained using NIST 2003 is observed to function well for NIST 2010, as conditions involving similar data were evaluated for both databases. On the other hand, the same optimal front-end is shown not to carry over to the NTIMIT database, which was collected in a different environment. The experiments performed in this paper indicate that the optimal front-end parameters are specific to a particular dataset. In addition, mismatch between development data and evaluation data is shown to result in a poor system. Given these results, the paper questions the relevance of the NIST Speaker Recognition Evaluations in real environments.
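
The front-end parameters listed in the abstract can be made concrete with a small sketch. The code below is not the authors' implementation; it is a minimal illustration, using only the Python standard library, of how an energy-threshold voice activity detector, a filterbank with a selectable frequency warping and filter shape, and a truncated cepstrum might be parameterized. The class name `FrontEndConfig`, all function names, and every default value are assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class FrontEndConfig:
    """Hypothetical parameter set mirroring the factors named in the abstract."""
    vad_energy_threshold_db: float = -30.0  # illustrative value, not from the paper
    num_filters: int = 24                   # illustrative value
    num_ceps: int = 13                      # illustrative value
    warping: str = "mel"                    # "mel" or "linear"
    filter_shape: str = "triangular"        # "triangular" or "rectangular"


def warp(freq_hz, kind):
    """Map Hz to the warped axis (common mel formula, or identity for linear)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0) if kind == "mel" else freq_hz


def unwarp(value, kind):
    """Inverse of warp()."""
    return 700.0 * (10.0 ** (value / 2595.0) - 1.0) if kind == "mel" else value


def energy_vad(frames, threshold_db):
    """Keep frames whose log energy is within threshold_db of the loudest frame."""
    energies = [10.0 * math.log10(sum(x * x for x in f) + 1e-12) for f in frames]
    peak = max(energies)
    return [f for f, e in zip(frames, energies) if e - peak >= threshold_db]


def filterbank(cfg, num_bins, sample_rate):
    """Filter weights over num_bins spectrum bins spanning 0 .. sample_rate / 2."""
    top = warp(sample_rate / 2.0, cfg.warping)
    # num_filters + 2 points equally spaced on the warped axis, mapped back to Hz.
    edges = [unwarp(top * i / (cfg.num_filters + 1), cfg.warping)
             for i in range(cfg.num_filters + 2)]
    bin_hz = [k * (sample_rate / 2.0) / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for m in range(1, cfg.num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if f <= lo or f >= hi:
                w = 0.0
            elif cfg.filter_shape == "rectangular":
                w = 1.0
            elif f <= mid:
                w = (f - lo) / (mid - lo)   # rising edge of the triangle
            else:
                w = (hi - f) / (hi - mid)   # falling edge of the triangle
            row.append(w)
        bank.append(row)
    return bank


def cepstra(power_spectrum, bank, num_ceps):
    """DCT-II of the log filterbank energies, truncated to num_ceps coefficients."""
    log_e = [math.log(sum(w * p for w, p in zip(row, power_spectrum)) + 1e-12)
             for row in bank]
    n = len(log_e)
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n) for j, e in enumerate(log_e))
            for c in range(num_ceps)]
```

In a sketch like this, each experiment in the study would correspond to varying one field of the configuration while holding the others fixed, then rebuilding the features for each database. Framing, windowing, and the power spectrum computation are assumed to happen upstream and are omitted here.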

Pages: 6
