The Relevance of NIST Speaker Recognition Evaluations

Cited by: 0

Authors:

Asha, T. [1]

Murthy, Hema A. [1]

Affiliations:

[1] Indian Inst Technol, Madras 600036, Tamil Nadu, India

Source:

2014 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM) | 2014

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

Feature extraction and the construction of the Universal Background Model (UBM) are crucial to building speaker verification and identification systems in the total variability subspace (TVS) framework. This study analyzes the significance of the parameters involved in front-end processing across different databases. Parameters such as the energy threshold for voice activity detection, the number of filters, the warping of the frequency scale, the number of cepstral coefficients, and the shape of the filters are studied. Three databases are used: NIST 2003, NIST 2010, and NTIMIT. The optimal front end obtained using NIST 2003 is observed to function well for NIST 2010, as conditions involving similar data were evaluated for both databases. In contrast, the same front end does not carry over to the NTIMIT database, which was collected in a different environment. The experiments indicate that the optimal front-end parameters are specific to a particular dataset. In addition, mismatch between development data and evaluation data is shown to result in poor system performance. Given these results, the paper questions the relevance of the NIST Speaker Recognition Evaluations in real environments.
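To make the front-end parameters named in the abstract concrete, the following is a minimal sketch of how such a parameter sweep could be organized. It is not the authors' implementation: the `FrontEndConfig` class, the `config_grid` helper, the peak-relative form of the energy threshold, and every numeric value in the grid are assumptions chosen for illustration. The mel formula shown is one commonly used warping, not necessarily the one used in the paper.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class FrontEndConfig:
    """Hypothetical container for the front-end parameters listed in the abstract."""
    vad_threshold_db: float  # frames more than this far below the peak energy are dropped
    num_filters: int         # number of filters in the filterbank
    warping: str             # frequency-scale warping, e.g. "mel" or "linear"
    num_ceps: int            # number of cepstral coefficients kept
    filter_shape: str        # e.g. "triangular"


def hz_to_mel(f_hz):
    """One commonly used mel warping of the frequency axis (an assumption here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def frame_energies_db(samples, frame_len, hop):
    """Mean-square energy of each full frame, in dB."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(energy + 1e-12))  # small floor avoids log(0)
    return energies


def energy_vad(samples, frame_len, hop, threshold_db):
    """Mark a frame as speech if its energy is within threshold_db of the loudest frame."""
    energies = frame_energies_db(samples, frame_len, hop)
    if not energies:
        return []
    peak = max(energies)
    return [e >= peak - threshold_db for e in energies]


def config_grid():
    """Enumerate candidate front ends; all values below are illustrative only."""
    thresholds = [20.0, 30.0]
    filters = [24, 40]
    warpings = ["mel", "linear"]
    ceps = [13, 19]
    shapes = ["triangular"]
    return [FrontEndConfig(*combo)
            for combo in product(thresholds, filters, warpings, ceps, shapes)]
```

In a study of this kind, each configuration from `config_grid()` would be used to extract features and train a system per database, and the resulting error rates would be compared across databases. That evaluation loop is omitted here because the record gives no details about it.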

Pages: 6
