The Relevance of NIST Speaker Recognition Evaluations

Authors
Asha, T. [1]
Murthy, Hema A. [1]
Affiliation
[1] Indian Inst Technol, Madras 600036, Tamil Nadu, India
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Feature extraction and building the Universal Background Model (UBM) are crucial for building speaker verification/identification systems in the total variability subspace (TVS) framework. The motivation of this study is to analyze the significance of the various parameters involved in front-end processing for different databases. Several parameters are studied: the energy threshold for voice activity detection, the number of filters, the warping of the frequency scale, the number of cepstral coefficients, and the shape of the filter. Three databases are studied, namely NIST 2003, NIST 2010, and NTIMIT. The optimal front end obtained using NIST 2003 is observed to function well for NIST 2010, as conditions involving similar data were evaluated for both databases. On the other hand, the same optimal front end is shown not to transfer to the NTIMIT database, which was collected in a different environment. The experiments performed in this paper indicate that the optimal front-end parameters are specific to a particular dataset. In addition, mismatch between development data and evaluation data is shown to result in a poor system. Given the results, the paper questions the relevance of the NIST Speaker Recognition Evaluations in real environments.
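To make the front-end parameters named in the abstract concrete, the following is a minimal sketch of a cepstral front end in which each of them is an explicit argument. It is not the authors' implementation: the function names, the default values (8 kHz sampling, 24 filters, 13 coefficients, a -30 dB threshold relative to the loudest frame), and the choice of mel versus linear warping and triangular versus rectangular filters are all assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filterbank(n_filters, n_fft, sr, warp="mel", shape="triangular"):
    """Return filter weights of shape (n_filters, n_fft // 2 + 1)."""
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    if warp == "mel":
        # Band edges equally spaced on the mel scale.
        edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_filters + 2))
    else:
        # Band edges equally spaced in Hz (no warping).
        edges = np.linspace(0.0, sr / 2.0, n_filters + 2)
    fb = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        if shape == "triangular":
            rise = (freqs - lo) / (mid - lo)
            fall = (hi - freqs) / (hi - mid)
            fb[i] = np.clip(np.minimum(rise, fall), 0.0, None)
        else:
            # Rectangular filter spanning the same band.
            fb[i] = ((freqs >= lo) & (freqs <= hi)).astype(float)
    return fb

def frame_signal(x, frame_len, hop):
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def cepstra(x, sr=8000, frame_len=200, hop=80, n_fft=256,
            n_filters=24, n_ceps=13, warp="mel", shape="triangular",
            vad_db=-30.0):
    """Cepstral features for the frames that pass an energy-based VAD."""
    frames = frame_signal(x, frame_len, hop) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2
    # Energy-based VAD: keep frames within |vad_db| dB of the loudest frame.
    energy_db = 10.0 * np.log10(power.sum(axis=1) + 1e-12)
    keep = energy_db > energy_db.max() + vad_db
    fb = filterbank(n_filters, n_fft, sr, warp, shape)
    log_energy = np.log(power[keep] @ fb.T + 1e-12)
    # Unnormalised DCT-II over the filter outputs, truncated to n_ceps terms.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.arange(n_ceps)[:, None] * (n + 0.5) / n_filters)
    return log_energy @ dct.T

# Usage: compare two hypothetical front-end configurations on the same signal.
rng = np.random.default_rng(0)
signal = np.concatenate([1e-4 * rng.standard_normal(4000),
                         rng.standard_normal(4000)])
feats_a = cepstra(signal, n_filters=24, n_ceps=13, warp="mel")
feats_b = cepstra(signal, n_filters=40, n_ceps=20, warp="linear",
                  shape="rectangular", vad_db=-20.0)
```

A sweep of the kind the abstract describes would then amount to running such a function over a grid of these arguments per database and scoring the resulting system on each; the scoring back end (UBM and total variability modelling) is outside the scope of this sketch.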
Pages: 6