The Relevance of NIST Speaker Recognition Evaluations

Cited by: 0
Authors
Asha, T. [1]
Murthy, Hema A. [1]
Affiliations
[1] Indian Inst Technol, Madras 600036, Tamil Nadu, India
Keywords
(none listed)
DOI
(none available)
CLC classification
TP301 [theory and methods]
Discipline code
081202
Abstract
Feature extraction and construction of the Universal Background Model (UBM) are crucial for building speaker verification/identification systems in the total variability subspace (TVS) framework. The motivation of this study is to analyze the significance of the various parameters involved in front-end processing across different databases. Several parameters are studied: the energy threshold for voice activity detection, the number of filters, the warping of the frequency scale, the number of cepstral coefficients, and the shape of the filter. Three databases are examined: NIST 2003, NIST 2010, and NTIMIT. The optimal front end obtained on NIST 2003 is observed to work well for NIST 2010, as conditions involving similar data were evaluated for both databases. On the other hand, the same optimal front end is shown not to scale to the NTIMIT database, which was collected in a different environment. The experiments in this paper indicate that the optimal front-end parameters are specific to a particular dataset. In addition, a mismatch between development data and evaluation data is shown to result in a poor system. Given these results, the paper questions the relevance of the NIST Speaker Recognition Evaluations in real environments.
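One of the front-end parameters the abstract highlights is the energy threshold for voice activity detection. As an illustrative sketch only (not the authors' actual front end — the frame size, hop, and 30 dB threshold here are assumptions), an energy-based VAD can be written in a few lines of NumPy: frames whose log energy falls within a fixed margin of the loudest frame are kept as speech.

```python
import numpy as np

def energy_vad(signal, frame_len=400, hop=160, threshold_db=30.0):
    """Energy-based voice activity detection (illustrative sketch).

    Frames with log energy within `threshold_db` of the loudest frame
    are marked as speech; everything else is treated as silence.
    Frame and hop sizes correspond to 25 ms / 10 ms at 16 kHz.
    """
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    # Per-frame log energy in dB (epsilon avoids log of zero).
    energy_db = 10.0 * np.log10(np.sum(frames ** 2, axis=1) + 1e-12)
    return energy_db > energy_db.max() - threshold_db

# Toy example: near-silence, then a loud tone, then near-silence again.
rng = np.random.default_rng(0)
sig = np.concatenate([
    0.001 * rng.standard_normal(1600),                   # near-silence
    np.sin(2 * np.pi * 440 * np.arange(1600) / 16000),   # speech-like tone
    0.001 * rng.standard_normal(1600),                   # near-silence
])
mask = energy_vad(sig)  # boolean array: True where a frame is kept as speech
```

The study's observation that the optimal threshold is dataset-specific follows naturally from this formulation: a margin tuned for clean telephone speech will pass too many frames on a noisier corpus such as NTIMIT.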
Pages: 6
Related papers (50 total)
  • [1] NIST speech processing evaluations: LVCSR, speaker recognition, language recognition
    Martin, Alvin F.
    Garofolo, John S.
    2007 IEEE WORKSHOP ON SIGNAL PROCESSING APPLICATIONS FOR PUBLIC SECURITY AND FORENSICS, 2007, : 32 - +
  • [2] NIST and NFI-TNO evaluations of automatic speaker recognition
    van Leeuwen, DA
    Martin, AF
    Przybocki, MA
    Bouten, JS
    COMPUTER SPEECH AND LANGUAGE, 2006, 20 (2-3): 128-158
  • [3] Evaluating Automatic Speaker Recognition systems: An overview of the NIST Speaker Recognition Evaluations (1996-2014)
    Gonzalez-Rodriguez, Joaquin
    LOQUENS, 2014, 1 (01)
  • [4] A study of voice activity detection techniques for NIST speaker recognition evaluations
    Mak, Man-Wai
    Yu, Hon-Bill
    COMPUTER SPEECH AND LANGUAGE, 2014, 28 (01): 295-313
  • [5] NIST speaker recognition evaluations utilizing the Mixer Corpora - 2004, 2005, 2006
    Przybocki, Mark A.
    Martin, Alvin F.
    Le, Audrey N.
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (07): 1951-1959
  • [6] The 2016 NIST Speaker Recognition Evaluation
    Sadjadi, Seyed Omid
    Kheyrkhah, Timothee
    Tong, Audrey
    Greenberg, Craig
    Reynolds, Douglas
    Singer, Elliot
    Mason, Lisa
    Hernandez-Cordero, Jaime
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1353 - 1357
  • [7] The NIST 2010 Speaker Recognition Evaluation
    Martin, Alvin F.
    Greenberg, Craig S.
    11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2734 - 2737
  • [8] The 2018 NIST Speaker Recognition Evaluation
    Sadjadi, Seyed Omid
    Greenberg, Craig
    Singer, Elliot
    Reynolds, Douglas
    Mason, Lisa
    Hernandez-Cordero, Jaime
    INTERSPEECH 2019, 2019, : 1483 - 1487
  • [9] The 2012 NIST Speaker Recognition Evaluation
    Greenberg, Craig S.
    Stanford, Vincent M.
    Martin, Alvin F.
    Yadagiri, Meghana
    Doddington, George R.
    Godfrey, John J.
    Hernandez-Cordero, Jaime
    14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1970 - 1974
  • [10] The NIST 1999 Speaker Recognition Evaluation - An overview
    Martin, A
    Przybocki, M
    DIGITAL SIGNAL PROCESSING, 2000, 10 (1-3): 1-18