A Noise-Robust System for NIST 2012 Speaker Recognition Evaluation

Cited by: 0
Authors
Ferrer, Luciana [1]
McLaren, Mitchell [1]
Scheffer, Nicolas [1]
Lei, Yun [1]
Graciarena, Martin [1]
Mitra, Vikramjit [1]
Affiliation
[1] SRI Int, Speech Technol & Res Lab, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
Keywords
Speaker recognition; noise-robustness; PLDA; iVector; VERIFICATION;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges, including noisy data, varying test-sample length and number of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a system consisting of six subsystems that use different low- and high-level features, some specifically designed for noise robustness, fused at the score and iVector levels. This paper presents SRI's submission along with a careful analysis of the approaches that provided gains for this challenging evaluation, including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata.
Pages: 1980 - 1984
Number of pages: 5
Related papers
50 in total
  • [1] The 2012 NIST Speaker Recognition Evaluation
    Greenberg, Craig S.
    Stanford, Vincent M.
    Martin, Alvin F.
    Yadagiri, Meghana
    Doddington, George R.
    Godfrey, John J.
    Hernandez-Cordero, Jaime
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1970 - 1974
  • [2] CRSS SYSTEMS FOR 2012 NIST SPEAKER RECOGNITION EVALUATION
    Hasan, Taufiq
    Sadjadi, Seyed Omid
    Liu, Gang
    Shokouhi, Navid
    Boril, Hynek
    Hansen, John H. L.
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6783 - 6787
  • [3] Noise-Robust Speaker Recognition Based on Morphological Component Analysis
    He, Yongjun
    Chen, Chen
    Han, Jiqing
    [J]. 16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5, 2015, : 3001 - 3005
  • [4] Noise-robust feature based on sparse representation for speaker recognition
    Qi, Hongzhuo
    [J]. Metallurgical and Mining Industry, 2015, 7 (04): : 64 - 69
  • [5] THU-EE System Fusion for the NIST 2012 Speaker Recognition Evaluation
    Zhang, Wei-Qiang
    Li, Zhi-Yi
    Liu, Weiwei
    Liu, Jia
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 2473 - 2477
  • [6] BUT system for NIST 2008 speaker recognition evaluation
    Burget, Lukas
    Fapso, Michal
    Hubeika, Valiantsina
    Glembek, Ondrej
    Karafiat, Martin
    Kockmann, Marcel
    Matejka, Pavel
    Schwarz, Petr
    Cernocky, Jan Honza
    [J]. INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2315 - 2318
  • [7] Performance Factor Analysis for the 2012 NIST Speaker Recognition Evaluation
    Martin, Alvin F.
    Greenberg, Craig S.
    Stanford, Vincent M.
    Howard, John M.
    Doddington, George R.
    Godfrey, John J.
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1135 - 1138
  • [8] Nuance - Politecnico di Torino's 2012 NIST Speaker Recognition Evaluation System
    Colibro, Daniele
    Vair, Claudio
    Farrell, Kevin
    Krause, Nir
    Karvitsky, Gennady
    Cumani, Sandro
    Laface, Pietro
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1995 - 1999
  • [9] I4U submission to NIST SRE 2012: A large-scale collaborative effort for noise-robust speaker verification
    Saeidi, R.
    Lee, K. A.
    Kinnunen, T.
    Hasan, T.
    Fauve, B.
    Bousquet, P-M.
    Khoury, E.
    Martinez, P. L. Sordo
    Kua, J. M. K.
    You, C. H.
    Sun, H.
    Larcher, A.
    Rajan, P.
    Hautamaki, V.
    Hanilci, C.
    Braithwaite, B.
    Gonzales-Hautamaki, R.
    Sadjadi, S. O.
    Liu, G.
    Boril, H.
    Shokouhi, N.
    Matrouf, D.
    El Shafey, L.
    Mowlaee, P.
    Epps, J.
    Thiruvaran, T.
    van Leeuwen, D. A.
    Ma, B.
    Li, H.
    Hansen, J. H. L.
    Bonastre, J-F.
    Marcel, S.
    Mason, J.
    Ambikairajah, E.
    [J]. 14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5, 2013, : 1985 - 1989
  • [10] THE HKCUPU SYSTEM FOR THE NIST 2010 SPEAKER RECOGNITION EVALUATION
    Jiang, Weiwu
    Mak, Man-Wai
    Rao, Wei
    Meng, Helen
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5288 - 5291