A Noise-Robust System for NIST 2012 Speaker Recognition Evaluation

Cited by: 0
Authors
Ferrer, Luciana [1]
McLaren, Mitchell [1]
Scheffer, Nicolas [1]
Lei, Yun [1]
Graciarena, Martin [1]
Mitra, Vikramjit [1]
Affiliations
[1] SRI Int, Speech Technol & Res Lab, 333 Ravenswood Ave, Menlo Pk, CA 94025 USA
Keywords
Speaker recognition; noise-robustness; PLDA; iVector; VERIFICATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The National Institute of Standards and Technology (NIST) 2012 speaker recognition evaluation posed several new challenges, including noisy data, varying test-sample length and number of enrollment samples, and a new metric. Target speakers were known during system development and could be used for model training and score normalization. For the evaluation, SRI International (SRI) submitted a system consisting of six subsystems that use different low- and high-level features, some specifically designed for noise robustness, fused at the score and iVector levels. This paper presents SRI's submission along with a careful analysis of the approaches that provided gains for this challenging evaluation, including a multiclass voice-activity detection system, the use of noisy data in system training, and the fusion of subsystems using acoustic characterization metadata.
Pages: 1980-1984
Number of pages: 5