The JHU Speaker Recognition System for the VOiCES 2019 Challenge

Cited by: 24
Authors
Snyder, David [1, 2]
Villalba, Jesus [1]
Chen, Nanxin [1]
Povey, Daniel [1, 2]
Sell, Gregory [2]
Dehak, Najim [1]
Khudanpur, Sanjeev [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Source
INTERSPEECH 2019
Keywords
speaker recognition; VOiCES Challenge 2019
DOI
10.21437/Interspeech.2019-2979
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
This paper describes the systems developed by the JHU team for the speaker recognition track of the 2019 VOiCES from a Distance Challenge. On this far-field task, we achieved good performance using systems based on state-of-the-art deep neural network (DNN) embeddings. In this paradigm, a DNN maps variable-length speech segments to speaker embeddings, called x-vectors, that are then classified using probabilistic linear discriminant analysis (PLDA). Our submissions were composed of three x-vector-based systems that differed primarily in the DNN architecture, temporal pooling mechanism, and training objective function. On the evaluation set, our best single-system submission used an extended time-delay architecture, and achieved 0.435 in actual DCF, the primary evaluation metric. A fusion of all three x-vector systems was our primary submission, and it obtained an actual DCF of 0.362.
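The abstract reports results as actual DCF. As a rough illustration only, the sketch below computes a normalized detection cost at the Bayes decision threshold, assuming the scores are calibrated log-likelihood ratios. The function name `actual_dcf` and the default parameters (`p_target=0.01`, `c_miss=1.0`, `c_fa=1.0`) are assumptions made for this example and are not taken from this record.

```python
import math


def actual_dcf(target_llrs, nontarget_llrs, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Normalized detection cost at the Bayes threshold (illustrative sketch).

    The default cost parameters are assumed values, not taken from the record.
    """
    # Bayes decision threshold for calibrated log-likelihood-ratio scores.
    threshold = math.log((c_fa * (1.0 - p_target)) / (c_miss * p_target))
    # A miss is a target trial scored below the threshold.
    p_miss = sum(s < threshold for s in target_llrs) / len(target_llrs)
    # A false alarm is a non-target trial scored at or above the threshold.
    p_fa = sum(s >= threshold for s in nontarget_llrs) / len(nontarget_llrs)
    cost = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
    # Normalize by the cost of the better trivial system
    # (always reject or always accept).
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))


# Toy scores, invented purely for illustration.
example = actual_dcf([6.0, 5.0, 3.0, 7.0], [-3.0, -5.0, -1.0, 0.0])
```

Under this normalization, a value of 1.0 corresponds to a system that is no better than rejecting every trial, so lower values indicate better performance.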
Pages: 2468-2472
Number of pages: 5
Related papers
50 in total
  • [1] The JHU ASR System for VOiCES from a Distance Challenge 2019
    Wang, Yiming
    Snyder, David
    Xu, Hainan
    Manohar, Vimal
    Nidadavolu, Phani Sankar
    Povey, Daniel
    Khudanpur, Sanjeev
    INTERSPEECH 2019, 2019: 2488-2492
  • [2] JHU-HLTCOE SYSTEM FOR THE VOXSRC SPEAKER RECOGNITION CHALLENGE
    Garcia-Romero, Daniel
    McCree, Alan
    Snyder, David
    Sell, Gregory
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 7559-7563
  • [3] The DKU System for the Speaker Recognition Task of the 2019 VOiCES from a Distance Challenge
    Cai, Danwei
    Qin, Xiaoyi
    Cai, Weicheng
    Li, Ming
    INTERSPEECH 2019, 2019: 2493-2497
  • [4] Intel Far-field Speaker Recognition System for VOiCES Challenge 2019
    Huang, Jonathan
    Bocklet, Tobias
    INTERSPEECH 2019, 2019: 2473-2477
  • [5] THE UMD-JHU 2011 SPEAKER RECOGNITION SYSTEM
    Garcia-Romero, D.
    Zhou, X.
    Zotkin, D.
    Srinivasan, B.
    Luo, Y.
    Ganapathy, S.
    Thomas, S.
    Nemala, S.
    Sivaram, G. S. V. S.
    Mirbagheri, M.
    Mallidi, S. H.
    Janu, T.
    Rajan, P.
    Mesgarani, N.
    Elhilali, M.
    Hermansky, H.
    Shamma, S.
    Duraiswami, R.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012: 4229-4232
  • [6] The I2R's Submission To VOiCES Distance Speaker Recognition Challenge 2019
    Sun, Hanwu
    Teh, Kah Kuan
    Kukanov, Ivan
    Huy Dat Tran
    INTERSPEECH 2019, 2019: 2478-2482
  • [7] STC Speaker Recognition Systems for the VOiCES From a Distance Challenge
    Novoselov, Sergey
    Gusev, Aleksei
    Ivanov, Artem
    Pekhovsky, Timur
    Shulipa, Andrey
    Lavrentyeva, Galina
    Volokhov, Vladimir
    Kozlov, Alexandr
    INTERSPEECH 2019, 2019: 2443-2447
  • [8] A Speaker Recognition System for the SITW Challenge
    Kudashev, Oleg
    Novoselov, Sergey
    Simonchik, Konstantin
    Kozlov, Alexandr
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016: 833-837
  • [9] The MIT-LL, JHU and LRDE NIST 2016 Speaker Recognition Evaluation System
    Torres-Carrasquillo, Pedro A.
    Richardson, Fred
    Nercessian, Shahan
    Sturim, Douglas
    Campbell, William
    Gwon, Youngjune
    Vattam, Swaroop
    Dehak, Najim
    Mallidi, Harish
    Nidadavolu, Phani Sankar
    Li, Ruizhi
    Dehak, Reda
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017: 1333-1337
  • [10] AUT System for SITW Speaker Recognition Challenge
    Khosravani, Abbas
    Homayounpour, Mohammad Mehdi
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016: 843-847