The JHU Speaker Recognition System for the VOiCES 2019 Challenge

Cited by: 24
Authors
Snyder, David [1, 2]
Villalba, Jesus [1]
Chen, Nanxin [1]
Povey, Daniel [1, 2]
Sell, Gregory [2]
Dehak, Najim [1]
Khudanpur, Sanjeev [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Keywords
speaker recognition; VOiCES Challenge 2019;
DOI
10.21437/Interspeech.2019-2979
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104; 100213;
Abstract
This paper describes the systems developed by the JHU team for the speaker recognition track of the 2019 VOiCES from a Distance Challenge. On this far-field task, we achieved good performance using systems based on state-of-the-art deep neural network (DNN) embeddings. In this paradigm, a DNN maps variable-length speech segments to speaker embeddings, called x-vectors, that are then classified using probabilistic linear discriminant analysis (PLDA). Our submissions were composed of three x-vector-based systems that differed primarily in the DNN architecture, temporal pooling mechanism, and training objective function. On the evaluation set, our best single-system submission used an extended time-delay architecture, and achieved 0.435 in actual DCF, the primary evaluation metric. A fusion of all three x-vector systems was our primary submission, and it obtained an actual DCF of 0.362.
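The abstract reports results in actual DCF. As a rough illustration of how such a figure can be computed, the sketch below evaluates a normalized detection cost at the Bayes decision threshold. It assumes the scores are calibrated natural-log likelihood ratios; the function name `actual_dcf` and the default parameters (`p_target=0.01`, `c_miss=1.0`, `c_fa=1.0`) are assumptions made for this example and are not taken from the record above.

```python
import math


def actual_dcf(target_llrs, nontarget_llrs, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Normalized detection cost at the Bayes decision threshold.

    Scores are assumed to be calibrated log-likelihood ratios (natural log).
    The default cost parameters are illustrative assumptions.
    """
    # Bayes threshold: a trial is accepted when its LLR exceeds this value.
    threshold = math.log((c_fa * (1.0 - p_target)) / (c_miss * p_target))
    # Miss: a target trial that is rejected (score at or below the threshold).
    p_miss = sum(s <= threshold for s in target_llrs) / len(target_llrs)
    # False alarm: a non-target trial that is accepted.
    p_fa = sum(s > threshold for s in nontarget_llrs) / len(nontarget_llrs)
    cost = c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
    # Normalize by the cost of the better trivial system
    # (one that always rejects or always accepts).
    return cost / min(c_miss * p_target, c_fa * (1.0 - p_target))


if __name__ == "__main__":
    # Toy scores: one of four target trials falls below the threshold.
    targets = [6.0, 5.0, 3.0, 7.0]
    nontargets = [-2.0, 0.0, -5.0, 1.0]
    print(actual_dcf(targets, nontargets))
```

Under these assumptions, a value below 1 means the system does better than a trivial one that rejects every trial, which is how figures such as 0.435 and 0.362 are usually read.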
Pages: 2468-2472
Page count: 5