The JHU Speaker Recognition System for the VOiCES 2019 Challenge

Cited by: 24
Authors
Snyder, David [1, 2]
Villalba, Jesus [1]
Chen, Nanxin [1]
Povey, Daniel [1, 2]
Sell, Gregory [2]
Dehak, Najim [1]
Khudanpur, Sanjeev [1, 2]
Affiliations
[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA
Source
Keywords
speaker recognition; VOiCES Challenge 2019
DOI
10.21437/Interspeech.2019-2979
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
This paper describes the systems developed by the JHU team for the speaker recognition track of the 2019 VOiCES from a Distance Challenge. On this far-field task, we achieved good performance using systems based on state-of-the-art deep neural network (DNN) embeddings. In this paradigm, a DNN maps variable-length speech segments to speaker embeddings, called x-vectors, that are then classified using probabilistic linear discriminant analysis (PLDA). Our submissions were composed of three x-vector-based systems that differed primarily in the DNN architecture, temporal pooling mechanism, and training objective function. On the evaluation set, our best single-system submission used an extended time-delay architecture, and achieved 0.435 in actual DCF, the primary evaluation metric. A fusion of all three x-vector systems was our primary submission, and it obtained an actual DCF of 0.362.
Pages: 2468-2472
Number of pages: 5