The JHU Speaker Recognition System for the VOiCES 2019 Challenge

Cited by: 24

Authors:

Snyder, David [1,2]

Villalba, Jesus [1]

Chen, Nanxin [1]

Povey, Daniel [1,2]

Sell, Gregory [2]

Dehak, Najim [1]

Khudanpur, Sanjeev [1,2]

Affiliations:

[1] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

[2] Johns Hopkins Univ, Human Language Technol Ctr Excellence, Baltimore, MD 21218 USA

Source:

INTERSPEECH 2019 | 2019

Keywords:

speaker recognition; VOiCES Challenge 2019;

DOI:

10.21437/Interspeech.2019-2979

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification codes:

100104 ; 100213 ;

Abstract:

This paper describes the systems developed by the JHU team for the speaker recognition track of the 2019 VOiCES from a Distance Challenge. On this far-field task, we achieved good performance using systems based on state-of-the-art deep neural network (DNN) embeddings. In this paradigm, a DNN maps variable-length speech segments to speaker embeddings, called x-vectors, that are then classified using probabilistic linear discriminant analysis (PLDA). Our submissions were composed of three x-vector-based systems that differed primarily in the DNN architecture, temporal pooling mechanism, and training objective function. On the evaluation set, our best single-system submission used an extended time-delay architecture, and achieved 0.435 in actual DCF, the primary evaluation metric. A fusion of all three x-vector systems was our primary submission, and it obtained an actual DCF of 0.362.

Pages: 2468-2472

Page count: 5
