On Deep Speaker Embeddings for Speaker Verification

Cited by: 3

Authors:

Jakubec, Maros [1]

Jarina, Roman [1]

Lieskovska, Eva [1]

Chmulik, Michal [1]

Affiliations:

[1] Univ Zilina, FEIT Fac Elect Engn & Informat Technol, Dept Multimedia & Informat Commun Technol, Univ 8215-1, Zilina 01026, Slovakia

Source:

2021 44TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP) | 2021

Keywords:

x-vector; d-vector; i-vector; speaker verification; speaker embeddings; DNN;

DOI:

10.1109/TSP52935.2021.9522589

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In recent years, applications of deep neural networks (DNNs) have grown rapidly, and with them interest in developing automatic speaker recognition systems. DNN-based speaker embeddings, such as x-vectors and d-vectors, are now a common way of creating speaker-specific acoustic models, and they have begun to replace standard i-vectors extracted by factor analysis. We evaluated the performance and training time of systems built on these three state-of-the-art approaches. Results on the VoxCeleb1 evaluation set show that x-vectors outperformed both conventional i-vectors and DNN-based d-vectors, although at the cost of a higher computational load. We also show that the x-vector system with attentive pooling, AM-Softmax activation, and a PLDA back-end gives the lowest error rate among the architectures compared.
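As a rough illustration of how a verification trial over fixed-length speaker embeddings might be scored and how an error rate might be estimated, the sketch below may help. It rests on several assumptions that are not taken from the abstract: cosine similarity is used as a simple stand-in for the PLDA back-end named above, the error rate is taken to be the equal error rate (EER), and the function names `cosine_score` and `equal_error_rate` are hypothetical.

```python
import numpy as np


def cosine_score(a, b):
    """Cosine similarity between two embedding vectors (assumed scoring rule,
    used here in place of the PLDA back-end mentioned in the abstract)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores of same-speaker trials.
    nontarget_scores: scores of different-speaker trials.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target, nontarget]))
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        far = np.mean(nontarget >= t)  # false acceptance rate at threshold t
        frr = np.mean(target < t)      # false rejection rate at threshold t
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the operating point where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)


if __name__ == "__main__":
    # Toy scores for illustration only; they are not results from the paper.
    same_speaker = [0.9, 0.4]
    different_speaker = [0.6, 0.1]
    print(equal_error_rate(same_speaker, different_speaker))
```

In this sketch a trial is accepted when its score is at or above the threshold, so a higher score means the two embeddings are more likely to come from the same speaker. A PLDA back-end would replace `cosine_score` with a log-likelihood ratio, while the EER computation would stay the same.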

Citation

Pages: 162-166

Page count: 5
