On Deep Speaker Embeddings for Speaker Verification

Cited by: 3
Authors
Jakubec, Maros [1]
Jarina, Roman [1]
Lieskovska, Eva [1]
Chmulik, Michal [1]
Affiliations
[1] Univ Zilina, FEIT Fac Elect Engn & Informat Technol, Dept Multimedia & Informat Commun Technol, Univ 8215-1, Zilina 01026, Slovakia
Keywords
x-vector; d-vector; i-vector; speaker verification; speaker embeddings; DNN
DOI
10.1109/TSP52935.2021.9522589
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, the application of deep neural networks (DNNs) has grown rapidly, and with it interest in developing automatic speaker recognition systems. DNN-based speaker embeddings, such as x-vectors and d-vectors, are now a common way of creating speaker-specific acoustic models, and they have begun to replace standard i-vectors extracted by factor analysis. We evaluated the performance and training time of systems built on these three state-of-the-art approaches. The results obtained on the VoxCeleb1 evaluation set show that x-vectors outperformed both conventional i-vectors and DNN-based d-vectors, albeit at the cost of a higher computational load. We also show that the x-vector system with attentive pooling, AM-Softmax activation and a PLDA back-end gives the lowest error rate of the architectures compared.
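For readers unfamiliar with AM-Softmax (additive-margin softmax), the following is a minimal sketch of how it is usually formulated as a training loss for a single embedding. It is not taken from the paper: the function name `am_softmax_loss`, the plain-list inputs, and the default `scale` and `margin` values are assumptions chosen for illustration only.

```python
import math


def am_softmax_loss(embedding, class_weights, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one embedding (illustrative sketch).

    embedding     -- list of floats, the speaker embedding
    class_weights -- list of weight vectors, one per training speaker
    target        -- index of the true speaker
    scale, margin -- hypothetical defaults, not values from the paper
    """
    def unit(v):
        # L2-normalise so that dot products become cosine similarities.
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    e = unit(embedding)
    cosines = [sum(a * b for a, b in zip(e, unit(w))) for w in class_weights]

    # Subtract the margin from the target-class cosine only, then scale.
    logits = [
        scale * (c - margin if j == target else c)
        for j, c in enumerate(cosines)
    ]

    # Cross-entropy computed with a numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_sum - logits[target]
```

With `margin=0` and `scale=1` this reduces to ordinary softmax cross-entropy on cosine similarities; a positive margin makes the target class harder to satisfy, which pushes embeddings of the same speaker closer together during training.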
Pages: 162-166
Page count: 5
Related papers
  • [1] Deep Speaker Embeddings for Speaker Verification of Children
    Abed, Mohammed Hamzah
    Sztaho, David
    [J]. TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II, 2024, 15049 : 58 - 69
  • [2] Deep Speaker Embeddings for Short-Duration Speaker Verification
    Bhattacharya, Gautam
    Alam, Jahangir
    Kenny, Patrick
    [J]. 18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 1517 - 1521
  • [3] Deep speaker embeddings for Speaker Verification: Review and experimental comparison
    Jakubec, Maros
    Jarina, Roman
    Lieskovska, Eva
    Kasak, Peter
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127
  • [4] Deep Discriminative Embeddings for Duration Robust Speaker Verification
    Li, Na
    Tuo, Deyi
    Su, Dan
    Li, Zhifeng
    Yu, Dong
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2262 - 2266
  • [5] Lightweight Embeddings for Speaker Verification
    Tkachenko, Maxim
    Yamshinin, Alexander
    Kotov, Mikhail
    Nastasenko, Marina
    [J]. SPEECH AND COMPUTER (SPECOM 2018), 2018, 11096 : 687 - 696
  • [6] Shortcut Connections based Deep Speaker Embeddings for End-to-End Speaker Verification System
    Seo, Soonshin
    Rim, Daniel Jun
    Lim, Minkyu
    Lee, Donghyun
    Park, Hosung
    Oh, Junseok
    Kim, Changmin
    Kim, Ji-Hwan
    [J]. INTERSPEECH 2019, 2019, : 2928 - 2932
  • [7] DEEP NEURAL NETWORK-BASED SPEAKER EMBEDDINGS FOR END-TO-END SPEAKER VERIFICATION
    Snyder, David
    Ghahremani, Pegah
    Povey, Daniel
    Garcia-Romero, Daniel
    Carmiel, Yishay
    Khudanpur, Sanjeev
    [J]. 2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 165 - 170
  • [8] TEXT ADAPTATION FOR SPEAKER VERIFICATION WITH SPEAKER-TEXT FACTORIZED EMBEDDINGS
    Yang, Yexin
    Wang, Shuai
    Gong, Xun
    Qian, Yanmin
    Yu, Kai
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 6454 - 6458
  • [9] Deeply Fused Speaker Embeddings for Text-Independent Speaker Verification
    Bhattacharya, Gautam
    Alam, Jahangir
    Gupta, Vishwa
    Kenny, Patrick
    [J]. 19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3588 - 3592
  • [10] Speaker Diarization with Deep Speaker Embeddings for DIHARD Challenge II
    Novoselov, Sergey
    Gusev, Aleksei
    Ivanov, Artem
    Pekhovsky, Timur
    Shulipa, Andrey
    Avdeeva, Anastasia
    Gorlanov, Artem
    Kozlov, Alexandr
    [J]. INTERSPEECH 2019, 2019, : 1003 - 1007