On Deep Speaker Embeddings for Speaker Verification

Cited by: 3
Authors
Jakubec, Maros [1]
Jarina, Roman [1]
Lieskovska, Eva [1]
Chmulik, Michal [1]
Affiliation
[1] Univ Zilina, FEIT Fac Elect Engn & Informat Technol, Dept Multimedia & Informat Commun Technol, Univ 8215-1, Zilina 01026, Slovakia
Keywords
x-vector; d-vector; i-vector; speaker verification; speaker embeddings; DNN
DOI
10.1109/TSP52935.2021.9522589
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks (DNNs) have seen a sharp rise in applications in recent years, including growing interest in the development of automatic speaker recognition systems. DNN-based speaker embeddings, such as x-vectors or d-vectors, are now a common way of creating speaker-specific acoustic models, and they have begun to replace standard i-vectors extracted by factor analysis. We evaluated the performance and training time of systems built on these three state-of-the-art approaches. The results obtained on the VoxCeleb1 evaluation set show that x-vectors outperformed both conventional i-vectors and DNN-based d-vector solutions, although at the cost of a higher computational load. We also show that the x-vector system with attentive pooling, AM-Softmax activation and a PLDA back-end gives the lowest error rate among the evaluated architectures.
Pages: 162-166
Page count: 5
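
The abstract names AM-Softmax as part of the best-performing x-vector system. As a rough illustration only, the additive-margin softmax loss for a single training example might be sketched as below. The function names, the toy inputs, and the `scale` and `margin` defaults (30.0 and 0.35, values commonly quoted for AM-Softmax) are assumptions for this sketch; the record does not state the settings or the implementation used in the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def am_softmax_loss(embedding, class_weights, target, scale=30.0, margin=0.35):
    """Additive-margin softmax loss for one example (illustrative sketch).

    embedding     -- speaker embedding of the training utterance
    class_weights -- one weight vector per training speaker (hypothetical values)
    target        -- index of the true speaker
    scale, margin -- assumed hyperparameters, not taken from the paper
    """
    logits = []
    for j, weight in enumerate(class_weights):
        cos_theta = cosine(embedding, weight)
        if j == target:
            # The margin is subtracted only from the true-class cosine.
            cos_theta -= margin
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp over all classes.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]
```

Setting `margin=0.0` reduces this to an ordinary softmax cross-entropy over scaled cosine similarities, so for the same inputs the margin can only increase the loss, which pushes embeddings of the same speaker closer together during training.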