Improving the Generalized Performance of Deep Embedding for Text-Independent Speaker Verification

Cited by: 0
Authors
Li, Rongjin
Li, Lin [1]
Hong, Qingyang [2]
Guo, Huiyang
Zhao, Miao
Affiliations
[1] Xiamen Univ, Coll Elect Sci & Technol, Xiamen, Fujian, Peoples R China
[2] Xiamen Univ, Sch Informat Sci & Technol, Xiamen, Fujian, Peoples R China
Keywords
speaker verification; deep neural network; x-vector; filler node; attention model
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we propose an effective approach to improving the generalization performance of neural-network-based speaker embedding systems. The deep embedding system based on x-vectors achieves good discrimination between different speakers during training. However, the cross-entropy loss function and one-hot labels limit the generalization ability of this kind of system. This paper proposes a new training architecture that adds one filler node to the output layer to represent out-of-domain speakers. Moreover, we adopt an attention mechanism in the proposed generalized x-vector system. Results on the NIST 2010 SRE show that the proposed system achieves improved performance across evaluation conditions of varying duration. On the full-length condition, the proposed system obtains relative improvements in equal error rate (EER) of 41.5% over the baseline i-vector system and 12.1% over the standard x-vector system.
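The filler-node idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the training recipe, so everything here is an assumption made for illustration: that the output layer has one node per training speaker plus one extra filler node, that utterances from speakers outside the training set are labeled with the filler class, and that the loss is ordinary softmax cross-entropy. The names `filler_label`, `cross_entropy`, and `relative_improvement` are hypothetical helpers, not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filler_label(speaker_id, train_speakers):
    """Map a speaker id to a class index.

    Assumption: speakers not in the training set are assigned to the
    extra filler node, whose index is N (the number of training speakers).
    """
    return train_speakers.get(speaker_id, len(train_speakers))


def cross_entropy(logits, target):
    """Cross-entropy loss for one example with an integer class target."""
    return -math.log(softmax(logits)[target])


def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent, as commonly reported."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Three in-domain training speakers, so the output layer has 3 + 1 nodes.
train_speakers = {"spk_a": 0, "spk_b": 1, "spk_c": 2}
num_outputs = len(train_speakers) + 1

# Hypothetical logits for one utterance (last entry is the filler node).
logits = [0.2, 1.5, -0.3, 0.9]

loss_in_domain = cross_entropy(logits, filler_label("spk_b", train_speakers))
loss_out_of_domain = cross_entropy(logits, filler_label("spk_unseen", train_speakers))
```

With these illustrative logits, the in-domain target `spk_b` has the largest score, so its loss is lower than the loss for an utterance whose target is the filler class. The `relative_improvement` helper shows how a figure such as the reported 12.1% would be derived from two EER values; the EERs themselves are not given in the abstract and are not reproduced here.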
Pages: 21-25
Number of pages: 5