Disentangled Speaker and Nuisance Attribute Embedding for Robust Speaker Verification

Cited by: 11
Authors
Kang, Woo Hyun [2]
Mun, Sung Hwan [2]
Han, Min Hyun [2]
Kim, Nam Soo [1, 2]
Affiliations
[1] Seoul Natl Univ, Dept Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, Seoul, South Korea
Source
IEEE ACCESS | 2020 / Vol. 8
Keywords
Training; Robustness; Performance evaluation; Law enforcement; Machine learning; Task analysis; Licenses; Speech embedding; speaker verification; domain disentanglement; deep learning; RECOGNITION;
DOI
10.1109/ACCESS.2020.3012893
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, various deep learning-based embedding methods have been proposed and have shown impressive performance in speaker verification. However, as with most classical embedding techniques, the deep learning-based methods are known to suffer from severe performance degradation when dealing with speech samples recorded under different conditions (e.g., recording devices, emotional states). In this paper, we propose a novel fully supervised training method for extracting a speaker embedding vector disentangled from the variability caused by the nuisance attributes. The proposed framework was compared with the conventional deep learning-based embedding methods using the RSR2015 and VoxCeleb1 datasets. Experimental results show that the proposed approach can extract speaker embeddings robust to channel and emotional variability.
Pages: 141838-141849
Page count: 12