Deep Speaker Embedding with Frame-Constrained Training Strategy for Speaker Verification

Cited by: 0
Authors
Gu, Bin [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Speaker verification; loss function; local variation; frame-level features;
DOI
10.21437/Interspeech.2022-867
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Besides voiceprint statistics, speech signals carry a great deal of side information (content, stress, etc.). This session variability poses a major challenge for modeling speaker characteristics. To alleviate the problem, we propose a novel frame-constrained training (FCT) strategy that enhances the speaker information in the frame-level layers for better embedding extraction. More precisely, a similarity matrix is computed from the frame-level features within each batch of training samples, and an FCT loss is derived from this similarity matrix. The speaker embedding network is then trained with a combination of the FCT loss and the speaker classification loss. Experiments are performed on the VoxCeleb1 and VOiCES databases. The results demonstrate that the proposed training strategy improves system performance.
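The abstract does not give the exact form of the FCT loss, so the following is only a minimal sketch of the general idea (a batch-wise similarity matrix over frame-level features, turned into an auxiliary loss and added to the classification loss). Everything specific here is an assumption for illustration and not the paper's formulation: mean pooling over frames, cosine similarity, a squared-error penalty against a same-speaker target matrix, and the hypothetical names `similarity_matrix`, `frame_constraint_loss`, `total_loss`, and `weight`.

```python
import numpy as np


def similarity_matrix(frame_feats):
    """Cosine similarity between utterances in a batch.

    frame_feats: array of shape (batch, frames, dim) with frame-level features.
    Assumption: each utterance is summarised by the mean of its frame vectors.
    """
    pooled = frame_feats.mean(axis=1)                          # (batch, dim)
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    pooled = pooled / np.maximum(norms, 1e-12)                 # avoid division by zero
    return pooled @ pooled.T                                   # (batch, batch)


def frame_constraint_loss(frame_feats, labels):
    """Hypothetical similarity-based auxiliary loss (not the paper's definition).

    Penalises the squared gap between the similarity matrix and a target
    matrix that is 1 for same-speaker pairs and 0 otherwise, ignoring the
    diagonal (each utterance compared with itself).
    """
    sim = similarity_matrix(frame_feats)
    target = (labels[:, None] == labels[None, :]).astype(sim.dtype)
    off_diag = ~np.eye(len(labels), dtype=bool)
    return float(np.mean((sim[off_diag] - target[off_diag]) ** 2))


def total_loss(classification_loss, frame_feats, labels, weight=0.1):
    """Combine the classification loss with the auxiliary loss.

    `weight` is an assumed balancing hyperparameter.
    """
    return classification_loss + weight * frame_constraint_loss(frame_feats, labels)
```

In this sketch the auxiliary term is zero when same-speaker utterances have identical pooled frame features and different-speaker utterances are orthogonal, and it grows as the frame-level features mix speakers.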
Pages: 1451-1455
Number of pages: 5
Related papers
50 in total
  • [1] CONSTRAINED DISCRIMINATIVE PLDA TRAINING FOR SPEAKER VERIFICATION
    Rohdin, Johan
    Biswas, Sangeeta
    Shinoda, Koichi
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [2] GAUSSIAN-CONSTRAINED TRAINING FOR SPEAKER VERIFICATION
    Li, Lantian
    Tang, Zhiyuan
    Shi, Ying
    Wang, Dong
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6036 - 6040
  • [3] DISENTANGLED SPEAKER EMBEDDING FOR ROBUST SPEAKER VERIFICATION
    Yi, Lu
    Mak, Man-Wai
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 7662 - 7666
  • [4] Learning Discriminative Speaker Embedding by Improving Aggregation Strategy and Loss Function for Speaker Verification
    Luo, Chengfang
    Guo, Xin
    Deng, Aiwen
    Xu, Wei
    Zhao, Junhong
    Kang, Wenxiong
    2021 INTERNATIONAL JOINT CONFERENCE ON BIOMETRICS (IJCB 2021), 2021,
  • [5] On Deep Speaker Embeddings for Speaker Verification
    Jakubec, Maros
    Jarina, Roman
    Lieskovska, Eva
    Chmulik, Michal
    2021 44TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2021, : 162 - 166
  • [6] An Effective Deep Embedding Learning Architecture for Speaker Verification
    Jiang, Yiheng
    Song, Yan
    McLoughlin, Ian
    Gao, Zhifu
    Dai, Lirong
    INTERSPEECH 2019, 2019, : 4040 - 4044
  • [7] Introducing phonetic information to speaker embedding for speaker verification
    Liu, Yi
    He, Liang
    Liu, Jia
    Johnson, Michael T.
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2019, 2019 (01)
  • [8] Introducing phonetic information to speaker embedding for speaker verification
    Yi Liu
    Liang He
    Jia Liu
    Michael T. Johnson
    EURASIP Journal on Audio, Speech, and Music Processing, 2019
  • [9] Deep Speaker Embeddings for Speaker Verification of Children
    Abed, Mohammed Hamzah
    Sztaho, David
    TEXT, SPEECH, AND DIALOGUE, TSD 2024, PT II, 2024, 15049 : 58 - 69
  • [10] Disentangled Speaker and Nuisance Attribute Embedding for Robust Speaker Verification
    Kang, Woo Hyun
    Mun, Sung Hwan
    Han, Min Hyun
    Kim, Nam Soo
    IEEE ACCESS, 2020, 8 : 141838 - 141849