Intra-class variation reduction of speaker representation in disentanglement framework

Cited by: 11
Authors
Kwon, Yoohwan [1]
Chung, Soo-Whan [1]
Kang, Hong-Goo [1]
Affiliations
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea
Source
Funding
Academy of Finland;
Keywords
speaker verification; disentanglement; mutual information;
DOI
10.21437/Interspeech.2020-2075
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification codes
100104; 100213;
Abstract
In this paper, we propose an effective training strategy to extract robust speaker representations from a speech signal. One of the key challenges in speaker recognition tasks is to learn latent representations or embeddings containing solely speaker characteristic information in order to be robust in terms of intra-speaker variations. By modifying the network architecture to generate both speaker-related and speaker-unrelated representations, we exploit a learning criterion which minimizes the mutual information between these disentangled embeddings. We also introduce an identity change loss criterion which utilizes a reconstruction error to different utterances spoken by the same speaker. Since the proposed criteria reduce the variation of speaker characteristics caused by changes in background environment or spoken content, the resulting embeddings of each speaker become more consistent. The effectiveness of the proposed method is demonstrated through two tasks: disentanglement performance, and improvement of speaker recognition accuracy compared to the baseline model on a benchmark dataset, VoxCeleb1. Ablation studies also show the impact of each criterion on overall performance.
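The identity change criterion mentioned in the abstract can be illustrated with a minimal sketch. This is one possible reading, not the authors' implementation: it assumes a decoder rebuilds the features of utterance A from A's speaker-unrelated embedding combined with the speaker embedding taken from a different utterance B of the same speaker, and that the loss is a mean squared reconstruction error. The names `identity_change_loss`, `toy_decoder`, and `mse` are hypothetical, and the additive decoder is a stand-in for a trained network.

```python
from typing import Callable, List

Vector = List[float]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def identity_change_loss(
    feat_a: Vector,
    spk_emb_b: Vector,
    res_emb_a: Vector,
    decoder: Callable[[Vector, Vector], Vector],
) -> float:
    """Reconstruction error for utterance A when its speaker embedding is
    swapped for one taken from utterance B of the same speaker (assumed
    reading of the abstract; the paper's exact formulation may differ)."""
    reconstruction = decoder(spk_emb_b, res_emb_a)
    return mse(reconstruction, feat_a)


def toy_decoder(spk_emb: Vector, res_emb: Vector) -> Vector:
    """Hypothetical stand-in decoder: element-wise sum of both embeddings."""
    return [s + r for s, r in zip(spk_emb, res_emb)]


# Features of utterance A and its speaker-unrelated (residual) embedding.
feat_a = [1.0, 2.0]
res_emb_a = [0.5, 1.0]

# A speaker embedding from utterance B that matches A's speaker content
# reconstructs A exactly; a drifting one is penalized.
consistent = identity_change_loss(feat_a, [0.5, 1.0], res_emb_a, toy_decoder)
drifting = identity_change_loss(feat_a, [1.0, 1.0], res_emb_a, toy_decoder)
print(consistent, drifting)
```

Under this reading, the loss is small only when speaker embeddings from different utterances of one speaker are interchangeable, which is the consistency property the abstract describes.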
Pages: 3231-3235
Number of pages: 5