Intra-class variation reduction of speaker representation in disentanglement framework

Cited by: 11
Authors
Kwon, Yoohwan [1]
Chung, Soo-Whan [1]
Kang, Hong-Goo [1]
Affiliations
[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea
Source
Funding
Academy of Finland
Keywords
speaker verification; disentanglement; mutual information
DOI
10.21437/Interspeech.2020-2075
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
In this paper, we propose an effective training strategy to extract robust speaker representations from a speech signal. One of the key challenges in speaker recognition tasks is to learn latent representations, or embeddings, that contain solely speaker characteristic information and are therefore robust to intra-speaker variations. By modifying the network architecture to generate both speaker-related and speaker-unrelated representations, we exploit a learning criterion which minimizes the mutual information between these disentangled embeddings. We also introduce an identity change loss criterion which utilizes a reconstruction error with respect to different utterances spoken by the same speaker. Since the proposed criteria reduce the variation of speaker characteristics caused by changes in background environment or spoken content, the resulting embeddings of each speaker become more consistent. The effectiveness of the proposed method is demonstrated through two tasks: disentanglement performance, and improvement of speaker recognition accuracy over the baseline model on a benchmark dataset, VoxCeleb1. Ablation studies also show the impact of each criterion on overall performance.
Pages: 3231-3235
Page count: 5