Latent discriminative representation learning for speaker recognition

Cited by: 2
Authors
Huang, Duolin [1]
Mao, Qirong [1, 2]
Ma, Zhongchen [1]
Zheng, Zhishen [1]
Routray, Sidheswar [1]
Ocquaye, Elias-Nii-Noi [1]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Commun Engn, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Jiangsu Key Lab Secur Technol Ind Cyberspace, Zhenjiang 212013, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Speaker recognition; Latent discriminative representation learning; Speaker embedding lookup table; Linear mapping matrix; TP391.4; HIDDEN MARKOV-MODELS; FEATURE-EXTRACTION; IDENTIFICATION; FEATURES
DOI
10.1631/FITEE.1900690
CLC number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Extracting discriminative speaker-specific representations from speech signals and transforming them into fixed-length vectors are key steps in speaker identification and verification systems. In this study, we propose a latent discriminative representation learning method for speaker recognition. By this we mean that the learned representations are not only discriminative but also relevant. Specifically, we introduce an additional speaker embedding lookup table to explore the relevance between different utterances from the same speaker. Moreover, a reconstruction constraint intended to learn a linear mapping matrix is introduced to make the representations discriminative. Experimental results demonstrate that the proposed method outperforms state-of-the-art methods on the Apollo dataset used in the Fearless Steps Challenge at INTERSPEECH 2019 and on the TIMIT dataset.
Pages: 697 - 708
Number of pages: 12
Related papers
50 in total
  • [1] Erratum to: Latent discriminative representation learning for speaker recognition
    Duolin Huang
    Qirong Mao
    Zhongchen Ma
    Zhishen Zheng
    Sidheswar Routray
    Elias-Nii-Noi Ocquaye
    [J]. Frontiers of Information Technology & Electronic Engineering, 2021, 22 : 914 - 914
  • [2] Latent discriminative representation learning for speaker recognition (vol 22, pg 697, 2021)
    Huang, Duolin
    Mao, Qirong
    Ma, Zhongchen
    Zheng, Zhishen
    Routray, Sidheswar
    Ocquaye, Elias-Nii-Noi
    [J]. FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2021, 22 (06) : 914 - 914
  • [3] Latent discriminative representation learning for speaker recognition
    Duolin Huang
    Qirong Mao
    Zhongchen Ma
    Zhishen Zheng
    Sidheswar Routray
    Elias-Nii-Noi Ocquaye
    [J]. Frontiers of Information Technology & Electronic Engineering, 2021, 22 : 697 - 708
  • [4] Learning robust latent representation for discriminative regression
    Cui, Jinrong
    Zhu, Qi
    Wang, Ding
    Li, Zuoyong
    [J]. PATTERN RECOGNITION LETTERS, 2019, 117 : 193 - 200
  • [5] MLP internal representation as discriminative features for improved speaker recognition
    Wu, DL
    Morris, A
    Koreman, J
    [J]. NONLINEAR ANALYSES AND ALGORITHMS FOR SPEECH PROCESSING, 2005, 3817 : 72 - 80
  • [6] Robust Deep Speaker Recognition: Learning Latent Representation with Joint Angular Margin Loss
    Chowdhury, Labib
    Zunair, Hasib
    Mohammed, Nabeel
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (21) : 1 - 17
  • [7] Discriminative Adversarial Learning for Speaker Independent Emotion Recognition
    Kasun, Chamara
    Ahn, Chung Soo
    Rajapakse, Jagath C.
    Lin, Zhiping
    Huang, Guang-Bin
    [J]. INTERSPEECH 2022, 2022 : 4975 - 4979
  • [8] Projective Representation Learning for Discriminative Face Recognition
    Zhong, Zuofeng
    Zhang, Zheng
    Xu, Yong
    [J]. COMPUTER VISION, PT II, 2017, 772 : 3 - 15
  • [9] Combining Speaker Recognition and Metric Learning for Speaker-Dependent Representation Learning
    Monteiro, Joao
    Alam, Jahangir
    Falk, Tiago H.
    [J]. INTERSPEECH 2019, 2019 : 4015 - 4019
  • [10] Exploring discriminative learning for text-independent speaker recognition
    Liu, Ming
    Zhang, Zhengyou
    Hasegawa-Johnson, Mark
    Huang, Thomas S.
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, VOLS 1-5, 2007 : 56 - 59