An Effective Speaker Recognition Method Based on Joint Identification and Verification Supervisions

Cited by: 5
Authors
Liu, Ying [1]
Song, Yan [1]
Jiang, Yiheng [1]
McLoughlin, Ian [1, 2]
Liu, Lin [3]
Dai, Lirong [1]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
[2] Singapore Inst Technol, ICT Cluster, Singapore, Singapore
[3] iFLYTEK CO LTD, iFLYTEK Res, Hefei 230088, Anhui, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
speaker verification; mutual information learning; attentive bilinear pooling; multi-task framework;
DOI
10.21437/Interspeech.2020-1922
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
Speaker verification methods based on deep embedding learning have attracted significant recent research interest due to their superior performance. Existing methods mainly focus on designing frame-level feature extraction structures, utterance-level aggregation methods, and loss functions to learn discriminative speaker embeddings. The scores of verification trials are then computed using cosine distance or Probabilistic Linear Discriminant Analysis (PLDA) classifiers. This paper proposes an effective speaker recognition method based on joint identification and verification supervision, inspired by multi-task learning frameworks. Specifically, a deep architecture with a convolutional feature extractor, attentive pooling, and two classifier branches is presented. The first, an identification branch, is trained with additive margin softmax loss (AM-Softmax) to classify the speaker identities. The second, a verification branch, trains a discriminator with binary cross entropy loss (BCE) to optimize a new triplet-based mutual information. To balance the two losses during different training stages, a ramp-up/ramp-down weighting scheme is employed. Furthermore, an attentive bilinear pooling method is proposed to improve the effectiveness of the embeddings. Extensive experiments on VoxCeleb1 show that the proposed method reduces the equal error rate (EER) by 22% relative to a baseline system that uses identification supervision only.
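The identification branch described in the abstract is trained with AM-Softmax. As a rough illustration only, the generic AM-Softmax loss for a single embedding can be sketched in plain Python as below. This is not the paper's implementation: the function names, the scale of 30.0, and the margin of 0.35 are assumptions chosen for the example, and the verification branch, the pooling layer, and the loss-weighting scheme are not modeled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def am_softmax_loss(embedding, class_weights, target, scale=30.0, margin=0.35):
    """Generic AM-Softmax loss for one embedding (illustrative sketch).

    embedding:     speaker embedding, a list of floats
    class_weights: one weight vector per training speaker
    target:        index of the true speaker
    scale, margin: assumed hyperparameters, not taken from the paper
    """
    cosines = [cosine(embedding, w) for w in class_weights]
    # Subtract the additive margin from the target-class cosine only,
    # then scale every logit.
    logits = [
        scale * (c - margin) if j == target else scale * c
        for j, c in enumerate(cosines)
    ]
    # Numerically stable log-sum-exp, then the cross-entropy for the target.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Toy example: an embedding equally close to two speaker weight vectors.
embedding = [1.0, 1.0]
weights = [[1.0, 0.0], [0.0, 1.0]]
plain = am_softmax_loss(embedding, weights, target=0, margin=0.0)
with_margin = am_softmax_loss(embedding, weights, target=0)
```

With the margin set to zero the function reduces to a scaled cosine softmax, so the toy embedding gets a loss of log 2. With a positive margin the same embedding is penalized more heavily, which is what pushes the target-class cosine to exceed the others by at least the margin.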
Pages: 3007-3011
Page count: 5
Related papers
50 in total
  • [1] AN EFFECTIVE IDENTIFICATION METHOD FOR SPEAKER RECOGNITION BASED ON PCA AND DOUBLE VQ
    Zhao, Zhen-Dong
    Zhang, Jing
    Tian, Jing-Feng
    Lou, Yun-Yong
    PROCEEDINGS OF 2009 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-6, 2009, : 1686 - +
  • [2] A Speaker Identification system with verification method based on speaker relative threshold and HMM
    He, ZY
    Hu, QX
    2002 6TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING PROCEEDINGS, VOLS I AND II, 2002, : 488 - 491
  • [3] SPECIAL SECTION ON AUTOMATIC SPEAKER RECOGNITION, IDENTIFICATION AND VERIFICATION
    BIMBOT, F
    CHOLLET, G
    PAOLONI, A
    SPEECH COMMUNICATION, 1995, 17 (1-2) : 77 - 79
  • [4] Effective speaker adaptations for speaker verification
    Ahn, S
    Kang, S
    Ko, H
    2000 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, PROCEEDINGS, VOLS I-VI, 2000, : 1081 - 1084
  • [5] Discriminative Decision Function Based Scoring Method in Joint Factor Analysis for Speaker Verification
    Liang, Chunyan
    Zhang, Xiang
    Yang, Lin
    Yan, Yonghong
    13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012, : 1562 - 1565
  • [6] Speaker Recognition Based on the Joint Loss Function
    Feng, Tengteng
    Fan, Houbin
    Ge, Fengpei
    Cao, Shuxin
    Liang, Chunyan
    ELECTRONICS, 2023, 12 (16)
  • [7] Video Summarization Based on Face Recognition and Speaker Verification
    Lee, Yuan-Shan
    Hsu, Chia-Yung
    Lin, Po-Chuan
    Chen, Chia-Yen
    Wang, Jia-Ching
    PROCEEDINGS OF THE 2015 10TH IEEE CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS, 2015, : 1815 - 1818
  • [8] A Speaker Verification Method Based on TDNN–LSTMP
    Hui Liu
    Longlian Zhao
    Circuits, Systems, and Signal Processing, 2019, 38 : 4840 - 4854
  • [9] METHOD OF SPEAKER VERIFICATION
    DODDINGTON, GR
    JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1971, 49 (01): : 139 - +
  • [10] A New Speaker Verification Algorithm Based on Identification Results
    Khettaoui, Billal
    Dahimene, Abdelhakim
    2017 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING - BOUMERDES (ICEE-B), 2017,