An Effective Speaker Recognition Method Based on Joint Identification and Verification Supervisions

Cited by: 5

Authors:

Liu, Ying [1]

Song, Yan [1]

Jiang, Yiheng [1]

McLoughlin, Ian [1,2]

Liu, Lin [3]

Dai, Lirong [1]

Affiliations:

[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China

[2] Singapore Inst Technol, ICT Cluster, Singapore, Singapore

[3] iFLYTEK CO LTD, iFLYTEK Res, Hefei 230088, Anhui, Peoples R China

Source:

INTERSPEECH 2020 | 2020

Funding:

National Natural Science Foundation of China;

Keywords:

speaker verification; mutual information learning; attentive bilinear pooling; multi-task framework;

DOI:

10.21437/Interspeech.2020-1922

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

Speaker verification methods based on deep embedding learning have attracted significant recent research interest due to their superior performance. Existing methods mainly focus on designing frame-level feature extraction structures, utterance-level aggregation methods and loss functions to learn discriminative speaker embeddings. The scores of verification trials are then computed using cosine distance or Probabilistic Linear Discriminant Analysis (PLDA) classifiers. This paper proposes an effective speaker recognition method based on joint identification and verification supervisions, inspired by multi-task learning frameworks. Specifically, a deep architecture with a convolutional feature extractor, attentive pooling and two classifier branches is presented. The first, an identification branch, is trained with additive margin softmax loss (AM-Softmax) to classify the speaker identities. The second, a verification branch, trains a discriminator with binary cross entropy loss (BCE) to optimize a new triplet-based mutual information. To balance the two losses during different training stages, a ramp-up/ramp-down weighting scheme is employed. Furthermore, an attentive bilinear pooling method is proposed to improve the effectiveness of embeddings. Extensive experiments on VoxCeleb1 show that the proposed method achieves a 22% relative reduction in equal error rate (EER) compared to the baseline system using identification supervision only.
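As an illustration of how the two supervisions described in the abstract might be combined, the following is a minimal sketch, not the authors' implementation. It assumes a per-sample AM-Softmax term over cosine scores, a BCE term on a single discriminator output (the paper's triplet-based mutual information objective is simplified away here), and a hypothetical linear ramp-up/ramp-down weight applied to the verification term. The function names, the scale and margin values, the ramp lengths, and the choice of which loss is weighted are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def am_softmax_loss(cosines, target, scale=30.0, margin=0.2):
    """AM-Softmax for one sample: subtract the margin from the target-class
    cosine, scale all cosines, then apply cross-entropy.
    scale and margin are illustrative values, not taken from the paper."""
    logits = [scale * (c - margin if i == target else c)
              for i, c in enumerate(cosines)]
    m = max(logits)  # subtract the max for a numerically stable log-sum-exp
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def bce_loss(prob, label):
    """Binary cross-entropy for one discriminator output in (0, 1)."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def ramp_weight(epoch, total_epochs, ramp_up=10, ramp_down=10, max_weight=1.0):
    """Hypothetical linear ramp-up/ramp-down schedule (an assumption):
    the weight grows over the first ramp_up epochs, stays at max_weight,
    then decays over the last ramp_down epochs."""
    if epoch < ramp_up:
        return max_weight * epoch / ramp_up
    if epoch >= total_epochs - ramp_down:
        return max_weight * (total_epochs - epoch) / ramp_down
    return max_weight


def joint_loss(cosines, target, pair_prob, pair_label, epoch, total_epochs):
    """Identification loss plus a scheduled verification loss.
    Weighting the verification term is an assumption of this sketch."""
    w = ramp_weight(epoch, total_epochs)
    return am_softmax_loss(cosines, target) + w * bce_loss(pair_prob, pair_label)
```

Under these assumptions, the verification term contributes nothing at epoch 0, reaches full weight mid-training, and fades again near the end, which is one plausible reading of "balancing the two losses during different training stages".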


Pages: 3007-3011

Number of pages: 5

