PARTIAL AUC OPTIMIZATION BASED DEEP SPEAKER EMBEDDINGS WITH CLASS-CENTER LEARNING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION

Cited by: 0

Authors:

Bai, Zhongxin [1,2]

Zhang, Xiao-Lei [1,2]

Chen, Jingdong [1,2]

Affiliations:

[1] Northwestern Polytech Univ, Ctr Intelligent Acoust & Immers Commun, Xian, Peoples R China

[2] Northwestern Polytech Univ, Sch Marine Sci & Technol, Xian, Peoples R China

Source:

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2020

Funding:

Israel Science Foundation; National Science Foundation (USA);

Keywords:

speaker verification; pAUC optimization; speaker centers; verification loss; RECOGNITION;

DOI:

10.1109/icassp40776.2020.9053674

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Deep embedding based text-independent speaker verification has demonstrated superior performance to traditional methods in many challenging scenarios. Its loss functions generally fall into two classes: verification and identification. The verification loss functions match the pipeline of speaker verification, but they are difficult to implement. Thus, most state-of-the-art deep embedding methods use the identification loss functions with softmax output units or their variants. In this paper, we propose a verification loss function, named the maximization of the partial area under the receiver operating characteristic (ROC) curve (pAUC), for deep embedding based text-independent speaker verification. We also propose a class-center based training trial construction method to improve the training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance. Experiments on the Speakers in the Wild (SITW) and NIST SRE 2016 datasets show that the proposed pAUC loss function is highly competitive with the state-of-the-art identification loss functions.
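
The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is an illustration under stated assumptions and not the authors' implementation: the function names, the squared-hinge surrogate, the default `margin`, the use of batch means as class centers, and cosine scoring are all choices made here for the example. The abstract does not specify any of them. The sketch builds trials between each embedding and each class center, then measures the empirical pAUC over a false-positive-rate range `[alpha, beta]` by keeping only the highest-scoring negative trials.

```python
import numpy as np


def class_center_trials(embeddings, labels):
    """Score every embedding against every class center (cosine similarity).

    Assumption for this sketch: a class center is the mean embedding of that
    class; the paper may define or learn centers differently.
    Returns (target_scores, nontarget_scores) as 1-D arrays.
    """
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    classes = np.unique(y)
    centers = np.stack([X[y == c].mean(axis=0) for c in classes])
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    scores = Xn @ Cn.T                           # shape: (n_utterances, n_classes)
    is_target = y[:, None] == classes[None, :]   # True where the center matches the label
    return scores[is_target], scores[~is_target]


def _pairwise_diff(pos_scores, neg_scores, alpha, beta):
    """Differences between positives and the negatives ranked in [alpha, beta]."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]  # highest scores first
    lo = int(np.ceil(alpha * len(neg)))
    hi = int(np.floor(beta * len(neg)))
    hard_neg = neg[lo:hi]
    if len(pos) == 0 or len(hard_neg) == 0:
        raise ValueError("need at least one positive and one selected negative")
    return pos[:, None] - hard_neg[None, :]


def partial_auc(pos_scores, neg_scores, alpha=0.0, beta=0.1):
    """Empirical pAUC over FPR in [alpha, beta], normalized to [0, 1]."""
    diff = _pairwise_diff(pos_scores, neg_scores, alpha, beta)
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))  # ties count as half


def pauc_hinge_loss(pos_scores, neg_scores, alpha=0.0, beta=0.1, margin=0.4):
    """Squared-hinge surrogate (an assumption here) for 1 - pAUC."""
    diff = _pairwise_diff(pos_scores, neg_scores, alpha, beta)
    return float(np.mean(np.maximum(0.0, margin - diff) ** 2))
```

With `alpha=0.0` and `beta=1.0` every negative trial is kept and `partial_auc` reduces to the ordinary AUC. A smaller `beta` restricts the comparison to the negatives that score highest, which correspond to the low false-positive-rate region of the ROC curve.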


Pages: 6819-6823

Page count: 5

