I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification

Cited by: 9

Authors:

Zhang, Jiacen [1]

Inoue, Nakamasa [1]

Shinoda, Koichi [1]

Affiliations:

[1] Tokyo Inst Technol, Tokyo, Japan

Source:

19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018

Keywords:

speaker verification; short utterance; i-vector transformation; generative adversarial networks; multi-task learning

DOI:

10.21437/Interspeech.2018-1680

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

I-vector based text-independent speaker verification (SV) systems often perform poorly on short utterances, because the biased phonetic distribution in a short utterance makes the extracted i-vector unreliable. This paper proposes an i-vector compensation method using a generative adversarial network (GAN). The generator network is trained to produce a compensated i-vector from a short-utterance i-vector, and the discriminator network is trained to determine whether an i-vector was produced by the generator or extracted from a long utterance. Additionally, we assign two other learning tasks to the GAN to stabilize its training and to make the generated i-vector more speaker-specific. Speaker verification experiments on the NIST SRE 2008 "10sec-10sec" condition show that, after applying our method, the equal error rate is reduced by 11.3% compared with the conventional i-vector and PLDA system.
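The adversarial setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the single linear layer used as the generator, the logistic-regression discriminator conditioned on the short-utterance i-vector, the 4-dimensional toy vectors, and the non-saturating generator loss are not taken from the paper, and the two additional learning tasks are omitted because this record does not describe them.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def generator(short_ivec, weights):
    # Hypothetical generator: one linear layer mapping a short-utterance
    # i-vector to a compensated i-vector of the same dimension.
    return [sum(w * x for w, x in zip(row, short_ivec)) for row in weights]


def discriminator(candidate, short_ivec, weights):
    # Hypothetical conditional discriminator: logistic regression on the
    # candidate i-vector concatenated with the conditioning short i-vector.
    # Returns the estimated probability that the candidate was extracted
    # from a long utterance rather than produced by the generator.
    features = candidate + short_ivec
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def bce(prob, target, eps=1e-12):
    # Binary cross-entropy for a single prediction.
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def adversarial_losses(short_ivec, long_ivec, g_weights, d_weights):
    # Discriminator: label long-utterance i-vectors 1 and generated ones 0.
    # Generator: non-saturating loss, i.e. try to have its output labelled 1.
    fake = generator(short_ivec, g_weights)
    p_real = discriminator(long_ivec, short_ivec, d_weights)
    p_fake = discriminator(fake, short_ivec, d_weights)
    d_loss = bce(p_real, 1.0) + bce(p_fake, 0.0)
    g_loss = bce(p_fake, 1.0)
    return d_loss, g_loss


# Toy data only: random vectors stand in for real i-vectors.
random.seed(0)
dim = 4
short_ivec = [random.gauss(0, 1) for _ in range(dim)]
long_ivec = [random.gauss(0, 1) for _ in range(dim)]
g_weights = [[random.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
d_weights = [random.gauss(0, 0.1) for _ in range(2 * dim)]

d_loss, g_loss = adversarial_losses(short_ivec, long_ivec, g_weights, d_weights)
print(f"discriminator loss: {d_loss:.4f}, generator loss: {g_loss:.4f}")
```

In a full training loop the two losses would be minimized alternately with respect to the discriminator and generator parameters; the sketch only evaluates them once to show how the conditioning i-vector enters both networks.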


Pages: 3613-3617

Page count: 5
