I-vector Transformation Using Conditional Generative Adversarial Networks for Short Utterance Speaker Verification

Cited by: 9
Authors
Zhang, Jiacen [1]
Inoue, Nakamasa [1]
Shinoda, Koichi [1]
Affiliation
[1] Tokyo Inst Technol, Tokyo, Japan
Keywords
speaker verification; short utterance; i-vector transformation; generative adversarial networks; multi-task learning;
DOI
10.21437/Interspeech.2018-1680
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
I-vector based text-independent speaker verification (SV) systems often perform poorly on short utterances, because the biased phonetic distribution in a short utterance makes the extracted i-vector unreliable. This paper proposes an i-vector compensation method using a generative adversarial network (GAN). The generator network is trained to generate a compensated i-vector from a short-utterance i-vector, and the discriminator network is trained to determine whether an i-vector was generated by the generator or extracted from a long utterance. Additionally, we assign two other learning tasks to the GAN to stabilize its training and to make the generated i-vector more speaker-specific. Speaker verification experiments on the NIST SRE 2008 "10sec-10sec" condition show that our method reduces the equal error rate by 11.3% compared with the conventional i-vector and PLDA system.
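The adversarial part of the setup described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' implementation: the one-layer `generator`, the logistic `discriminator`, the choice to condition the discriminator by concatenating the short-utterance i-vector, the non-saturating generator loss, and the toy three-dimensional vectors. The two additional learning tasks mentioned in the abstract are not specified there and are not reproduced.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def generator(weights, bias, short_ivec):
    """Hypothetical one-layer generator: short-utterance i-vector -> compensated i-vector."""
    return [sum(w * x for w, x in zip(row, short_ivec)) + b
            for row, b in zip(weights, bias)]


def discriminator(weights, bias, ivec, short_ivec):
    """Hypothetical logistic discriminator, conditioned on the short-utterance i-vector.

    Returns the estimated probability that `ivec` was extracted from a long utterance.
    """
    features = list(ivec) + list(short_ivec)  # conditioning by concatenation (assumed)
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def adversarial_losses(g_params, d_params, short_ivec, long_ivec):
    """Standard GAN cross-entropy losses for one (short, long) i-vector pair."""
    g_w, g_b = g_params
    d_w, d_b = d_params
    generated = generator(g_w, g_b, short_ivec)
    p_real = discriminator(d_w, d_b, long_ivec, short_ivec)
    p_fake = discriminator(d_w, d_b, generated, short_ivec)
    # Discriminator: label long-utterance i-vectors 1 and generated ones 0.
    d_loss = -math.log(p_real) - math.log(1.0 - p_fake)
    # Generator (non-saturating form, assumed): make generated i-vectors look real.
    g_loss = -math.log(p_fake)
    return d_loss, g_loss


# Toy data: 3-dimensional stand-ins for real i-vectors (illustrative values only).
dim = 3
short_ivec = [0.2, -0.1, 0.4]
long_ivec = [0.3, 0.0, 0.5]
identity = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
g_params = (identity, [0.0] * dim)      # generator starts as the identity map
d_params = ([0.0] * (2 * dim), 0.0)     # untrained discriminator outputs 0.5

d_loss, g_loss = adversarial_losses(g_params, d_params, short_ivec, long_ivec)
print(d_loss, g_loss)
```

In a real system both networks would be deep models trained by alternating gradient updates on these two losses; the sketch only evaluates the losses once to show which quantity each network minimizes.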
Pages: 3613-3617
Page count: 5