On the study of replay and voice conversion attacks to text-dependent speaker verification

Cited by: 37
Authors
Wu, Zhizheng [1]
Li, Haizhou [2]
Affiliations
[1] Univ Edinburgh, CSTR, Edinburgh, Midlothian, Scotland
[2] Inst Infocomm Res, Human Language Technol Dept, Singapore, Singapore
Keywords
Speaker verification; Spoofing attack; Replay; Voice conversion; Security; RECOGNITION; SECURITY; ADAPTATION;
DOI
10.1007/s11042-015-3080-9
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Automatic speaker verification (ASV) is the task of automatically accepting or rejecting a claimed identity based on a speech sample. Recently, individual studies have confirmed the vulnerability of state-of-the-art text-independent ASV systems to replay, speech synthesis and voice conversion attacks on various databases. However, the behaviour of text-dependent ASV systems has not been systematically assessed in the face of various spoofing attacks. In this work, we first conduct a systematic analysis of the vulnerability of text-dependent ASV systems to replay and voice conversion attacks using the same protocol and database, in particular the RSR2015 database, which represents mobile-device-quality speech. We then analyse the interplay of voice conversion and speaker verification by linking the voice conversion objective evaluation measures with the speaker verification error rates, so as to examine the vulnerabilities from the perspective of voice conversion.
Pages: 5311 - 5327
Page count: 17
Related papers
50 in total
  • [41] A Phonetic Alternative to Cross-language Voice Conversion in a Text-dependent Context: Evaluation of Speaker Identity
    Yanagisawa, Kayoko
    Huckvale, Mark
    [J]. 11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2150 - 2153
  • [42] Text-dependent speaker verification: Classifiers, databases and RSR2015
    Larcher, Anthony
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    [J]. SPEECH COMMUNICATION, 2014, 60 : 56 - 77
  • [43] Parameterization of the score threshold for a text-dependent adaptive speaker verification system
    Mirghafori, N
    Hébert, M
    [J]. 2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 361 - 364
  • [44] DEEP NEURAL NETWORKS FOR SMALL FOOTPRINT TEXT-DEPENDENT SPEAKER VERIFICATION
    Variani, Ehsan
    Lei, Xin
    McDermott, Erik
    Moreno, Ignacio Lopez
    Gonzalez-Dominguez, Javier
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [45] Improving X-vector and PLDA for Text-dependent Speaker Verification
    Chen, Zhuxin
    Lin, Yue
    [J]. INTERSPEECH 2020, 2020, : 726 - 730
  • [46] DNN BASED SPEAKER EMBEDDING USING CONTENT INFORMATION FOR TEXT-DEPENDENT SPEAKER VERIFICATION
    Dey, Subhadeep
    Koshinaka, Takafumi
    Motlicek, Petr
    Madikeri, Srikanth
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5344 - 5348
  • [47] END-TO-END ATTENTION BASED TEXT-DEPENDENT SPEAKER VERIFICATION
    Zhang, Shi-Xiong
    Chen, Zhuo
    Zhao, Yong
    Li, Jinyu
    Gong, Yifan
    [J]. 2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 171 - 178
  • [48] Lexicon-Based Local Representation for Text-Dependent Speaker Verification
    You, Hanxu
    Li, Wei
    Li, Lianqiang
    Zhu, Jie
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2017, E100D (03): : 587 - 589
  • [49] Text-dependent Speaker Verification Using Word-based Scoring
    Yao, Shengyu
    Huang, Houjun
    Zhou, Ruohua
    Yan, Yonghong
    [J]. 2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 314 - 318
  • [50] APPLICATION OF DYNAMIC TIME WARPING AND CEPSTROGRAMS TO TEXT-DEPENDENT SPEAKER VERIFICATION
    Kaczmarek, Andrzej
    Staworko, Michal
    [J]. SPA 2009: SIGNAL PROCESSING ALGORITHMS, ARCHITECTURES, ARRANGEMENTS, AND APPLICATIONS CONFERENCE PROCEEDINGS, 2009, : 169 - +