The Catcher in the Field: A Fieldprint based Spoofing Detection for Text-Independent Speaker Verification

Cited by: 37
Authors
Yan, Chen [1]
Long, Yan [1]
Ji, Xiaoyu [1]
Xu, Wenyuan [1]
Affiliations
[1] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
fieldprint; speaker verification; spoofing attack; sound field; SPEECH; DIRECTIVITY; RECOGNITION; NOISE;
DOI
10.1145/3319535.3354248
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Verifying the identity of voice inputs is important as voices are increasingly used for sensitive operations. Traditional methods focus on differentiating individuals via the spectrographic features of voices (e.g., voiceprint), yet cannot cope with spoofing attacks, whereby a malicious attacker synthesizes a voice with almost the same voiceprint as the victim's or simply replays a recording of it. This paper proposes CaField, a text-independent speaker verification method that detects loudspeaker-based voice spoofing attacks while meeting two seemingly conflicting requirements: usability and security. The key insight of CaField is to construct a "fieldprint", analogous to a "voiceprint", from the acoustic biometrics embedded in sound fields, i.e., the physical fields of acoustic energy created as sound propagates through the air. We find that fieldprints can be distinctive between speakers (either humans or loudspeakers), and thus we may distinguish the loudspeakers used for spoofing attacks from authentic users. Our evaluation on a dataset of 20 people and 8 loudspeakers shows that, by relying on two on-board microphones to sample sound fields while users talk to their smartphones, CaField achieves a detection accuracy of 99.16% and an equal error rate (EER) of 0.85% across multiple sessions and various voice inputs. CaField supports audio sample rates as low as 8 kHz and is robust to various factors, including phone displacement, user posture, and recording environment.
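For readers unfamiliar with the equal error rate (EER) reported above, the following is a minimal sketch of how an EER can be computed from verification scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the score lists, and the convention that a higher score means "more likely an authentic user" are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Return (eer, threshold) for two lists of scores.

    Assumption: an input is accepted when its score >= threshold.
    The EER is taken at the threshold where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest.
    """
    candidates = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, best_eer, best_threshold = None, None, None
    for threshold in candidates:
        # FAR: fraction of spoofed inputs that are wrongly accepted.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        # FRR: fraction of genuine inputs that are wrongly rejected.
        frr = sum(g < threshold for g in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2
            best_threshold = threshold
    return best_eer, best_threshold


# Hypothetical scores, invented purely for this illustration.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
spoof_scores = [0.1, 0.2, 0.3, 0.6]

eer, threshold = equal_error_rate(genuine_scores, spoof_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real data, the two lists would hold the detector's scores for authentic utterances and for loudspeaker-played utterances; a lower EER means both error types can be kept small at the same operating point.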
Pages: 1215-1229
Page count: 15
Related papers
50 in total
  • [41] FACTORED COVARIANCE MODELING FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Wang, Eryu
    Lee, Kong Aik
    Ma, Bin
    Li, Haizhou
    Guo, Wu
    Dai, Lirong
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4856 - 4859
  • [42] Exploration of Local Variability in Text-Independent Speaker Verification
    Liping Chen
    Kong Aik Lee
    Bin Ma
    Wu Guo
    Haizhou Li
    Li-Rong Dai
    Journal of Signal Processing Systems, 2016, 82 : 217 - 228
  • [43] Text-independent speaker verification using covariance modeling
    Zilca, RD
    IEEE SIGNAL PROCESSING LETTERS, 2001, 8 (04) : 97 - 99
  • [44] Text-independent speaker verification with dynamic trajectory model
    Xiang, B
    IEEE SIGNAL PROCESSING LETTERS, 2003, 10 (05) : 141 - 143
  • [45] A ROBUST TEXT-INDEPENDENT SPEAKER VERIFICATION METHOD BASED ON SPEECH SEPARATION AND DEEP SPEAKER
    Zhao, Fei
    Li, Hao
    Zhang, Xueliang
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6101 - 6105
  • [46] A CORRECTIVE LEARNING APPROACH FOR TEXT-INDEPENDENT SPEAKER VERIFICATION
    Wen, Yandong
    Zhou, Tianyan
    Singh, Rita
    Raj, Bhiksha
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 4894 - 4898
  • [47] Self-Attentive Speaker Embeddings for Text-Independent Speaker Verification
    Zhu, Yingke
    Ko, Tom
    Snyder, David
    Mak, Brian
    Povey, Daniel
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 3573 - 3577
  • [48] Speaker adaptive cohort selection for Tnorm in text-independent speaker verification
    Sturim, DE
    Reynolds, DA
    2005 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-5: SPEECH PROCESSING, 2005, : 741 - 744
  • [49] Significance of Constraining Text in Limited Data Text-independent Speaker Verification
    Das, Rohan Kumar
    Jelil, Sarfaraz
    Prasanna, S. R. Mahadeva
    2016 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2016,
  • [50] Text-Independent Speaker Verification Based on Triplet Convolutional Neural Network Embeddings
    Zhang, Chunlei
    Koishida, Kazuhito
    Hansen, John H. L.
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (09) : 1633 - 1644