The Catcher in the Field: A Fieldprint based Spoofing Detection for Text-Independent Speaker Verification

Cited by: 37

Authors:

Yan, Chen [1]

Long, Yan [1]

Ji, Xiaoyu [1]

Xu, Wenyuan [1]

Affiliation:

[1] Zhejiang Univ, Hangzhou, Zhejiang, Peoples R China

Source:

PROCEEDINGS OF THE 2019 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY (CCS'19) | 2019

Funding:

National Key Research and Development Program of China;

Keywords:

fieldprint; speaker verification; spoofing attack; sound field; SPEECH; DIRECTIVITY; RECOGNITION; NOISE;

DOI:

10.1145/3319535.3354248

Chinese Library Classification (CLC):

TP [Automation and Computer Technology];

Discipline classification code:

0812;

Abstract:

Verifying the identity of voice inputs is important as voices are increasingly used for sensitive operations. Traditional methods focus on differentiating individuals via the spectrographic features of voices (e.g., voiceprint), yet cannot cope with spoofing attacks, whereby a malicious attacker synthesizes a voice with almost the same voiceprint as the victim's or simply replays it. This paper proposes CaField, a text-independent speaker verification method to detect loudspeaker-based voice spoofing attacks with the goal of achieving two seemingly conflicting requirements: usability and security. The key insight of CaField is to construct a "fieldprint", analogous to a "voiceprint", from the acoustic biometrics embedded in sound fields, i.e., the physical field of acoustic energy created as sound propagates through the air. We find that fieldprints can be distinctive between speakers (either humans or loudspeakers), and thus we may distinguish the loudspeakers used for spoofing attacks from the authentic users. Our evaluation on a dataset of 20 people and 8 loudspeakers shows that by relying on two on-board microphones to sample sound fields while users talk to the smartphones, CaField achieves a detection accuracy of 99.16% and an equal error rate (EER) of 0.85% across multiple sessions and various voice inputs. CaField supports audio sample rates as low as 8 kHz and is robust to various factors including phone displacement, user posture, and recording environment.
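
The abstract reports an equal error rate (EER), the operating point at which the false acceptance rate and false rejection rate coincide. The following is a minimal sketch of how an EER can be computed from verification scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the "accept when score >= threshold" convention, and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumption: a higher score means "more likely the authentic user",
    and an input is accepted when its score is >= the threshold.
    """
    # Candidate thresholds: every distinct observed score.
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False acceptance rate: spoofed inputs that are accepted.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: genuine inputs that are rejected.
        frr = sum(g < t for g in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy, hypothetical scores (not data from the paper).
genuine = [0.9, 0.8, 0.7, 0.6]
spoof = [0.1, 0.2, 0.3, 0.65]
eer, threshold = equal_error_rate(genuine, spoof)
```

With real data, the scores would come from the verifier's output for authentic and spoofed trials; a finer threshold grid or interpolation between neighboring thresholds gives a smoother estimate when FAR and FRR do not cross exactly.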

Pages: 1215-1229

Page count: 15
