AntiFake: Using Adversarial Audio to Prevent Unauthorized Speech Synthesis

Cited by: 3
Authors
Yu, Zhiyuan [1 ]
Zhai, Shixuan [1 ]
Zhang, Ning [1 ]
Affiliation
[1] Washington University in St. Louis, St. Louis, MO 63110, USA
Source
Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS 2023), 2023
Keywords
Adversarial Machine Learning; Generative AI; Speech Synthesis; DeepFake Defense
DOI
10.1145/3576915.3623209
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid development of deep neural networks and generative AI has catalyzed growth in realistic speech synthesis. While this technology has great potential to improve lives, it has also led to the emergence of "DeepFakes", where synthesized speech can be misused to deceive humans and machines for nefarious purposes. In response to this evolving threat, there has been significant interest in mitigation through DeepFake detection. Complementary to existing work, we take a preventative approach and introduce AntiFake, a defense mechanism that relies on adversarial examples to prevent unauthorized speech synthesis. To ensure transferability to attackers' unknown synthesis models, an ensemble learning approach is adopted to improve the generalizability of the optimization process. To validate the efficacy of the proposed system, we evaluated AntiFake against five state-of-the-art synthesizers using real-world DeepFake speech samples. The experiments indicated that AntiFake achieved a protection rate of over 95%, even against unknown black-box models. We also conducted usability tests involving 24 human participants to ensure the solution is accessible to diverse populations.
Pages: 460-474
Number of pages: 15
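The abstract above outlines the core mechanism: a small, imperceptible perturbation is optimized into a speech recording so that speaker-encoding models no longer recover the original speaker's identity, and the optimization is run against an ensemble of surrogate encoders to improve transfer to unknown synthesizers. The sketch below illustrates only that general idea and is not the authors' implementation: the ToySpeakerEncoder class, the protect function, and all hyperparameters (eps, steps, lr) are hypothetical placeholders standing in for real pretrained speaker encoders and tuned perturbation budgets.

```python
# Minimal sketch (assumed setup, not the AntiFake code): optimize a bounded
# waveform perturbation against an ensemble of surrogate speaker encoders so
# that the perturbed speech no longer maps to the original speaker embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySpeakerEncoder(nn.Module):
    """Hypothetical stand-in for a pretrained speaker encoder (waveform -> embedding)."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, emb_dim),
        )

    def forward(self, wav):                         # wav: (batch, samples)
        return F.normalize(self.net(wav.unsqueeze(1)), dim=-1)

def protect(wav, encoders, eps=0.002, steps=200, lr=1e-3):
    """Craft a perturbation delta with ||delta||_inf <= eps that pushes the
    perturbed speech's embedding away from the clean one, averaged over an
    ensemble of surrogate encoders to encourage transfer to unseen models."""
    delta = torch.zeros_like(wav, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        targets = [enc(wav) for enc in encoders]    # clean speaker embeddings
    for _ in range(steps):
        opt.zero_grad()
        adv = (wav + delta).clamp(-1.0, 1.0)
        # Minimize cosine similarity to the clean embedding for every encoder.
        loss = torch.stack([F.cosine_similarity(enc(adv), t, dim=-1).mean()
                            for enc, t in zip(encoders, targets)]).mean()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                 # keep perturbation small
    return (wav + delta).detach().clamp(-1.0, 1.0)

if __name__ == "__main__":
    wav = torch.randn(1, 16000) * 0.1               # placeholder 1 s clip @ 16 kHz
    ensemble = [ToySpeakerEncoder() for _ in range(3)]
    protected = protect(wav, ensemble)
    print((protected - wav).abs().max())            # perturbation stays within eps
```

In a realistic setting, the surrogate ensemble would consist of pretrained speaker encoders drawn from different synthesis pipelines, and the perturbation budget would be chosen so the protected audio remains perceptually indistinguishable from the original.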