AntiFake: Using Adversarial Audio to Prevent Unauthorized Speech Synthesis

Cited by: 3
Authors
Yu, Zhiyuan [1 ]
Zhai, Shixuan [1 ]
Zhang, Ning [1 ]
Affiliation
[1] Washington University in St. Louis, St. Louis, MO 63110, USA
Source
Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS 2023), 2023
Keywords
Adversarial Machine Learning; Generative AI; Speech Synthesis; DeepFake Defense
DOI
10.1145/3576915.3623209
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid development of deep neural networks and generative AI has catalyzed growth in realistic speech synthesis. While this technology has great potential to improve lives, it has also led to the emergence of "DeepFakes", where synthesized speech can be misused to deceive humans and machines for nefarious purposes. In response to this evolving threat, there has been significant interest in mitigation through DeepFake detection. Complementary to existing work, we take a preventative approach and introduce AntiFake, a defense mechanism that relies on adversarial examples to prevent unauthorized speech synthesis. To ensure transferability to attackers' unknown synthesis models, an ensemble learning approach is adopted to improve the generalizability of the optimization process. To validate the efficacy of the proposed system, we evaluated AntiFake against five state-of-the-art synthesizers using real-world DeepFake speech samples. The experiments indicated that AntiFake achieved a protection rate of over 95%, even against unknown black-box models. We also conducted usability tests involving 24 human participants to ensure the solution is accessible to diverse populations.
Pages: 460-474
Number of pages: 15
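The abstract above outlines the core mechanism: a small, imperceptible perturbation is optimized into a speech recording so that speaker-encoding models no longer recover the original speaker's identity, and the optimization is run against an ensemble of surrogate encoders to improve transfer to unknown synthesizers. The sketch below illustrates only that general idea and is not the authors' implementation: the ToySpeakerEncoder class, the protect function, and all hyperparameters (eps, steps, lr) are hypothetical placeholders standing in for real pretrained speaker encoders and tuned perturbation budgets.

```python
# Minimal sketch (assumed setup, not the AntiFake code): optimize a bounded
# waveform perturbation against an ensemble of surrogate speaker encoders so
# that the perturbed speech no longer maps to the original speaker embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToySpeakerEncoder(nn.Module):
    """Hypothetical stand-in for a pretrained speaker encoder (waveform -> embedding)."""
    def __init__(self, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=9, stride=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(), nn.Linear(32, emb_dim),
        )

    def forward(self, wav):                         # wav: (batch, samples)
        return F.normalize(self.net(wav.unsqueeze(1)), dim=-1)

def protect(wav, encoders, eps=0.002, steps=200, lr=1e-3):
    """Craft a perturbation delta with ||delta||_inf <= eps that pushes the
    perturbed speech's embedding away from the clean one, averaged over an
    ensemble of surrogate encoders to encourage transfer to unseen models."""
    delta = torch.zeros_like(wav, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    with torch.no_grad():
        targets = [enc(wav) for enc in encoders]    # clean speaker embeddings
    for _ in range(steps):
        opt.zero_grad()
        adv = (wav + delta).clamp(-1.0, 1.0)
        # Minimize cosine similarity to the clean embedding for every encoder.
        loss = torch.stack([F.cosine_similarity(enc(adv), t, dim=-1).mean()
                            for enc, t in zip(encoders, targets)]).mean()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                 # keep perturbation small
    return (wav + delta).detach().clamp(-1.0, 1.0)

if __name__ == "__main__":
    wav = torch.randn(1, 16000) * 0.1               # placeholder 1 s clip @ 16 kHz
    ensemble = [ToySpeakerEncoder() for _ in range(3)]
    protected = protect(wav, ensemble)
    print((protected - wav).abs().max())            # perturbation stays within eps
```

In a realistic setting, the surrogate ensemble would consist of pretrained speaker encoders drawn from different synthesis pipelines, and the perturbation budget would be chosen so the protected audio remains perceptually indistinguishable from the original.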