FTDKD: Frequency-Time Domain Knowledge Distillation for Low-Quality Compressed Audio Deepfake Detection

Cited by: 0
Authors
Wang, Bo [1]
Tang, Yeling [1]
Wei, Fei [2]
Ba, Zhongjie [3]
Ren, Kui [3]
Affiliations
[1] Dalian Univ Technol, Sch Informat & Commun Engn, Dalian 116081, Peoples R China
[2] Alibaba Grp, Hangzhou 311121, Zhejiang, Peoples R China
[3] Zhejiang Univ, Sch Cyber Sci & Technol, Hangzhou 310027, Zhejiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Audio deepfake detection; low-quality compressed audio; knowledge distillation
DOI
10.1109/TASLP.2024.3492796
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In recent years, the field of audio deepfake detection has witnessed significant advancements. Nonetheless, the majority of solutions have concentrated on high-quality audio, largely overlooking the challenge of low-quality compressed audio in real-world scenarios. Low-quality compressed audio typically suffers from a loss of high-frequency details and time-domain information, which significantly undermines the performance of advanced deepfake detection systems when confronted with such data. In this paper, we introduce a deepfake detection model that employs knowledge distillation across the frequency and time domains. Our approach aims to train a teacher model with high-quality data and a student model with low-quality compressed data. Subsequently, we implement frequency-domain and time-domain distillation to facilitate the student model's learning of high-frequency information and time-domain details from the teacher model. Experimental evaluations on the ASVspoof 2019 LA and ASVspoof 2021 DF datasets illustrate the effectiveness of our methodology. On the ASVspoof 2021 DF dataset, which consists of low-quality compressed audio, we achieved an Equal Error Rate (EER) of 2.82%. To our knowledge, this performance is the best among all deepfake voice detection systems tested on the ASVspoof 2021 DF dataset. Additionally, our method proves to be versatile, showing notable performance on high-quality data with an EER of 0.30% on the ASVspoof 2019 LA dataset, closely approaching state-of-the-art results.
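The abstract reports its results as Equal Error Rate (EER), the operating point at which the false acceptance rate on spoofed audio equals the false rejection rate on bona fide audio. The following is a minimal sketch of how such a figure can be computed from detector scores. It assumes higher scores mean "more likely bona fide"; the function name `equal_error_rate` and the toy scores are hypothetical and are not taken from the paper or from the official ASVspoof scoring tools.

```python
import numpy as np


def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more likely bona fide'."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER is read off where the two error rates are closest to each other.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]


if __name__ == "__main__":
    # Toy scores for illustration only (not data from the paper).
    eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.6], [0.1, 0.2, 0.3, 0.65])
    print(f"EER = {eer:.2%} at threshold {thr}")
```

Under this definition, an EER of 2.82% means that at the balancing threshold about 2.82% of spoofed trials are accepted and about 2.82% of bona fide trials are rejected. Production scoring scripts usually interpolate between thresholds instead of picking the nearest one, so small numerical differences from this sketch are expected.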
Pages: 4905-4918
Page count: 14