Deepfake Audio Detection via MFCC Features Using Machine Learning

Cited by: 0
Authors:
Hamza, Ameer [1]
Javed, Abdul Rehman [2,3]
Iqbal, Farkhund [4]
Kryvinska, Natalia [5]
Almadhor, Ahmad S. [6]
Jalil, Zunera [2]
Borghol, Rouba [7]
Affiliations:
[1] Air University, Faculty of Computing and AI, Islamabad 44000, Pakistan
[2] Air University, Department of Cyber Security, Islamabad 44000, Pakistan
[3] Lebanese American University, Department of Electrical and Computer Engineering, Byblos, Lebanon
[4] Zayed University, College of Technological Innovation, Abu Dhabi, United Arab Emirates
[5] Comenius University in Bratislava, Faculty of Management, Department of Information Systems, Bratislava 82005, Slovakia
[6] Jouf University, College of Computer and Information Sciences, Sakaka 72388, Saudi Arabia
[7] Rochester Institute of Technology of Dubai, Dubai, United Arab Emirates
Keywords:
Audio acoustics; Deep learning; Learning algorithms; Speech recognition
DOI: not available
Abstract:
Deepfake content is created or altered synthetically using artificial intelligence (AI) approaches to appear real. It can include synthesized audio, video, images, and text. Deepfakes can now produce natural-looking content, making them harder to identify. Much progress has been made in identifying video deepfakes in recent years; however, most investigations into detecting audio deepfakes have employed the ASVspoof or AVspoof datasets together with various machine learning and deep learning algorithms. This research uses machine learning and deep learning-based approaches to identify deepfake audio. The Mel-frequency cepstral coefficients (MFCC) technique is used to extract the most useful information from the audio. We choose the Fake-or-Real dataset, which is the most recent benchmark dataset. The dataset was created with a text-to-speech model and is divided into four sub-datasets according to audio length and bit rate: for-rerec, for-2-sec, for-norm, and for-original. The experimental results show that the support vector machine (SVM) outperformed the other machine learning (ML) models in terms of accuracy on the for-rerec and for-2-sec datasets, while the gradient boosting model performed very well on the for-norm dataset. The VGG-16 model produced highly encouraging results when applied to the for-original dataset and outperforms other state-of-the-art approaches.
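To make the pipeline described in the abstract concrete, the sketch below shows one plausible way to extract per-clip MFCC vectors and train an SVM classifier. It is a minimal illustration, not the authors' implementation: the librosa/scikit-learn calls, the real/fake directory layout, and the "for-2-sec/training" path are assumptions chosen for the example.

# Minimal MFCC + SVM sketch (illustrative only; not the paper's code).
# Assumes clips are stored as <root>/real/*.wav and <root>/fake/*.wav.
import glob
import os

import librosa
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def mfcc_features(path, n_mfcc=40):
    """Load one clip and summarize it as the mean MFCC vector."""
    signal, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)  # fixed-length feature vector per clip


def load_dataset(root):
    """Collect MFCC vectors and binary labels (real = 0, fake = 1)."""
    X, y = [], []
    for label, name in enumerate(("real", "fake")):
        for path in glob.glob(os.path.join(root, name, "*.wav")):
            X.append(mfcc_features(path))
            y.append(label)
    return np.array(X), np.array(y)


if __name__ == "__main__":
    X, y = load_dataset("for-2-sec/training")  # hypothetical dataset path
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = SVC(kernel="rbf").fit(X_tr, y_tr)
    print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))

A deep model such as VGG-16, which the abstract reports for the for-original subset, would presumably take the full two-dimensional MFCC matrix as input instead of the per-clip mean vector used above.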
Published in: IEEE Access, 2022, Vol. 10, pp. 134018-134028
Related Papers (50 records)
  • [1] Deepfake Audio Detection via MFCC Features Using Machine Learning
    Hamza, Ameer
    Javed, Abdul Rehman
    Iqbal, Farkhund
    Kryvinska, Natalia
    Almadhor, Ahmad S.
    Jalil, Zunera
    Borghol, Rouba
    IEEE ACCESS, 2022, 10: 134018-134028
  • [2] Comparative analysis of audio classification with MFCC and STFT features using machine learning techniques
    Gourisaria, M. K.
    Agrawal, R.
    Sahni, M.
    Singh, P. K.
    DISCOVER INTERNET OF THINGS, 2024, 4 (01)
  • [3] A Deep Learning Framework for Audio Deepfake Detection
    Khochare, Janavi
    Joshi, Chaitali
    Yenarkar, Bakul
    Suratkar, Shraddha
    Kazi, Faruk
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2022, 47 (03): 3447-3458
  • [4] Audio-visual deepfake detection using articulatory representation learning
    Wang, Yujia
    Huang, Hua
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 248
  • [5] Research on Pornographic Audio Detection Algorithm Using MFCC Features and Vector Quantization
    Yu, Yan-shan
    Qu, Zhi-yi
    Yu, Zhen-dong
    INTERNATIONAL CONFERENCE ON COMPUTER, NETWORK SECURITY AND COMMUNICATION ENGINEERING (CNSCE 2014), 2014: 53-58
  • [6] Machine Learning Inspired Efficient Audio Drone Detection using Acoustic Features
    Salman, Soha
    Mir, Junaid
    Farooq, Muhammad Tallal
    Malik, Aneeqa Noor
    Haleemdeen, Rizki
    PROCEEDINGS OF 2021 INTERNATIONAL BHURBAN CONFERENCE ON APPLIED SCIENCES AND TECHNOLOGIES (IBCAST), 2021: 335-339
  • [7] Domain Generalization via Aggregation and Separation for Audio Deepfake Detection
    Xie, Yuankun
    Cheng, Haonan
    Wang, Yutian
    Ye, Long
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19: 344-358
  • [8] Speech Audio Deepfake Detection via Convolutional Neural Networks
    Valente, Lucas P.
    de Souza, Marcelo M. S.
    da Rocha, Alan M.
    IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS 2024, IEEE EAIS 2024, 2024: 382-387
  • [9] Efficient Deepfake Audio Detection Using Spectro-Temporal Analysis and Deep Learning
    Sunkari, Venkateswarlu
    Srinagesh, A.
    JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (05): 10-18