Deepfake Audio Detection via MFCC Features Using Machine Learning

Cited by: 0
Authors
Hamza, Ameer [1]
Javed, Abdul Rehman [2,3]
Iqbal, Farkhund [4]
Kryvinska, Natalia [5]
Almadhor, Ahmad S. [6]
Jalil, Zunera [2]
Borghol, Rouba [7]
Affiliations
[1] Air University, Faculty of Computing and AI, Islamabad 44000, Pakistan
[2] Air University, Department of Cyber Security, Islamabad 44000, Pakistan
[3] Lebanese American University, Department of Electrical and Computer Engineering, Byblos, Lebanon
[4] Zayed University, College of Technological Innovation, Abu Dhabi, United Arab Emirates
[5] Comenius University in Bratislava, Faculty of Management, Department of Information Systems, Bratislava 82005, Slovakia
[6] Jouf University, College of Computer and Information Sciences, Sakaka 72388, Saudi Arabia
[7] Rochester Institute of Technology of Dubai, Dubai, United Arab Emirates
Keywords
Audio acoustics; Deep learning; Learning algorithms; Speech recognition
DOI: Not available
Abstract
Deepfake content is created or altered synthetically using artificial intelligence (AI) approaches to appear real; it can include synthesized audio, video, images, and text. Deepfakes can now produce natural-looking content, making them harder to identify. Much progress has been made in detecting video deepfakes in recent years; by contrast, most investigations of audio deepfake detection have relied on the ASVSpoof or AVSpoof datasets together with various machine learning and deep learning algorithms. This research uses machine learning and deep learning approaches to identify deepfake audio. The Mel-frequency cepstral coefficients (MFCC) technique is used to extract the most useful information from the audio. We choose the Fake-or-Real dataset, the most recent benchmark dataset, which was created with a text-to-speech model and is divided into four sub-datasets according to audio length and bit rate: for-rerec, for-2-sec, for-norm, and for-original. The experimental results show that the support vector machine (SVM) outperformed the other machine learning (ML) models in accuracy on the for-rerec and for-2-sec datasets, while the gradient boosting model performed best on the for-norm dataset. The VGG-16 model produced highly encouraging results on the for-original dataset and outperforms other state-of-the-art approaches. © 2013 IEEE.
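As a rough illustration of the pipeline the abstract describes (MFCC feature extraction followed by a classical classifier), the sketch below uses Python with librosa and scikit-learn. The file lists, MFCC settings, and SVM hyperparameters are assumptions for illustration, not the authors' exact configuration.

# Minimal sketch of an MFCC + SVM deepfake-audio classifier, assuming the
# librosa and scikit-learn libraries; the settings below are illustrative only.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def extract_mfcc(path, n_mfcc=13, sr=16000):
    # Load the waveform and compute MFCCs, then average over time so every
    # clip yields a fixed-length feature vector regardless of its duration.
    signal, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

def build_dataset(real_files, fake_files):
    # real_files / fake_files are hypothetical lists of .wav paths drawn from
    # one Fake-or-Real sub-dataset (e.g., for-2-sec); label 0 = real, 1 = fake.
    X = np.array([extract_mfcc(p) for p in real_files + fake_files])
    y = np.array([0] * len(real_files) + [1] * len(fake_files))
    return X, y

def train_and_evaluate(X, y):
    # Hold out 20% of the clips and report classification accuracy.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42)
    clf = SVC(kernel="rbf", C=1.0)  # illustrative hyperparameters
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))

A deep-learning variant along the lines of the VGG-16 results reported above would instead feed MFCC or spectrogram images into a convolutional network; the classical pipeline sketched here only captures the MFCC-plus-classifier idea the paper evaluates.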
Pages: 134018-134028
Related Papers (50 in total)
  • [41] Securing Voice Biometrics: One-Shot Learning Approach for Audio Deepfake Detection
    Khan, Awais
    Malik, Khalid Mahmood
    2023 IEEE INTERNATIONAL WORKSHOP ON INFORMATION FORENSICS AND SECURITY, WIFS, 2023,
  • [42] Recognition of isolated words using Zernike and MFCC features for audio visual speech recognition
    Borde, Prashant
    Varpe, Amarsinh
    Manza, Ramesh
    Yannawar, Pravin
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2015, 18 (02) : 167 - 175
  • [43] COVID-19 detection from optimized features of breathing audio signals using explainable ensemble machine learning
    Sultana, Shafrin
    Hossain, A. B. M. Aowlad
    Alam, Jahangir
    RESULTS IN CONTROL AND OPTIMIZATION, 2025, 18
  • [44] Multiclass Digital Audio Segmentation with MFCC Features using Naive Bayes and SVM Classifiers
    Iheme, Leonardo O.
    Ozan, Sukru
    2019 INNOVATIONS IN INTELLIGENT SYSTEMS AND APPLICATIONS CONFERENCE (ASYU), 2019, : 468 - 472
  • [45] Deepfake video detection using deep learning algorithms
    Korkmaz, Sahin
    Alkan, Mustafa
    JOURNAL OF POLYTECHNIC-POLITEKNIK DERGISI, 2023, 26 (02): : 855 - 862
  • [46] Review of audio deepfake detection techniques: Issues and prospects
    Dixit, Abhishek
    Kaur, Nirmal
    Kingra, Staffy
    EXPERT SYSTEMS, 2023, 40 (08)
  • [47] Defense Against Adversarial Attacks on Audio DeepFake Detection
    Kawa, Piotr
    Plata, Marcin
    Syga, Piotr
    INTERSPEECH 2023, 2023, : 5276 - 5280
  • [48] Audio Based Violent Scene Detection Using extreme Learning Machine Algorithm
    Mahalle, Mrunali D.
    Rojatkar, Dinesh V.
    2021 6TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2021,
  • [49] Correction: Automatic hate speech detection in audio using machine learning algorithms
    Imbwaga, Joan L.
    Chittaragi, Nagaratna B.
    Koolagudi, Shashidhar G.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2025, 28 (01) : 313 - 313
  • [50] A lightweight feature extraction technique for deepfake audio detection
    Chakravarty, Nidhi
    Dua, Mohit
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (26) : 67443 - 67467