Deepfake Audio Detection via MFCC Features Using Machine Learning

Authors
Hamza, Ameer [1]
Javed, Abdul Rehman [2, 3]
Iqbal, Farkhund [4]
Kryvinska, Natalia [5]
Almadhor, Ahmad S. [6]
Jalil, Zunera [2]
Borghol, Rouba [7]
Affiliations
[1] Air University, Faculty of Computing and AI, Islamabad, 44000, Pakistan
[2] Air University, Department of Cyber Security, Islamabad, 44000, Pakistan
[3] Lebanese American University, Department of Electrical and Computer Engineering, Byblos, Lebanon
[4] Zayed University, College of Technological Innovation, Abu Dhabi, United Arab Emirates
[5] Comenius University in Bratislava, Faculty of Management, Department of Information Systems, Bratislava, 82005, Slovakia
[6] Jouf University, College of Computer and Information Sciences, Sakaka, 72388, Saudi Arabia
[7] Rochester Institute of Technology of Dubai, Dubai, United Arab Emirates
Keywords
Audio acoustics - Deep learning - Learning algorithms - Speech recognition
DOI
Not available
Abstract
Deepfake content is created or altered synthetically using artificial intelligence (AI) approaches so that it appears real. It can include synthesized audio, video, images, and text. Deepfakes can now produce natural-looking content, which makes them harder to identify. Much progress has been made in identifying video deepfakes in recent years; however, most investigations into detecting audio deepfakes have employed the ASVSpoof or AVSpoof dataset together with various machine learning and deep learning algorithms. This research uses machine learning and deep learning approaches to identify deepfake audio. Mel-frequency cepstral coefficients (MFCCs) are used to extract the most useful information from the audio. We choose the Fake-or-Real dataset, which is the most recent benchmark dataset. The dataset was created with a text-to-speech model and is divided, according to audio length and bit rate, into four sub-datasets: for-rece, for-2-sec, for-norm, and for-original. The experimental results show that the support vector machine (SVM) outperformed the other machine learning (ML) models in terms of accuracy on the for-rece and for-2-sec datasets, while the gradient boosting model performed very well on the for-norm dataset. The VGG-16 model produced highly encouraging results when applied to the for-original dataset. It also outperforms other state-of-the-art approaches. © 2013 IEEE.
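
The MFCC step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the parameter values (16 kHz sample rate, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients, pre-emphasis of 0.97) are common textbook defaults assumed here, and the function names and the synthetic test tone are hypothetical. Averaging the coefficients over frames to get one vector per clip is also an assumption about how such features might be passed to a classifier such as an SVM.

```python
import numpy as np

def hz_to_mel(hz):
    # Standard mel-scale mapping (assumed variant: 2595 * log10(1 + f/700)).
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return MFCCs of shape (n_frames, n_coeffs) for a 1-D float signal."""
    # Pre-emphasis boosts high frequencies before framing.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # Power spectrum of each frame (zero-padded to n_fft points).
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2 / n_fft
    # Mel filterbank energies, then log compression (epsilon avoids log(0)).
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalized DCT-II over the filter axis; keep the first n_coeffs terms.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T

# Hypothetical input: one second of a 440 Hz tone standing in for a real clip.
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
features = mfcc(tone)                  # one row of coefficients per frame
clip_vector = features.mean(axis=0)    # fixed-length summary of the clip
print(features.shape, clip_vector.shape)
```

In practice a tested library routine would normally replace the hand-written filterbank and DCT, and the per-clip vectors, paired with real/fake labels, would be used to train the classifiers the abstract compares.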
Pages: 134018-134028