Detecting Audio Deepfakes: Integrating CNN and BiLSTM with Multi-Feature Concatenation

Authors
Wani, Taiba Majid [1]
Qadri, Syed Asif Ahmad [2]
Comminiello, Danilo [1]
Amerini, Irene [1]
Affiliations
[1] Sapienza Univ Rome, Rome, Italy
[2] Natl Tsing Hua Univ, Hsinchu, Taiwan
Keywords
Audio Deepfakes; Feature Concatenation; MFCC; CQCC; CQT; Mel spectrograms; CNN; BiLSTM
DOI
10.1145/3658664.3659647
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Audio deepfake detection is emerging as a crucial field in digital media, as distinguishing real audio from deepfakes becomes increasingly challenging with the advancement of deepfake technologies. These technologies threaten information authenticity and pose serious security risks. To address this challenge, we propose a novel architecture that combines Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) for effective deepfake audio detection. Our approach is distinguished by the concatenation of a comprehensive set of acoustic features: Mel Frequency Cepstral Coefficients (MFCC), Mel spectrograms, Constant Q Cepstral Coefficients (CQCC), and Constant-Q Transform (CQT) vectors. In the proposed architecture, features processed by a CNN are concatenated into two multi-dimensional features for comprehensive analysis, then passed to a BiLSTM network that captures temporal dynamics and contextual dependencies in the audio data. This combined method captures both spatial and sequential audio characteristics. We validate our model on the ASVspoof 2019 and FoR datasets, using accuracy and Equal Error Rate (EER) as evaluation metrics.
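The abstract names Equal Error Rate (EER) as an evaluation metric without describing how it is computed. As a minimal sketch, not the authors' evaluation code: EER is the operating point at which the false acceptance rate (spoofed audio accepted as real) equals the false rejection rate (real audio rejected). The function name `equal_error_rate` and the convention that a higher score means "more likely bona fide" are assumptions made for illustration; the record does not state the scoring convention used in the paper.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more likely bona fide'.

    Hypothetical helper for illustration; sweeps every observed score as a
    candidate threshold and keeps the one where FAR and FRR are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection rate: bona fide clips scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed clips scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (made up for the example, not results from the paper).
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(bonafide, spoof)
print(f"EER = {eer:.2f} at threshold {threshold}")  # EER = 0.25 at threshold 0.6
```

With finite score sets the two error rates rarely cross exactly, so this sketch reports the midpoint at the closest threshold; evaluation toolkits commonly interpolate between thresholds instead.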
Pages: 271-276
Page count: 6