Detecting Audio Deepfakes: Integrating CNN and BiLSTM with Multi-Feature Concatenation

Cited by: 0

Authors:

Wani, Taiba Majid [1]

Qadri, Syed Asif Ahmad [2]

Comminiello, Danilo [1]

Amerini, Irene [1]

Affiliations:

[1] Sapienza Univ Rome, Rome, Italy

[2] Natl Tsing Hua Univ, Hsinchu, Taiwan

Source:

PROCEEDINGS OF THE 2024 ACM WORKSHOP ON INFORMATION HIDING AND MULTIMEDIA SECURITY, IH&MMSEC 2024 | 2024

Keywords:

Audio Deepfakes; Feature Concatenation; MFCC; CQCC; CQT; Mel spectrograms; CNN; BiLSTM

DOI:

10.1145/3658664.3659647

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Audio deepfake detection is emerging as a crucial field in digital media, as distinguishing real audio from deepfakes becomes increasingly challenging due to the advancement of deepfake technologies. These methods threaten information authenticity and pose serious security risks. Addressing this challenge, we propose a novel architecture that combines Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) for effective deepfake audio detection. Our approach is distinguished by the feature concatenation of a comprehensive set of acoustic features: Mel Frequency Cepstral Coefficients (MFCC), Mel spectrograms, Constant-Q Cepstral Coefficients (CQCC), and Constant-Q Transform (CQT) vectors. In the proposed architecture, features processed by a CNN are concatenated into two multi-dimensional features for comprehensive analysis, then analyzed by a BiLSTM network to capture temporal dynamics and contextual dependencies in audio data. This synergistic method ensures an understanding of both spatial and sequential audio characteristics. We validate our model on the ASVspoof 2019 and FoR datasets, using accuracy and Equal Error Rate (EER) metrics for the evaluation.
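
The abstract names Equal Error Rate (EER) as an evaluation metric but this record does not include the authors' evaluation code. As a rough illustration only, the sketch below approximates EER in plain Python by sweeping a decision threshold and picking the point where the false acceptance rate and false rejection rate are closest. The function name `equal_error_rate`, the score convention (higher means "more likely real"), and the label encoding (1 = real, 0 = deepfake) are assumptions made for this example, not details taken from the paper.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER from detection scores.

    Assumed conventions (not from the paper):
      scores: higher value means "more likely real audio".
      labels: 1 for real audio, 0 for deepfake audio.
    """
    genuine = [s for s, y in zip(scores, labels) if y == 1]
    spoofed = [s for s, y in zip(scores, labels) if y == 0]
    if not genuine or not spoofed:
        raise ValueError("need at least one sample of each class")

    # Candidate thresholds: every observed score, plus one value above the
    # maximum so the "reject everything" operating point is also checked.
    thresholds = sorted(set(scores)) + [max(scores) + 1.0]

    best_gap, eer = None, None
    for t in thresholds:
        # False acceptance: a deepfake scored at or above the threshold.
        far = sum(s >= t for s in spoofed) / len(spoofed)
        # False rejection: real audio scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores: one real clip (0.4) ranks below one deepfake clip (0.6).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 1, 0, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

On a finite test set the two error rates rarely cross exactly, so this sketch reports the midpoint at the closest threshold; published evaluations typically interpolate along the error curve instead, which can give slightly different values.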

Pages: 271-276

Page count: 6
