Audio Anti-Spoofing Based on Audio Feature Fusion

Cited by: 1
Authors
Zhang, Jiachen [1 ]
Tu, Guoqing [1 ]
Liu, Shubo [2 ]
Cai, Zhaohui [2 ]
Affiliations
[1] Wuhan Univ, Sch Cyber Sci & Engn, Key Lab Aerosp Informat Secur & Trusted Comp, Minist Educ, Wuhan 430072, Peoples R China
[2] Wuhan Univ, Sch Comp Sci, Wuhan 430072, Peoples R China
Keywords
deep learning; wav2vec 2.0; automatic speaker verification; deep-fake detection; ASVspoof Challenge
DOI
10.3390/a16070317
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The rapid development of speech synthesis technology has significantly improved the naturalness and human-likeness of synthetic speech. As the technical barriers to speech synthesis fall, illegal activities such as fraud and extortion are increasing, posing a significant threat to authentication systems such as automatic speaker verification. To keep pace with constantly evolving synthesis techniques and to improve the accuracy of detecting synthetic speech, this paper proposes an end-to-end synthetic speech detection model based on audio feature fusion. The model uses a pre-trained wav2vec 2.0 model to extract features from raw waveforms and an audio feature fusion module for back-end classification. The fusion module aims to improve accuracy by making full use of the features extracted by the front end and by fusing information across time frames and feature dimensions. Data augmentation is also used to improve the model's generalization. The model is trained on the training and development sets of the logical access (LA) dataset of the ASVspoof 2019 Challenge, an international standard benchmark, and is tested on the LA and deep-fake (DF) evaluation datasets of the ASVspoof 2021 Challenge. The equal error rates (EERs) on ASVspoof 2021 LA and ASVspoof 2021 DF are 1.18% and 2.62%, respectively; the authors report the latter as the best result on the DF dataset.
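The abstract reports its results as equal error rates. As a rough illustration of that metric only, the sketch below finds the score threshold at which the false acceptance rate (spoofed audio accepted) and the false rejection rate (bona fide audio rejected) are closest, and reports their average. It is not the official ASVspoof scoring tool and not the authors' code; the function name `compute_eer` and the convention that a higher score means "more likely bona fide" are assumptions made for this example.

```python
import numpy as np


def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the equal error rate (EER) from two score lists.

    Assumption for this sketch: a higher score means the detector
    considers the utterance more likely to be bona fide.
    Returns (eer, threshold).
    """
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([bonafide, spoof]))

    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])

    # The EER lies where the two error curves cross; take the closest point.
    idx = int(np.argmin(np.abs(far - frr)))
    eer = (far[idx] + frr[idx]) / 2.0
    return float(eer), float(thresholds[idx])


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    eer, threshold = compute_eer([0.9, 0.8, 0.7, 0.35], [0.1, 0.2, 0.3, 0.4])
    print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real evaluation data the two error curves rarely cross exactly at an observed score, so scoring tools typically interpolate between neighbouring thresholds; the nearest-point average used here is a simplification.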
Pages: 13