An efficient low-perceptual environmental sound classification adversarial method based on GAN

Cited by: 0

Authors:

Zhang, Qiang [1]

Yang, Jibin [2]

Zhang, Xiongwei [2]

Cao, Tieyong [2]

Affiliations:

[1] Army Engn Univ, Grad Sch, Nanjing 210007, Peoples R China

[2] Army Engn Univ, Command & Control Engn Coll, Nanjing 210007, Peoples R China

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2024 / Vol. 83 / Issue 34

Funding:

National Natural Science Foundation of China

Keywords:

Environmental sound classification; Deep learning; Generative Adversarial Network; Short-time spectrum; Adversarial example

DOI:

10.1007/s11042-024-18318-5

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

By adding perturbations to real samples, adversarial examples can deceive deep neural networks. Although existing audio adversarial methods can successfully attack environmental sound classification (ESC) models, the perturbations they generate are easily perceived by humans. Moreover, the perturbations cannot be generated efficiently because the adversarial perturbation search space in audio is large. To address these problems, this paper proposes a Short-time Spectrum Generative Adversarial Network-based (StS-GAN) attack method to improve the performance of generated adversarial examples. In this method, a GAN generates magnitude spectrum perturbations with real signal magnitude spectra as inputs, and adversarial magnitude spectra are obtained as the superposition of the real signal magnitude spectra and the perturbations. Additionally, a short-time processing scheme is adopted to flexibly adjust the input length of the generator, balancing computational complexity against attack performance. Through adversarial training, StS-GAN learns to generate adversarial examples with temporal-spectral characteristics similar to those of real signals. The learned perturbations tend to have smaller energies, making them less noticeable and harder for human listeners to distinguish. Thorough experiments show that, compared to existing adversarial attack methods, the proposed method achieves a higher Attack Success Rate (ASR) and efficiency, and the generated perturbations are less likely to be perceived by humans. The average ASR reaches 97% while maintaining a mean energy ratio above 30 dB between the real signal and the generated perturbation, demonstrating the effectiveness of the proposed method.
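
The two figures quoted in the abstract, the Attack Success Rate and the signal-to-perturbation energy ratio in dB, can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so the code below assumes the common ones: the energy ratio is 10·log10 of signal energy over perturbation energy, and an untargeted attack succeeds when the predicted label differs from the true label. The function names `energy_ratio_db` and `attack_success_rate` are hypothetical and are not taken from the paper.

```python
import math


def energy_ratio_db(signal, perturbation):
    """Energy ratio in dB between a real signal and a perturbation.

    Assumed definition (not stated in the abstract):
    10 * log10(sum(signal^2) / sum(perturbation^2)).
    """
    e_sig = sum(s * s for s in signal)
    e_pert = sum(p * p for p in perturbation)
    if e_pert == 0:
        return math.inf   # no perturbation at all
    if e_sig == 0:
        return -math.inf  # silent signal, non-zero perturbation
    return 10.0 * math.log10(e_sig / e_pert)


def attack_success_rate(true_labels, adv_predictions):
    """Fraction of adversarial examples whose prediction differs from the
    true label (assumed untargeted-attack definition)."""
    if len(true_labels) != len(adv_predictions):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    fooled = sum(1 for t, p in zip(true_labels, adv_predictions) if t != p)
    return fooled / len(true_labels)


# Toy data for illustration only; these are not results from the paper.
signal = [1.0] * 100
perturbation = [0.01] * 100
ratio = energy_ratio_db(signal, perturbation)
asr = attack_success_rate([0, 1, 2, 3], [1, 1, 0, 0])
```

Under these assumed definitions, a perturbation whose amplitude is 1% of the signal amplitude gives a ratio of about 40 dB, so the 30 dB threshold reported in the abstract corresponds to a perturbation amplitude of roughly 3% of the signal amplitude.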


Pages: 80847 - 80872

Page count: 26

Related literature (50 items in total)

[31] SenAttack: adversarial attack method based on perturbation sensitivity and perceptual color distance
Sun, Jiaze
Long, Siyuan
Ma, Xianyan
[J]. APPLIED INTELLIGENCE, 2023, 53 (23) : 28937 - 28953
[32] Efficient deep neural network compression for environmental sound classification on microcontroller units
Chen, Shan
Meng, Na
Li, Haoyuan
Fang, Weiwei
[J]. TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2024, 32 (04)
[33] ESResNet: Environmental Sound Classification Based on Visual Domain Models
Guzhov, Andrey
Raue, Federico
Hees, Jorn
Dengel, Andreas
[J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021 : 4933 - 4940
[34] Environmental Sound Classification based on Time-frequency Representation
Thwe, Khine Zar
War, Nu
[J]. 2017 18TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNDP 2017), 2017 : 251 - 255
[35] Combining frame and segment based models for environmental sound classification
Hu, Pengfei
Liu, Wenju
Jiang, Wei
[J]. 13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3, 2012 : 2501 - 2504
[36] METRIC LEARNING BASED DATA AUGMENTATION FOR ENVIRONMENTAL SOUND CLASSIFICATION
Lu, Rui
Duan, Zhiyao
Zhang, Changshui
[J]. 2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2017 : 1 - 5
[37] Ensemble Classification Based on Feature Selection for Environmental Sound Recognition
Zhao, Shuai
Zhang, Yan
Xu, Haifeng
Han, Te
[J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019
[38] An efficient lung sound classification technique based on MFCC and HDMR
Arar, Mahmud Esad
Sedef, Herman
[J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (08) : 4385 - 4394
[39] An efficient lung sound classification technique based on MFCC and HDMR
Mahmud Esad Arar
Herman Sedef
[J]. Signal, Image and Video Processing, 2023, 17 : 4385 - 4394
[40] A Classification Method for Unrecognized Spatial Disorientation Based on Perceptual Process
Hao, Chenru
Fan, Xiaoya
Dong, Chunnan
Qiao, Lihua
Li, Xinwei
Li, Xiuyuan
Cheng, Li
Guo, Lisha
Zhao, Ruibin
[J]. IEEE ACCESS, 2020, 8 (08) : 140654 - 140660
