EnvGAN: a GAN-based augmentation to improve environmental sound classification

Cited by: 8
Authors
Madhu, Aswathy [1]
Suresh, K. [2]
Affiliations
[1] APJ Abdul Kalam Technol Univ, Coll Engn, Dept Elect, Trivandrum, Kerala, India
[2] APJ Abdul Kalam Technol Univ, Govt Engn Coll, Dept Elect, Wayanad, India
Keywords
Generative adversarial network; Environmental sound classification; Data augmentation; Convolutional neural network; Deep learning; UrbanSound8K; ESC-10; GENERATIVE ADVERSARIAL NETWORKS; RECOGNITION; SPEECH;
DOI
10.1007/s10462-022-10153-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Several deep learning algorithms have emerged for the automatic classification of environmental sounds. However, the non-availability of adequate labeled data for training limits the performance of these algorithms. Data augmentation is an appropriate solution to this problem. Generative Adversarial Networks (GANs) can successfully generate synthetic speech and sounds of musical instruments for classification applications. In this paper, we present a method for GAN-based augmentation in the context of environmental sound classification. We introduce an architecture named EnvGAN for the adversarial generation of environmental sounds. To validate the quality of the generated sounds, we have conducted subjective and objective evaluations. The results indicate that EnvGAN can produce samples of various domains with an acceptable target quality. We applied this augmentation technique on three benchmark ESC datasets (ESC-10, UrbanSound8K, and TUT Urban Acoustic Scenes development dataset) and used it for training a CNN-based classifier. Experimental results show that this new augmentation method can outperform a baseline method with no augmentation by a relatively wide margin (10-12% on ESC-10, 5-7% on UrbanSound8K, and 4-5% on TUT). In particular, the GAN-based approach reduces the confusion between all pairs of classes on UrbanSound8K. That is, the proposed method is especially suitable for handling class-imbalanced datasets.
Pages: 6301-6320
Page count: 20
Related papers
50 in total
  • [1] EnvGAN: a GAN-based augmentation to improve environmental sound classification
    Aswathy Madhu
    Suresh K.
    [J]. Artificial Intelligence Review, 2022, 55 : 6301 - 6320
  • [2] Gan-based data augmentation to improve breast ultrasound and mammography mass classification
    Jimenez-Gaona, Yuliana
    Carrion-Figueroa, Diana
    Lakshminarayanan, Vasudevan
    Rodriguez-Alvarez, Maria Jose
    [J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 94
  • [3] GAN-Based Data Augmentation For Improving The Classification Of EEG Signals
    Bhat, Sudhanva
    Hortal, Enrique
    [J]. THE 14TH ACM INTERNATIONAL CONFERENCE ON PERVASIVE TECHNOLOGIES RELATED TO ASSISTIVE ENVIRONMENTS, PETRA 2021, 2021, : 453 - 458
  • [4] METRIC LEARNING BASED DATA AUGMENTATION FOR ENVIRONMENTAL SOUND CLASSIFICATION
    Lu, Rui
    Duan, Zhiyao
    Zhang, Changshui
    [J]. 2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2017, : 1 - 5
  • [5] Combinatorial Adversarial Defense for Environmental Sound Classification Based on GAN
    Zhang, Qiang
    Yang, Jibin
    Zhang, Xiongwei
    Cao, Tieyong
    Li, Yihao
    [J]. Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2023, 45 (12): : 4399 - 4410
  • [6] Enhancement of Image Classification Using Transfer Learning and GAN-Based Synthetic Data Augmentation
    Chatterjee, Subhajit
    Hazra, Debapriya
    Byun, Yung-Cheol
    Kim, Yong-Woon
    [J]. MATHEMATICS, 2022, 10 (09)
  • [7] A Method of Environmental Sound Classification Based on Residual Networks and Data Augmentation
    Zeng, Jinfang
    Li, Youming
    Zhang, Yu
    Chen, Da
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2021, 20 (03)
  • [8] A GAN-Based Data Augmentation Method for Imbalanced Multi-Class Skin Lesion Classification
    Su, Qichen
    Hamed, Haza Nuzly Abdull
    Isa, Mohd Adham
    Hao, Xue
    Dai, Xin
    [J]. IEEE ACCESS, 2024, 12 : 16498 - 16513
  • [9] Improving Heart Rate Range Classification Using Doppler Radar with GAN-based Data Augmentation
    Yu, Danyuan
    Bouazizi, Mondher
    Ohtsuki, Tomoaki
    [J]. IEEE CONFERENCE ON GLOBAL COMMUNICATIONS, GLOBECOM, 2023, : 3885 - 3890
  • [10] GAN-based one dimensional medical data augmentation
    Ye Zhang
    Zhixiang Wang
    Zhen Zhang
    Junzhuo Liu
    Ying Feng
    Leonard Wee
    Andre Dekker
    Qiaosong Chen
    Alberto Traverso
    [J]. Soft Computing, 2023, 27 : 10481 - 10491