A Method of Environmental Sound Classification Based on Residual Networks and Data Augmentation

Cited by: 0
Authors
Zeng, Jinfang [1]
Li, Youming [1]
Zhang, Yu [1]
Chen, Da [1]
Affiliations
[1] Xiang Tan Univ, Sch Phys & Optoelect, Xiangtan 411105, Hunan, Peoples R China
Keywords
Environmental sound classification; residual networks; data augmentation;
DOI
10.1142/S1469026821500188
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Environmental sound classification (ESC) is a challenging problem due to the complexity of sounds. To date, a variety of signal processing and machine learning techniques have been applied to the ESC task, including matrix factorization, dictionary learning, wavelet filterbanks and deep neural networks. It has been observed that features extracted from deeper networks tend to achieve higher performance than those extracted from shallow networks. However, in the ESC task, only deep convolutional neural networks (CNNs) containing several layers have been used and residual networks have been ignored, which leads to degradation in performance. Meanwhile, a possible explanation for the limited exploration of CNNs and the difficulty of improving on simpler models is the relative scarcity of labeled data for ESC. In this paper, a residual network called EnvResNet is proposed for the ESC task. In addition, we propose to use audio data augmentation to overcome the problem of data scarcity. The experiments are performed on the ESC-50 database. Combined with data augmentation, the proposed model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches in terms of classification accuracy.
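The abstract does not say which augmentations were applied, so the following is only a minimal sketch of two transforms commonly used for audio data augmentation: a random circular time shift and additive white noise at a chosen signal-to-noise ratio. The function names (time_shift, add_noise, augment), the default parameters, and the synthetic test tone are all assumptions made for illustration, not the authors' method. Plain Python lists are used to keep the sketch free of dependencies.

```python
import math
import random


def time_shift(wave, max_fraction, rng):
    """Circularly shift a waveform by a random number of samples.

    The offset is drawn uniformly from +/- max_fraction of the clip length.
    (Hypothetical helper; the shift range is an assumed parameter.)
    """
    if not wave:
        return []
    limit = int(len(wave) * max_fraction)
    offset = rng.randint(-limit, limit) % len(wave)
    # Moving the last `offset` samples to the front delays the signal.
    return wave[-offset:] + wave[:-offset] if offset else list(wave)


def add_noise(wave, snr_db, rng):
    """Add white Gaussian noise so that the result has roughly snr_db dB SNR."""
    signal_power = sum(x * x for x in wave) / len(wave)
    noise_std = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    return [x + rng.gauss(0.0, noise_std) for x in wave]


def augment(wave, rng, max_fraction=0.1, snr_db=20.0):
    """Produce one augmented copy: random shift followed by additive noise."""
    return add_noise(time_shift(wave, max_fraction, rng), snr_db, rng)


# Synthetic one-second 440 Hz tone standing in for an environmental recording.
sample_rate = 8000
clip = [math.sin(2.0 * math.pi * 440.0 * n / sample_rate)
        for n in range(sample_rate)]

rng = random.Random(0)
augmented = augment(clip, rng)
```

Each call to augment returns a new clip of the same length, so several perturbed copies of every labeled recording can be generated to enlarge a small training set while the class label stays unchanged.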
Pages: 10
Related papers
50 in total
  • [1] Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification
    Salamon, Justin
    Bello, Juan Pablo
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (03) : 279 - 283
  • [2] METRIC LEARNING BASED DATA AUGMENTATION FOR ENVIRONMENTAL SOUND CLASSIFICATION
    Lu, Rui
    Duan, Zhiyao
    Zhang, Changshui
    2017 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2017, : 1 - 5
  • [3] Environmental Sound Classification using Deep Convolutional Neural Networks and Data Augmentation
    Davis, Nithya
    Suresh, K.
    2018 IEEE RECENT ADVANCES IN INTELLIGENT COMPUTATIONAL SYSTEMS (RAICS), 2018, : 41 - 45
  • [4] Data augmentation guided knowledge distillation for environmental sound classification
    Tripathi, Achyut Mani
    Paul, Konark
    NEUROCOMPUTING, 2022, 489 : 59 - 77
  • [5] Spectral images based environmental sound classification using CNN with meaningful data augmentation
    Mushtaq, Zohaib
    Su, Shun-Feng
    Quoc-Viet Tran
    APPLIED ACOUSTICS, 2021, 172
  • [6] Data Augmentation Using Generative Adversarial Network for Environmental Sound Classification
    Madhu, Aswathy
    Kumaraswamy, Suresh
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [7] EnvGAN: a GAN-based augmentation to improve environmental sound classification
    Aswathy Madhu
    Suresh K.
    Artificial Intelligence Review, 2022, 55 : 6301 - 6320
  • [8] EnvGAN: a GAN-based augmentation to improve environmental sound classification
    Madhu, Aswathy
    Suresh, K.
    ARTIFICIAL INTELLIGENCE REVIEW, 2022, 55 (08) : 6301 - 6320
  • [9] PCGmix: A Data-Augmentation Method for Heart-Sound Classification
    Susic, David
    Gradisek, Anton
    Gams, Matjaz
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (11) : 6874 - 6885
  • [10] Data Augmentation and the Improvement of the Performance of Convolutional Neural Networks for Heart Sound Classification
    Takezaki, Shumpei
    Kishida, Kazuya
    IAENG International Journal of Computer Science, 2022, 49 (04)