An Ensemble of Convolutional Neural Networks for Audio Classification

Cited by: 47
Authors
Nanni, Loris [1]
Maguolo, Gianluca [1]
Brahnam, Sheryl [2]
Paci, Michelangelo [3]
Affiliations
[1] Univ Padua, Dept Informat Engn, I-35122 Padua, Italy
[2] Missouri State Univ, Dept Informat Technol & Cybersecur, Springfield, MO 65804 USA
[3] Tampere Univ, Fac Med & Hlth Technol, BioMediTech, Arvo Ylpon Katu 34, FI-33520 Tampere, Finland
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 13
Keywords
audio classification; data augmentation; ensemble of classifiers; pattern recognition; TIME-SCALE MODIFICATION; TEXTURE CLASSIFICATION; ACOUSTIC FEATURES; DATA AUGMENTATION
DOI
10.3390/app11135796
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Research in sound classification and recognition is rapidly advancing in the field of pattern recognition. One important area in this field is environmental sound recognition, whether it concerns the identification of endangered species in different habitats or the type of interfering noise in urban environments. Since environmental audio datasets are often limited in size, a robust model able to perform well across different datasets is of strong research interest. In this paper, ensembles of classifiers are combined that exploit six data augmentation techniques and four signal representations for retraining five pre-trained convolutional neural networks (CNNs); these ensembles are tested on three freely available environmental audio benchmark datasets: (i) bird calls, (ii) cat sounds, and (iii) the Environmental Sound Classification (ESC-50) database for identifying sources of noise in environments. To the best of our knowledge, this is the most extensive study investigating ensembles of CNNs for audio classification. The best-performing ensembles are compared and shown to either outperform or perform comparably to the best methods reported in the literature on these datasets, including on the challenging ESC-50 dataset. We obtained 97% accuracy on the bird dataset, 90.51% on the cat dataset, and 88.65% on ESC-50 using different approaches. In addition, the same ensemble model trained on the three datasets managed to reach the same results on the bird and cat datasets while losing only 0.1% on ESC-50. Thus, we have managed to create an off-the-shelf ensemble that can be trained on different datasets and reach performance competitive with the state of the art.
Pages: 18
Related papers
50 in total
  • [1] Ensemble of convolutional neural networks to improve animal audio classification
    Loris Nanni
    Yandre M. G. Costa
    Rafael L. Aguiar
    Rafael B. Mangolin
    Sheryl Brahnam
    Carlos N. Silla
    EURASIP Journal on Audio, Speech, and Music Processing, 2020
  • [2] Ensemble of convolutional neural networks to improve animal audio classification
    Nanni, Loris
    Costa, Yandre M. G.
    Aguiar, Rafael L.
    Mangolin, Rafael B.
    Brahnam, Sheryl
    Silla Jr, Carlos N.
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2020, 2020 (01)
  • [3] Reliable Classification with Ensemble Convolutional Neural Networks
    Gao, Zhen
    Zhang, Han
    Wei, Xiaohui
    Yan, Tong
    Guo, Kangkang
    Li, Wenshuo
    Wang, Yu
    Reviriego, Pedro
    2020 33RD IEEE INTERNATIONAL SYMPOSIUM ON DEFECT AND FAULT TOLERANCE IN VLSI AND NANOTECHNOLOGY SYSTEMS (DFT), 2020
  • [4] Ensemble of convolutional neural networks for bioimage classification
    Nanni, Loris
    Ghidon, Stefano
    Brahnam, Sheryl
    APPLIED COMPUTING AND INFORMATICS, 2021, 17 (01) : 19 - 35
  • [5] Audio classification using braided convolutional neural networks
    Sinha, Harsh
    Awasthi, Vinayak
    Ajmera, Pawan K.
    IET SIGNAL PROCESSING, 2020, 14 (07) : 448 - 454
  • [6] A Convolutional Neural Networks Approach to Audio Classification for Rainfall Estimation
    Avanzato, Roberta
    Beritelli, Francesco
    Di Franco, Francesco
    Puglisi, Valerio Francesco
    PROCEEDINGS OF THE 2019 10TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS - TECHNOLOGY AND APPLICATIONS (IDAACS), VOL. 1, 2019, : 285 - 289
  • [7] Hydra: An Ensemble of Convolutional Neural Networks for Geospatial Land Classification
    Minetto, Rodrigo
    Segundo, Mauricio Pamplona
    Sarkar, Sudeep
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (09) : 6530 - 6541
  • [8] An Ensemble of Convolutional Neural Networks for Image Classification Based on LSTM
    Chen, JingLin
    Wang, YiLei
    Wu, YingJie
    Cai, ChaoQuan
    2017 INTERNATIONAL CONFERENCE ON GREEN INFORMATICS (ICGI), 2017, : 217 - 222
  • [9] Ensemble Convolutional Neural Networks for Cell Classification in Microscopic Images
    Shi, Tian
    Wu, Longshi
    Zhong, Changhong
    Wang, Ruixuan
    Zheng, Weishi
    ISBI 2019 C-NMC CHALLENGE: CLASSIFICATION IN CANCER CELL IMAGING, 2019, : 43 - 51
  • [10] Multiscale ensemble of convolutional neural networks for skin lesion classification
    Liu, Yi-Peng
    Wang, Ziming
    Li, Zhanqing
    Li, Jing
    Li, Ting
    Chen, Peng
    Liang, Ronghua
    IET IMAGE PROCESSING, 2021, 15 (10) : 2309 - 2318