Audio classification using braided convolutional neural networks

Cited by: 15
Authors
Sinha, Harsh [1]
Awasthi, Vinayak [2 ]
Ajmera, Pawan K. [2 ]
Affiliations
[1] Birla Inst Technol & Sci Pilani, Dept Comp Sci & Informat Syst, Pilani 333031, Rajasthan, India
[2] Birla Inst Technol & Sci Pilani, Dept Elect & Elect Engn, Pilani 333031, Rajasthan, India
Keywords
hidden Markov models; neurophysiology; image classification; learning (artificial intelligence); convolutional neural nets; image representation; braided convolutional neural network; deep neural networks; CNN-based neural architecture; audio classification tasks; CNN architecture; deep learning architectures; sparse representation; receptive neurons; primary auditory cortex; Google Speech Commands datasets; UrbanSound8K dataset; spectrogram images; FEATURES; RECOGNITION;
DOI
10.1049/iet-spr.2019.0381
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract
Convolutional neural networks (CNNs) work surprisingly well and have helped drastically enhance the state-of-the-art techniques in the domain of image classification. This unprecedented success motivated the application of CNNs to the domain of auditory data. Recent publications suggest hidden Markov models and deep neural networks for audio classification. This study aims to achieve audio classification by representing audio as spectrogram images and then using a CNN-based architecture for classification. It presents an innovative strategy for a CNN-based neural architecture that learns a sparse representation imitating the receptive neurons in the primary auditory cortex in mammals. The feasibility of the proposed architecture is assessed for audio classification tasks on standard benchmark datasets: the Google Speech Commands datasets (GSCv1 and GSCv2) and the UrbanSound8K dataset (US8K). The proposed CNN architecture, referred to as the braided convolutional neural network, achieves 97.15%, 95% and 91.9% average recognition accuracy on the GSCv1, GSCv2 and US8K datasets, respectively, outperforming other deep learning architectures.
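The abstract's first step, representing audio as a spectrogram image, can be illustrated with a minimal sketch. It is not the authors' pipeline: the record gives no preprocessing details, so the frame length (400 samples), hop size (160 samples), Hann window, and 16 kHz sampling rate below are all assumptions, and `log_spectrogram` is a hypothetical helper name. The braided architecture itself is not described in this record and is not reproduced here.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160, eps=1e-10):
    """Return a log-magnitude spectrogram, shape (n_frames, frame_len // 2 + 1).

    frame_len and hop are illustrative assumptions, not values from the paper.
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Zero-pad very short clips so at least one full frame exists
    if signal.size < frame_len:
        signal = np.pad(signal, (0, frame_len - signal.size))
    n_frames = 1 + (signal.size - frame_len) // hop
    # Index matrix: one row of sample indices per overlapping frame
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    # Magnitude of the one-sided FFT of each windowed frame
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + eps)

# Example: one second of a 1 kHz tone at an assumed 16 kHz sampling rate
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = log_spectrogram(tone)
# Frequency bin with the highest average energy across frames
peak_bin = int(spec.mean(axis=0).argmax())
```

The resulting 2-D array can be treated like a single-channel image and passed to any CNN classifier. With these assumed settings each frequency bin spans 40 Hz, so the 1 kHz tone concentrates its energy around bin 25.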
Pages: 448-454
Page count: 7
Related papers
50 in total
  • [1] An Ensemble of Convolutional Neural Networks for Audio Classification
    Nanni, Loris
    Maguolo, Gianluca
    Brahnam, Sheryl
    Paci, Michelangelo
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (13)
  • [2] Ensemble of convolutional neural networks to improve animal audio classification
    Loris Nanni
    Yandre M. G. Costa
    Rafael L. Aguiar
    Rafael B. Mangolin
    Sheryl Brahnam
    Carlos N. Silla
    [J]. EURASIP Journal on Audio, Speech, and Music Processing, 2020
  • [3] A Convolutional Neural Networks Approach to Audio Classification for Rainfall Estimation
    Avanzato, Roberta
    Beritelli, Francesco
    Di Franco, Francesco
    Puglisi, Valerio Francesco
    [J]. PROCEEDINGS OF THE 2019 10TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS - TECHNOLOGY AND APPLICATIONS (IDAACS), VOL. 1, 2019, : 285 - 289
  • [4] Ensemble of convolutional neural networks to improve animal audio classification
    Nanni, Loris
    Costa, Yandre M. G.
    Aguiar, Rafael L.
    Mangolin, Rafael B.
    Brahnam, Sheryl
    Silla Jr, Carlos N.
    [J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2020, 2020 (01)
  • [5] Audio Signals Encoding for Cough Classification Using Convolutional Neural Networks: A Comparative Study
    Wang, Hui-Hui
    Liu, Jia-Ming
    You, Mingyu
    Li, Guo-Zheng
    [J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 442 - 445
  • [6] What Affects the Performance of Convolutional Neural Networks for Audio Event Classification
    Wang, Helin
    Chong, Dading
    Huang, Dongyan
    Zou, Yuexian
    [J]. 2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION WORKSHOPS AND DEMOS (ACIIW), 2019, : 140 - 146
  • [7] Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks
    Elias, Noel
    [J]. 2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 693 - 698
  • [8] Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks
    Sharan, Roneel V.
    Xiong, Hao
    Berkovsky, Shlomo
    [J]. SENSORS, 2021, 21 (10)
  • [9] "Seeing Sound": Audio Classification Using the Wigner-Ville Distribution and Convolutional Neural Networks
    Marios, Christonasis Antonios
    van Eijndhoven, Stef
    Duin, Peter
    [J]. INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, INTELLISYS 2023, 2024, 822 : 145 - 155
  • [10] Sound Classification Using Convolutional Neural Networks
    Jaiswal, Kaustumbh
    Patel, Dhairya Kalpeshbhai
    [J]. 2018 SEVENTH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING IN EMERGING MARKETS (CCEM), 2018, : 81 - 84