Audio classification using braided convolutional neural networks

Cited by: 15

Authors:

Sinha, Harsh [1]

Awasthi, Vinayak [2]

Ajmera, Pawan K. [2]

Affiliations:

[1] Birla Inst Technol & Sci Pilani, Dept Comp Sci & Informat Syst, Pilani 333031, Rajasthan, India

[2] Birla Inst Technol & Sci Pilani, Dept Elect & Elect Engn, Pilani 333031, Rajasthan, India

Source:

IET SIGNAL PROCESSING | 2020 / Vol. 14 / Issue 07

Keywords:

hidden Markov models; neurophysiology; image classification; learning (artificial intelligence); convolutional neural nets; image representation; braided convolutional neural network; deep neural networks; CNN-based neural architecture; audio classification tasks; CNN architecture; deep learning architectures; sparse representation; receptive neurons; primary auditory cortex; Google Speech Commands datasets; UrbanSound8K dataset; spectrogram images; FEATURES; RECOGNITION;

DOI:

10.1049/iet-spr.2019.0381

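Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification code: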

0808 ; 0809 ;

Abstract:

Convolutional neural networks (CNNs) work surprisingly well and have helped drastically enhance the state-of-the-art techniques in the domain of image classification. This unprecedented success motivated the application of CNNs to the domain of auditory data. Recent publications suggest hidden Markov models and deep neural networks for audio classification. This study aims to achieve audio classification by representing audio as spectrogram images and then using a CNN-based architecture for classification. This study presents an innovative strategy for a CNN-based neural architecture that learns a sparse representation imitating the receptive neurons in the primary auditory cortex in mammals. The feasibility of the proposed CNN-based neural architecture is assessed for audio classification tasks on standard benchmark datasets such as the Google Speech Commands datasets (GSCv1 and GSCv2) and the UrbanSound8K dataset (US8K). The proposed CNN architecture, referred to as braided convolutional neural network, achieves 97.15, 95 and 91.9% average recognition accuracy on the GSCv1, GSCv2 and US8K datasets, respectively, outperforming other deep learning architectures.

Pages: 448-454

Number of pages: 7

50 items in total

[1] An Ensemble of Convolutional Neural Networks for Audio Classification
Nanni, Loris
Maguolo, Gianluca
Brahnam, Sheryl
Paci, Michelangelo
[J]. APPLIED SCIENCES-BASEL, 2021, 11 (13)
[2] Ensemble of convolutional neural networks to improve animal audio classification
Loris Nanni
Yandre M. G. Costa
Rafael L. Aguiar
Rafael B. Mangolin
Sheryl Brahnam
Carlos N. Silla
[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2020
[3] A Convolutional Neural Networks Approach to Audio Classification for Rainfall Estimation
Avanzato, Roberta
Beritelli, Francesco
Di Franco, Francesco
Puglisi, Valerio Francesco
[J]. PROCEEDINGS OF THE 2019 10TH IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS - TECHNOLOGY AND APPLICATIONS (IDAACS), VOL. 1, 2019, : 285 - 289
[4] Ensemble of convolutional neural networks to improve animal audio classification
Nanni, Loris
Costa, Yandre M. G.
Aguiar, Rafael L.
Mangolin, Rafael B.
Brahnam, Sheryl
Silla Jr, Carlos N.
[J]. EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2020, 2020 (01)
[5] Audio Signals Encoding for Cough Classification Using Convolutional Neural Networks: A Comparative Study
Wang, Hui-Hui
Liu, Jia-Ming
You, Mingyu
Li, Guo-Zheng
[J]. PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 442 - 445
[6] What Affects the Performance of Convolutional Neural Networks for Audio Event Classification
Wang, Helin
Chong, Dading
Huang, Dongyan
Zou, Yuexian
[J]. 2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION WORKSHOPS AND DEMOS (ACIIW), 2019, : 140 - 146
[7] Audio Classification of Low Feature Spectrograms Utilizing Convolutional Neural Networks
Elias, Noel
[J]. 2022 21ST IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS, ICMLA, 2022, : 693 - 698
[8] Benchmarking Audio Signal Representation Techniques for Classification with Convolutional Neural Networks
Sharan, Roneel V.
Xiong, Hao
Berkovsky, Shlomo
[J]. SENSORS, 2021, 21 (10)
[9] "Seeing Sound": Audio Classification Using the Wigner-Ville Distribution and Convolutional Neural Networks
Marios, Christonasis Antonios
van Eijndhoven, Stef
Duin, Peter
[J]. INTELLIGENT SYSTEMS AND APPLICATIONS, VOL 1, INTELLISYS 2023, 2024, 822 : 145 - 155
[10] Sound Classification Using Convolutional Neural Networks
Jaiswal, Kaustumbh
Patel, Dhairya Kalpeshbhai
[J]. 2018 SEVENTH IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING IN EMERGING MARKETS (CCEM), 2018, : 81 - 84