A Robust Framework For Acoustic Scene Classification

Cited by: 19

Authors:

Lam Pham [1]

McLoughlin, Ian [1]

Huy Phan [1]

Palaniappan, Ramaswamy [1]

Affiliation:

[1] Univ Kent, Sch Comp, Medway, Kent, England

Source:

INTERSPEECH 2019 | 2019

Keywords:

Machine hearing; acoustic scene classification; convolutional neural network; deep neural network; spectrogram; log-Mel; Gammatone filter; constant Q transform;

DOI:

10.21437/Interspeech.2019-1841

Chinese Library Classification (CLC) number:

R36 [Pathology]; R76 [Otorhinolaryngology];

Discipline classification code:

100104 ; 100213 ;

Abstract:

Acoustic scene classification (ASC) using front-end time-frequency features and back-end neural network classifiers has demonstrated good performance in recent years. However, a profusion of systems has arisen to suit different tasks and datasets, utilising different feature and classifier types. This paper aims at a robust framework that can explore and utilise a range of different time-frequency features and neural networks, either singly or merged, to achieve good classification performance. In particular, we exploit three different types of front-end time-frequency feature: log energy Mel filter, Gammatone filter and constant Q transform. At the back-end we evaluate an effective two-stage model that exploits a Convolutional Neural Network for pre-trained feature extraction, followed by Deep Neural Network classifiers as a post-trained feature adaptation model and classifier. We also explore the use of a data augmentation technique for these features that effectively generates a variety of intermediate data, reinforcing model learning abilities, particularly for marginal cases. We assess performance on the DCASE2016 dataset, demonstrating good classification accuracies exceeding 90%, significantly outperforming the DCASE2016 baseline and remaining highly competitive with state-of-the-art systems.
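
The abstract mentions a data augmentation technique that generates intermediate data between training examples, but does not name it. The sketch below assumes a mixup-style scheme as one well-known way to do this: each spectrogram patch is blended with a randomly chosen partner from the same batch, and the one-hot labels are blended with the same weight. The function name `mixup_batch`, the `alpha` value and the array shapes are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def mixup_batch(features, labels, alpha=0.4, rng=None):
    """Blend each example with a random partner from the same batch.

    features: array of shape (batch, ...), e.g. (batch, freq, time) patches
              from a log-Mel, Gammatone or CQT spectrogram
    labels:   one-hot array of shape (batch, num_classes)
    alpha:    Beta(alpha, alpha) concentration (illustrative value, assumed)
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = features.shape[0]

    # One mixing weight per example, drawn from a Beta distribution.
    lam = rng.beta(alpha, alpha, size=batch)
    # Random partner index for every example in the batch.
    partner = rng.permutation(batch)

    # Reshape the weights so they broadcast over the feature dimensions.
    lam_x = lam.reshape((batch,) + (1,) * (features.ndim - 1))
    lam_y = lam.reshape(batch, 1)

    mixed_x = lam_x * features + (1.0 - lam_x) * features[partner]
    mixed_y = lam_y * labels + (1.0 - lam_y) * labels[partner]
    return mixed_x, mixed_y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical batch: 8 patches of 128 x 128 bins, 15 scene classes.
    x = rng.normal(size=(8, 128, 128))
    y = np.eye(15)[rng.integers(0, 15, size=8)]
    mixed_x, mixed_y = mixup_batch(x, y, rng=rng)
    print(mixed_x.shape, mixed_y.shape)
```

Because the labels are mixed with the same weight as the inputs, each blended target still sums to one, so a classifier trained on it sees soft targets that lie between two scene classes. This is one plausible reading of how intermediate data could help with marginal cases.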

Pages: 3634-3638

Page count: 5
