A Robust Framework For Acoustic Scene Classification

Cited by: 19
Authors
Lam Pham [1 ]
McLoughlin, Ian [1 ]
Huy Phan [1 ]
Palaniappan, Ramaswamy [1 ]
Affiliations
[1] Univ Kent, Sch Comp, Medway, Kent, England
Source
INTERSPEECH 2019 | 2019
Keywords
Machine hearing; acoustic scene classification; convolutional neural network; deep neural network; spectrogram; log-Mel; Gammatone filter; constant Q transform;
DOI
10.21437/Interspeech.2019-1841
CLC classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification
100104 ; 100213 ;
Abstract
Acoustic scene classification (ASC) using front-end time-frequency features and back-end neural network classifiers has demonstrated good performance in recent years. However, a profusion of systems has arisen to suit different tasks and datasets, utilising different feature and classifier types. This paper aims at a robust framework that can explore and utilise a range of different time-frequency features and neural networks, either singly or merged, to achieve good classification performance. In particular, we exploit three different types of front-end time-frequency feature: log-energy Mel filter, Gammatone filter and constant Q transform. At the back-end, we evaluate an effective two-stage model that exploits a Convolutional Neural Network for pre-trained feature extraction, followed by Deep Neural Network classifiers as a post-trained feature adaptation model and classifier. We also explore the use of a data augmentation technique for these features that effectively generates a variety of intermediate data, reinforcing model learning abilities, particularly for marginal cases. We assess performance on the DCASE2016 dataset, demonstrating good classification accuracies exceeding 90%, significantly outperforming the DCASE2016 baseline and remaining highly competitive with state-of-the-art systems.
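The augmentation described in the abstract generates "intermediate data" by blending training examples, in the spirit of mixup-style interpolation of feature/label pairs. The sketch below illustrates that idea on spectrogram-shaped arrays; the function name, the Beta-distributed mixing coefficient, and the `alpha` parameter are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two time-frequency feature maps (e.g. log-Mel spectrograms)
    and their one-hot labels to create an intermediate training example.
    NOTE: a generic mixup-style sketch, not the paper's exact scheme."""
    if rng is None:
        rng = np.random.default_rng()
    # Mixing weight drawn from a symmetric Beta distribution.
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2   # interpolate features
    y = lam * y1 + (1.0 - lam) * y2   # interpolate labels (soft targets)
    return x, y

# Example: blend two 64-band spectrograms from different scene classes.
rng = np.random.default_rng(0)
spec_a = rng.random((64, 431))        # hypothetical log-Mel, class "park"
spec_b = rng.random((64, 431))        # hypothetical log-Mel, class "metro"
lab_a = np.array([1.0, 0.0])
lab_b = np.array([0.0, 1.0])
spec_mix, lab_mix = mixup(spec_a, lab_a, spec_b, lab_b, rng=rng)
```

Because the blended label is a convex combination of the two one-hot vectors, the classifier is trained with soft targets on such intermediate examples, which is one plausible reading of how marginal cases between scene classes are reinforced.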
Pages: 3634 - 3638
Page count: 5
Related Papers
50 records
  • [31] Deep Scalogram Representations for Acoustic Scene Classification
    Ren, Zhao
    Qian, Kun
    Zhang, Zixing
    Pandit, Vedhas
    Baird, Alice
    Schuller, Björn
    IEEE/CAA JOURNAL OF AUTOMATICA SINICA, 2018, 5 (03) : 662 - 669
  • [32] Deep semantic learning for acoustic scene classification
    Shao, Yun-Fei
    Ma, Xin-Xin
    Ma, Yong
    Zhang, Wei-Qiang
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
  • [33] Deep Segment Model for Acoustic Scene Classification
    Wang, Yajian
    Du, Jun
    Chen, Hang
    Wang, Qing
    Lee, Chin-Hui
    INTERSPEECH 2022, 2022, : 4177 - 4181
  • [34] A survey on preprocessing and classification techniques for acoustic scene
    Singh, Vikash Kumar
    Sharma, Kalpana
    Sur, Samarendra Nath
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 229
  • [35] Acoustic Scene Classification from Few Examples
    Bocharov, Ivan
    Tjalkens, Tjalling
    de Vries, Bert
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 862 - 866
  • [36] RQNet: Residual Quaternion CNN for Performance Enhancement in Low Complexity and Device Robust Acoustic Scene Classification
    Madhu, Aswathy
    Suresh, K.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8780 - 8792
  • [37] Convolutional Recurrent Neural Network with Auxiliary Stream for Robust Variable-Length Acoustic Scene Classification
    Choi, Won-Gook
    Chang, Joon-Hyuk
    INTERSPEECH 2022, 2022, : 2418 - 2422
  • [38] A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models
    Bai, Xue
    Du, Jun
    Wang, Zi-Rui
    Lee, Chin-Hui
    INTERSPEECH 2019, 2019, : 3619 - 3623
  • [39] Instance-level loss based multiple-instance learning framework for acoustic scene classification
    Choi, Won-Gook
    Chang, Joon-Hyuk
    Yang, Jae-Mo
    Moon, Han-Gil
    APPLIED ACOUSTICS, 2024, 216
  • [40] A hierarchical learning framework for seafloor scene classification
    Chen, Genlang
    Lai, Chengang
    Huang, Miaoqing
    Song, Guanghui
    INDIAN JOURNAL OF GEO-MARINE SCIENCES, 2017, 46 (07) : 1352 - 1357