A Robust Framework For Acoustic Scene Classification

Cited by: 19
Authors
Lam Pham [1 ]
McLoughlin, Ian [1 ]
Huy Phan [1 ]
Palaniappan, Ramaswamy [1 ]
Affiliations
[1] Univ Kent, Sch Comp, Medway, Kent, England
Source
INTERSPEECH 2019 | 2019
Keywords
Machine hearing; acoustic scene classification; convolutional neural network; deep neural network; spectrogram; log-Mel; Gammatone filter; constant Q transform;
DOI
10.21437/Interspeech.2019-1841
CLC classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Subject classification
100104 ; 100213 ;
Abstract
Acoustic scene classification (ASC) using front-end time-frequency features and back-end neural network classifiers has demonstrated good performance in recent years. However, a profusion of systems has arisen to suit different tasks and datasets, utilising different feature and classifier types. This paper aims at a robust framework that can explore and utilise a range of different time-frequency features and neural networks, either singly or merged, to achieve good classification performance. In particular, we exploit three different types of front-end time-frequency feature: log-energy Mel filter, Gammatone filter and constant Q transform. At the back-end, we evaluate an effective two-stage model that exploits a Convolutional Neural Network for pre-trained feature extraction, followed by Deep Neural Network classifiers as a post-trained feature adaptation model and classifier. We also explore the use of a data augmentation technique for these features that effectively generates a variety of intermediate data, reinforcing model learning abilities, particularly for marginal cases. We assess performance on the DCASE2016 dataset, demonstrating good classification accuracies exceeding 90%, significantly outperforming the DCASE2016 baseline and remaining highly competitive with state-of-the-art systems.
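The augmentation described in the abstract generates "intermediate data" by blending training examples, in the spirit of mixup-style interpolation of feature/label pairs. The sketch below illustrates that idea on spectrogram-shaped arrays; the function name, the Beta-distributed mixing coefficient, and the `alpha` parameter are illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Blend two time-frequency feature maps (e.g. log-Mel spectrograms)
    and their one-hot labels to create an intermediate training example.
    NOTE: a generic mixup-style sketch, not the paper's exact scheme."""
    if rng is None:
        rng = np.random.default_rng()
    # Mixing weight drawn from a symmetric Beta distribution.
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2   # interpolate features
    y = lam * y1 + (1.0 - lam) * y2   # interpolate labels (soft targets)
    return x, y

# Example: blend two 64-band spectrograms from different scene classes.
rng = np.random.default_rng(0)
spec_a = rng.random((64, 431))        # hypothetical log-Mel, class "park"
spec_b = rng.random((64, 431))        # hypothetical log-Mel, class "metro"
lab_a = np.array([1.0, 0.0])
lab_b = np.array([0.0, 1.0])
spec_mix, lab_mix = mixup(spec_a, lab_a, spec_b, lab_b, rng=rng)
```

Because the blended label is a convex combination of the two one-hot vectors, the classifier is trained with soft targets on such intermediate examples, which is one plausible reading of how marginal cases between scene classes are reinforced.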
Pages: 3634 - 3638
Page count: 5
Related Papers
50 records
  • [31] Deep Scalogram Representations for Acoustic Scene Classification
    Ren, Zhao
    Qian, Kun
    Zhang, Zixing
    Pandit, Vedhas
    Baird, Alice
    Schuller, Björn
    IEEE/CAA JOURNAL OF AUTOMATICA SINICA, 2018, 5 (03) : 662 - 669
  • [32] Deep semantic learning for acoustic scene classification
    Shao, Yun-Fei
    Ma, Xin-Xin
    Ma, Yong
    Zhang, Wei-Qiang
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2024, 2024 (01)
  • [33] Deep Segment Model for Acoustic Scene Classification
    Wang, Yajian
    Du, Jun
    Chen, Hang
    Wang, Qing
    Lee, Chin-Hui
    INTERSPEECH 2022, 2022, : 4177 - 4181
  • [34] A survey on preprocessing and classification techniques for acoustic scene
    Singh, Vikash Kumar
    Sharma, Kalpana
    Sur, Samarendra Nath
    EXPERT SYSTEMS WITH APPLICATIONS, 2023, 229
  • [35] Acoustic Scene Classification from Few Examples
    Bocharov, Ivan
    Tjalkens, Tjalling
    de Vries, Bert
    2018 26TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2018, : 862 - 866
  • [36] RQNet: Residual Quaternion CNN for Performance Enhancement in Low Complexity and Device Robust Acoustic Scene Classification
    Madhu, Aswathy
    Suresh, K.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8780 - 8792
  • [37] Convolutional Recurrent Neural Network with Auxiliary Stream for Robust Variable-Length Acoustic Scene Classification
    Choi, Won-Gook
    Chang, Joon-Hyuk
    INTERSPEECH 2022, 2022, : 2418 - 2422
  • [38] A Hybrid Approach to Acoustic Scene Classification Based on Universal Acoustic Models
    Bai, Xue
    Du, Jun
    Wang, Zi-Rui
    Lee, Chin-Hui
    INTERSPEECH 2019, 2019, : 3619 - 3623
  • [39] Instance-level loss based multiple-instance learning framework for acoustic scene classification
    Choi, Won-Gook
    Chang, Joon-Hyuk
    Yang, Jae-Mo
    Moon, Han-Gil
    APPLIED ACOUSTICS, 2024, 216
  • [40] A hierarchical learning framework for seafloor scene classification
    Chen, Genlang
    Lai, Chengang
    Huang, Miaoqing
    Song, Guanghui
    INDIAN JOURNAL OF GEO-MARINE SCIENCES, 2017, 46 (07) : 1352 - 1357