A Robust Framework For Acoustic Scene Classification

Cited by: 19
Authors
Lam Pham [1]
McLoughlin, Ian [1]
Huy Phan [1]
Palaniappan, Ramaswamy [1]
Affiliations
[1] Univ Kent, Sch Comp, Medway, Kent, England
Source
INTERSPEECH 2019 | 2019
Keywords
Machine hearing; acoustic scene classification; convolutional neural network; deep neural network; spectrogram; log-Mel; Gammatone filter; constant Q transform;
DOI
10.21437/Interspeech.2019-1841
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification codes
100104; 100213
Abstract
Acoustic scene classification (ASC) using front-end time-frequency features and back-end neural network classifiers has demonstrated good performance in recent years. However, a profusion of systems has arisen to suit different tasks and datasets, utilising different feature and classifier types. This paper aims at a robust framework that can explore and utilise a range of different time-frequency features and neural networks, either singly or merged, to achieve good classification performance. In particular, we exploit three different types of front-end time-frequency feature: log-energy Mel filter, Gammatone filter and constant Q transform. At the back-end, we evaluate the effectiveness of a two-stage model that exploits a Convolutional Neural Network for pre-trained feature extraction, followed by Deep Neural Network classifiers as a post-trained feature adaptation model and classifier. We also explore the use of a data augmentation technique for these features that effectively generates a variety of intermediate data, reinforcing model learning abilities, particularly for marginal cases. We assess performance on the DCASE2016 dataset, demonstrating good classification accuracies exceeding 90%, significantly outperforming the DCASE2016 baseline and remaining highly competitive with state-of-the-art systems.
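The abstract does not name the augmentation technique it uses. As an illustration only, the sketch below assumes a mixup-style scheme, one common way to generate "intermediate" training data by blending two spectrograms and their labels with a random weight. The function name `mixup_pair`, the `alpha` value, and the list-of-rows spectrogram layout are assumptions made for this example and are not taken from the paper.

```python
import random


def mixup_pair(spec_a, label_a, spec_b, label_b, alpha=0.4, rng=None):
    """Blend two spectrograms and their one-hot labels (mixup-style sketch).

    spec_a, spec_b: equally sized spectrograms as lists of rows of floats.
    label_a, label_b: one-hot (or soft) label vectors of equal length.
    alpha: Beta distribution parameter; hypothetical default, not from the paper.
    """
    rng = rng or random.Random()
    # Mixing weight drawn from Beta(alpha, alpha); always lies in [0, 1].
    lam = rng.betavariate(alpha, alpha)
    # Element-wise convex combination of the two time-frequency images.
    mixed_spec = [
        [lam * a + (1.0 - lam) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(spec_a, spec_b)
    ]
    # The labels are blended with the same weight, giving a soft target.
    mixed_label = [lam * a + (1.0 - lam) * b for a, b in zip(label_a, label_b)]
    return mixed_spec, mixed_label, lam


# Usage: blend a "silent" patch with a "loud" patch from two different scenes.
spec_a = [[0.0, 0.0], [0.0, 0.0]]
spec_b = [[1.0, 1.0], [1.0, 1.0]]
mixed, soft_label, lam = mixup_pair(spec_a, [1.0, 0.0], spec_b, [0.0, 1.0])
```

Because the blended example lies between two real examples, its soft label tells the classifier how much of each scene is present, which is one plausible reading of how such data could help with marginal cases.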
Pages: 3634-3638
Page count: 5
Related papers
50 in total
  • [41] CAA-Net: Conditional Atrous CNNs With Attention for Explainable Device-Robust Acoustic Scene Classification
    Ren, Zhao
    Kong, Qiuqiang
    Han, Jing
    Plumbley, Mark D.
    Schuller, Bjoern W.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 4131 - 4142
  • [42] Device Robust Acoustic Scene Classification Using Adaptive Noise Reduction and Convolutional Recurrent Attention Neural Network
    Venkatesh, Spoorthy
    Koolagudi, Shashidhar G.
    SPEECH AND COMPUTER, SPECOM 2022, 2022, 13721 : 688 - 699
  • [43] Shallow Convolutional Neural Networks for Acoustic Scene Classification
    LU Lu
    YANG Yuhong
    JIANG Yuzhi
    AI Haojun
    TU Weiping
    Wuhan University Journal of Natural Sciences, 2018, 23 (02) : 178 - 184
  • [44] Deep Neural Decision Forest for Acoustic Scene Classification
    Sun, Jianyuan
    Liu, Xubo
    Mei, Xinhao
    Zhao, Jinzheng
    Plumbley, Mark D.
    Kilic, Volkan
    Wang, Wenwu
    2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 772 - 776
  • [45] PROTOTYPICAL NETWORKS FOR DOMAIN ADAPTATION IN ACOUSTIC SCENE CLASSIFICATION
    Singh, Shubhr
    Bear, Helen L.
    Benetos, Emmanouil
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 346 - 350
  • [46] Acoustic Scene Classification using Deep Fisher network
    Venkatesh, Spoorthy
    Mulimani, Manjunath
    Koolagudi, Shashidhar G.
    DIGITAL SIGNAL PROCESSING, 2023, 139
  • [47] A DATABASE AND CHALLENGE FOR ACOUSTIC SCENE CLASSIFICATION AND EVENT DETECTION
    Giannoulis, Dimitrios
    Stowell, Dan
    Benetos, Emmanouil
    Rossignol, Mathias
    Lagrange, Mathieu
    Plumbley, Mark D.
    2013 PROCEEDINGS OF THE 21ST EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2013,
  • [48] Deep mutual attention network for acoustic scene classification
    Xie, Wei
    He, Qianhua
    Yu, Zitong
    Li, Yanxiong
    Digital Signal Processing: A Review Journal, 2022, 123
  • [49] Acoustic Scene Classification Using Reduced MobileNet Architecture
    Xu, Jun-Xiang
    Lin, Tzu-Ching
    Yu, Tsai-Ching
    Tai, Tzu-Chiang
    Chang, Pao-Chi
    2018 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2018), 2018, : 267 - 270
  • [50] An Investigation of Transfer Learning Mechanism for Acoustic Scene Classification
    Zhou, Hengshun
    Bai, Xue
    Du, Jun
    2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 404 - 408