Acoustic Scene Classification using Deep Fisher network

Cited by: 1
Authors
Venkatesh, Spoorthy [1]
Mulimani, Manjunath [2]
Koolagudi, Shashidhar G. [1]
Affiliations
[1] Natl Inst Technol Karnataka, Dept Comp Sci & Engn, Surathkal, India
[2] Tampere Univ, Fac Informat Technol & Commun Sci, Tampere, Finland
Keywords
Acoustic Scene Classification (ASC); Fisher network; Fisher vector encoding; Fisher layer; Principal Component Analysis (PCA)
DOI
10.1016/j.dsp.2023.104062
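Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract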
Acoustic Scene Classification (ASC) is the task of assigning a semantic label to an audio recording based on its surrounding environment. In this work, a Fisher network is introduced for ASC. The proposed method mimics the working mechanism of a feed-forward Convolutional Neural Network (CNN), in which the output of one layer is fed as input to the succeeding layer. The Fisher network consists of a feature extraction step followed by a Fisher layer. The Fisher layer has three sub-layers: a Fisher Vector (FV) encoder, a temporal pyramid, and a normalization sub-layer combined with a feature reduction layer. Gammatone Time Cepstral Coefficients (GTCCs) and Mel-spectrograms are the features encoded as Fisher vector representations in the FV encoder sub-layer. Temporal information of the Fisher vectors is retained by the temporal pyramid sub-layer. Once the temporal pyramids have been extracted from the Fisher vectors, they are available as a feature vector. Irrelevant dimensions of the temporal pyramids are then reduced using Principal Component Analysis (PCA) in the normalization and PCA sub-layer. The proposed model is evaluated on five DCASE datasets: TUT Urban Acoustic Scenes 2018, TUT Urban Acoustic Scenes 2018 Mobile, DCASE 2019 Acoustic Scene Classification Task 1(a) and Task 1(b), and TAU Urban Acoustic Scenes 2020. The overall classification accuracy is 93%, 91%, 92%, 91% and 89% for the TUT 2018, TUT Mobile 2018, DCASE Task 1(a) 2019, DCASE Task 1(b) 2019, and TAU Urban Acoustic Scenes 2020 datasets, respectively. The proposed model performs considerably better than state-of-the-art ASC systems. (c) 2023 Elsevier Inc. All rights reserved.
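The order of operations described in the abstract (frame-level features, Fisher vector encoding, temporal pyramid, normalization, PCA) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: the function names, the random stand-in features in place of GTCCs or Mel-spectrograms, the fixed diagonal GMM parameters (a real system would fit them to training data), the pyramid levels (1, 2, 4), the choice to encode each temporal segment separately, and the power plus L2 normalization, which is the common "improved Fisher vector" recipe rather than something the abstract specifies.

```python
import numpy as np

def fisher_vector(frames, weights, means, sigmas):
    """Encode (T, D) frames as a Fisher vector under a diagonal GMM (assumed given)."""
    T = frames.shape[0]
    diff = (frames[:, None, :] - means[None]) / sigmas[None]          # (T, K, D)
    # Per-component log-likelihood, up to a constant shared by all components.
    log_p = np.log(weights)[None] - 0.5 * (diff ** 2).sum(-1) - np.log(sigmas).sum(-1)[None]
    log_p -= log_p.max(axis=1, keepdims=True)                         # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)                         # (T, K) posteriors
    # Gradients with respect to the component means and standard deviations.
    g_mu = (gamma[:, :, None] * diff).sum(0) / (T * np.sqrt(weights)[:, None])
    g_sigma = (gamma[:, :, None] * (diff ** 2 - 1)).sum(0) / (T * np.sqrt(2 * weights)[:, None])
    return np.concatenate([g_mu.ravel(), g_sigma.ravel()])            # length 2 * K * D

def temporal_pyramid(frames, gmm, levels=(1, 2, 4)):
    """Concatenate Fisher vectors of progressively finer temporal segments."""
    parts = []
    for n_segments in levels:
        for segment in np.array_split(frames, n_segments, axis=0):
            parts.append(fisher_vector(segment, *gmm))
    return np.concatenate(parts)

def normalize(v):
    """Signed square-root (power) normalization followed by L2 normalization."""
    v = np.sign(v) * np.sqrt(np.abs(v))
    return v / (np.linalg.norm(v) + 1e-12)

def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components via SVD."""
    Xc = X - X.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

# Hypothetical toy data: 6 recordings, 200 frames each, 13-dimensional features.
rng = np.random.default_rng(0)
K, D = 4, 13
gmm = (np.full(K, 1.0 / K), rng.normal(size=(K, D)), np.ones((K, D)))
recordings = [rng.normal(size=(200, D)) for _ in range(6)]

encoded = np.stack([normalize(temporal_pyramid(r, gmm)) for r in recordings])
reduced = pca_reduce(encoded, n_components=3)
print(encoded.shape, reduced.shape)  # (6, 728) (6, 3)
```

With three pyramid levels there are 1 + 2 + 4 = 7 segments per recording, each contributing 2 * K * D = 104 values, which gives the 728-dimensional vector before PCA. The reduced vectors would then be passed to a classifier, a step the abstract does not detail and this sketch leaves out.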
Pages: 13