A Deep Residual Network for Large-Scale Acoustic Scene Analysis

Cited by: 32
Authors
Ford, Logan [1]
Tang, Hao [1]
Grondin, Francois [1]
Glass, James [1]
Affiliations
[1] MIT, Comp Sci & Artificial Intelligence Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
Keywords
acoustic scene analysis; audio classification; audio event detection; neural networks
DOI
10.21437/Interspeech.2019-2731
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Many of the recent advances in audio event detection, particularly on the AudioSet data set, have focused on improving performance using the released embeddings produced by a pretrained model. In this work, we instead study the task of training a multi-label event classifier directly from the audio recordings of AudioSet. Using the audio recordings, not only are we able to reproduce results from prior work, we have also confirmed improvements of other proposed additions, such as an attention module. Moreover, by training the embedding network jointly with the additions, we achieve an mAP of 0.392 and an AUC of 0.971, surpassing the state of the art without transfer learning from a large data set. We also analyze the output activations of the network and find that the models are able to localize audio events when a finer time resolution is needed.
Pages: 2568-2572
Page count: 5
Related papers
50 in total
  • [1] Novel CNN Architecture with Residual Learning and Deep Supervision for Large-Scale Scene Image Categorization
    Al-Barazanchi, Hussein A.
    Qassim, Hussam
    Verma, Abhishek
    2016 IEEE 7TH ANNUAL UBIQUITOUS COMPUTING, ELECTRONICS MOBILE COMMUNICATION CONFERENCE (UEMCON), 2016,
  • [2] LARGE-SCALE AUDIO FEATURE EXTRACTION AND SVM FOR ACOUSTIC SCENE CLASSIFICATION
    Geiger, Juergen T.
    Schuller, Bjoern
    Rigoll, Gerhard
    2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2013,
  • [3] An Investigation on Multiscale Normalised Deep Scattering Spectrum with Deep Residual Network for Acoustic Scene Classification
    Kek, Xing Yong
    Chin, Cheng Siong
    Li, Ye
    22ND IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD 2021-FALL), 2021, : 29 - 36
  • [4] Sparse Representation based Deep Residual Geometry Compression Network for Large-scale Point Clouds
    Yu, Pengpeng
    Zuo, Dian
    Huang, Yueer
    Huang, Ruishan
    Wang, Hanyun
    Guo, Yulan
    Liang, Fan
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2555 - 2560
  • [5] RETRACTED ARTICLE: A new deep representation for large-scale scene classification
    Bo Dai
    Feng Mei
    Deliang Ji
    Caiyou Zhang
    Jia Shi
    Multimedia Tools and Applications, 2020, 79 : 9689 - 9689
  • [6] Analysis of Deep Neural Network Models for Acoustic Scene Classification
    Basbug, Ahmet Melih
    Sert, Mustafa
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [7] Large-Scale Nodes Classification With Deep Aggregation Network
    Li, Jiangtao
    Wu, Jianshe
    He, Weiquan
    Zhou, Peng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2021, 33 (06) : 2560 - 2572
  • [8] HammingMesh: A Network Topology for Large-Scale Deep Learning
    Hoefler, Torsten
    Bonato, Tommaso
    De Sensi, Daniele
    Di Girolamo, Salvatore
    Li, Shigang
    Heddes, Marco
    Belk, Jon
    Goel, Deepak
    Castro, Miguel
    Scott, Steve
    SC22: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2022,
  • [9] HammingMesh: A Network Topology for Large-Scale Deep Learning
    Hoefler, Torsten
    Bonato, Tommaso
    De Sensi, Daniele
    Di Girolamo, Salvatore
    Li, Shigang
    Heddes, Marco
    Goel, Deepak
    Castro, Miguel
    Scott, Steve
    Communications of the ACM, 2024, 67 (12) : 97 - 105
  • [10] Large-Scale Stochastic Scene Generation and Semantic Annotation for Deep Convolutional Neural Network Training in the RoboCup SPL
    Hess, Timm
    Mundt, Martin
    Weis, Tobias
    Ramesh, Visvanathan
    ROBOCUP 2017: ROBOT WORLD CUP XXI, 2018, 11175 : 33 - 44