SCALABLE NEURAL ARCHITECTURES FOR END-TO-END ENVIRONMENTAL SOUND CLASSIFICATION

Cited by: 4
Authors
Paissan, Francesco [1]
Ancilotto, Alberto [1]
Brutti, Alessio [1]
Farella, Elisabetta [1]
Affiliations
[1] Fdn Bruno Kessler, Digital Soc DiGis Ctr, Povo, Italy
Funding
European Union Horizon 2020;
Keywords
sound event detection; tinyML; scalable backbone; IoT; NETWORKS;
DOI
10.1109/ICASSP43922.2022.9746093
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Sound Event Detection (SED) is a complex task that simulates the human ability to recognize what is happening in the surroundings from auditory signals alone. This technology is a crucial asset in many applications, such as smart cities, where urban sounds can be detected and processed by embedded devices in an Internet of Things (IoT) network to identify meaningful events for municipalities or law enforcement. However, while current deep learning techniques for SED are effective, they are also resource- and power-hungry, and thus not appropriate for pervasive battery-powered devices. In this paper, we propose novel neural architectures based on PhiNets for real-time acoustic event detection on microcontroller units. The proposed models are easily scalable to fit the hardware requirements and can operate on both spectrograms and waveforms. In particular, our architectures achieve state-of-the-art performance on UrbanSound8K in spectrogram classification (around 77%) with extreme compression factors (99.8%) with respect to current state-of-the-art architectures.
Pages: 641 - 645
Number of pages: 5
Related papers
50 in total
  • [1] End-to-end environmental sound classification using a 1D convolutional neural network
    Abdoli, Sajjad
    Cardinal, Patrick
    Koerich, Alessandro Lameiras
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2019, 136 : 252 - 263
  • [2] Scalable end-to-end recurrent neural network for variable star classification
    Becker, I
    Pichara, K.
    Catelan, M.
    Protopapas, P.
    Aguirre, C.
    Nikzat, F.
    [J]. MONTHLY NOTICES OF THE ROYAL ASTRONOMICAL SOCIETY, 2020, 493 (02) : 2981 - 2995
  • [3] Interpretable End-to-End heart sound classification
    Li, Shuaizhong
    Sun, Jing
    Yang, Hongbo
    Pan, Jiahua
    Guo, Tao
    Wang, Weilian
    [J]. MEASUREMENT, 2024, 237
  • [4] Lightweight End-to-End Neural Network Model for Automatic Heart Sound Classification
    Li, Tao
    Yin, Yibo
    Ma, Kainan
    Zhang, Sitao
    Liu, Ming
    [J]. INFORMATION, 2021, 12 (02) : 1 - 11
  • [5] End-to-end Neural Information Status Classification
    Hou, Yufang
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2021, 2021, : 1377 - 1388
  • [6] End-to-End Neural Text Classification for Tibetan
    Qun, Nuo
    Li, Xing
    Qiu, Xipeng
    Huang, Xuanjing
    [J]. CHINESE COMPUTATIONAL LINGUISTICS AND NATURAL LANGUAGE PROCESSING BASED ON NATURALLY ANNOTATED BIG DATA, CCL 2017, 2017, 10565 : 472 - 480
  • [7] MSARN: A Multi-scale Attention Residual Network for End-to-End Environmental Sound Classification
    Hu, Fucai
    Song, Peng
    He, Ruhan
    Yan, Zhaoli
    Yu, Yongsheng
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (08) : 11449 - 11465
  • [8] MSARN: A Multi-scale Attention Residual Network for End-to-End Environmental Sound Classification
    Fucai Hu
    Peng Song
    Ruhan He
    Zhaoli Yan
    Yongsheng Yu
    [J]. Neural Processing Letters, 2023, 55 : 11449 - 11465
  • [9] An End-to-end System Based on TDNN for Lung Sound Classification
    Liu, Lingling
    Li, Lin
    Li, Song
    Wu, Jinzhun
    Guo, Donghui
    [J]. 2020 IEEE 14TH INTERNATIONAL CONFERENCE ON ANTI-COUNTERFEITING, SECURITY, AND IDENTIFICATION (ASID), 2020, : 20 - 24
  • [10] Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition
    Moritz, Niko
    Hori, Takaaki
    Le Roux, Jonathan
    [J]. INTERSPEECH 2019, 2019, : 76 - 80