Sound event localization and detection using element-wise attention gate and asymmetric convolutional recurrent neural networks

Authors
Yan, Lean [1]
Guo, Min [1]
Li, Zhiqiang [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Comp Sci, Minist Educ, Key Lab Modern Teaching Technol, Xian 710119, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sound event localization and detection; asymmetric convolution; context gating; squeeze excitation; element-wise attention gate
DOI
10.3233/AIC-220125
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Two problems limit sound event localization and detection (SELD): the standard square convolution kernel has insufficient representation ability, and recurrent neural networks usually ignore the importance of different elements within an input vector. This paper proposes an element-wise attention gate asymmetric convolutional recurrent neural network (EleAttG-ACRNN) to improve SELD performance. First, a convolutional neural network with context gating and asymmetric squeeze excitation residual blocks is constructed: asymmetric convolution enhances the capability of the square convolution kernel, squeeze excitation improves the modeling of interdependence between channels, and context gating weights important features while suppressing irrelevant ones. Next, to improve the expressiveness of the model, the element-wise attention gate is integrated into the bidirectional gated recurrent network to highlight the importance of different elements within an input vector and to further learn temporal context information. Evaluation on the TAU Spatial Sound Events 2019-Ambisonic dataset shows the effectiveness of the proposed method: compared with a CRNN method, it improves SELD performance by up to 0.05 in error rate, 1.7% in F-score, 0.7 degrees in DOA error, and 4.5% in frame recall.
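The two gating ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly published formulations, where context gating computes y = sigmoid(W x + b) * x and the element-wise attention gate computes a_t = sigmoid(W_xa x_t + W_ha h_{t-1} + b_a) and then feeds a_t * x_t to the recurrent unit. All names, dimensions, and weight values below (`context_gate`, `ele_att_gate`, `W_cg`, `W_xa`, `W_ha`, `D`, `H`) are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def context_gate(x, W, b):
    """Context gating (assumed form): y = sigmoid(W x + b) * x, element-wise."""
    gate = [sigmoid(s + bi) for s, bi in zip(matvec(W, x), b)]
    return [g * xi for g, xi in zip(gate, x)]


def ele_att_gate(x_t, h_prev, W_xa, W_ha, b_a):
    """Element-wise attention gate (assumed form).

    a_t = sigmoid(W_xa x_t + W_ha h_prev + b_a); returns (a_t * x_t, a_t).
    The modulated input would replace x_t inside a GRU cell.
    """
    from_input = matvec(W_xa, x_t)
    from_state = matvec(W_ha, h_prev)
    a_t = [sigmoid(u + v + bi) for u, v, bi in zip(from_input, from_state, b_a)]
    return [a * xi for a, xi in zip(a_t, x_t)], a_t


def random_matrix(rows, cols, rng):
    """Small random weights standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
D, H = 4, 3                      # hypothetical input and hidden sizes
x_t = [0.2, -1.0, 0.5, 3.0]      # one input frame
h_prev = [0.0, 0.1, -0.2]        # previous hidden state
zeros = [0.0] * D

W_cg = random_matrix(D, D, rng)
W_xa = random_matrix(D, D, rng)
W_ha = random_matrix(D, H, rng)

gated = context_gate(x_t, W_cg, zeros)
x_mod, a_t = ele_att_gate(x_t, h_prev, W_xa, W_ha, zeros)
```

Because every gate value lies strictly between 0 and 1, each element of the input is scaled down by its own factor rather than by one scalar shared across the vector, which is what lets the recurrent layer weight individual input elements differently.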
Pages: 147-157
Page count: 11