Sound event localization and detection using element-wise attention gate and asymmetric convolutional recurrent neural networks

Authors
Yan, Lean [1]
Guo, Min [1]
Li, Zhiqiang [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Comp Sci, Minist Educ, Key Lab Modern Teaching Technol, Xian 710119, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sound event localization and detection; asymmetric convolution; context gating; squeeze excitation; element-wise attention gate
DOI
10.3233/AIC-220125
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In sound event localization and detection (SELD), the standard square convolution kernel has insufficient representation ability, and recurrent neural networks usually ignore the differing importance of the elements within an input vector. This paper proposes an element-wise attention gate asymmetric convolutional recurrent neural network (EleAttG-ACRNN) to improve SELD performance. First, a convolutional neural network with context gating and asymmetric squeeze-excitation residual blocks is constructed: asymmetric convolution strengthens the square convolution kernel, squeeze excitation models the interdependence between channels, and context gating weights the important features while suppressing irrelevant ones. Next, to improve the expressiveness of the model, the element-wise attention gate is integrated into a bidirectional gated recurrent network, which highlights the importance of different elements within an input vector and further learns temporal context information. Evaluation on the TAU Spatial Sound Events 2019-Ambisonic dataset shows the effectiveness of the proposed method: compared with a CRNN method, it improves SELD performance by up to 0.05 in error rate, 1.7% in F-score, 0.7 degrees in DOA error, and 4.5% in frame recall.
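The two gating ideas named in the abstract can be illustrated with a minimal pure-Python sketch. The abstract does not give the equations, so the forms used here are assumptions taken from the commonly cited formulations: context gating as y = sigmoid(W x + b) * x, and the element-wise attention gate as a_t = sigmoid(W_xa x_t + W_ha h_{t-1} + b_a) followed by x_t * a_t. All function and parameter names are hypothetical, and this is not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def context_gate(x, W, b):
    """Assumed context gating: y = sigmoid(W x + b) * x (element-wise).

    Each feature is rescaled by a gate in (0, 1), so important features
    pass through and irrelevant ones are attenuated.
    """
    gates = [sigmoid(z + bi) for z, bi in zip(matvec(W, x), b)]
    return [g * xi for g, xi in zip(gates, x)]


def element_wise_attention_gate(x_t, h_prev, W_xa, W_ha, b_a):
    """Assumed EleAttG: a_t = sigmoid(W_xa x_t + W_ha h_prev + b_a).

    Returns the modulated input x_t * a_t (fed to the recurrent unit in
    place of x_t) together with the attention vector a_t itself.
    """
    pre = [
        zx + zh + bi
        for zx, zh, bi in zip(matvec(W_xa, x_t), matvec(W_ha, h_prev), b_a)
    ]
    a_t = [sigmoid(p) for p in pre]
    return [a * xi for a, xi in zip(a_t, x_t)], a_t


if __name__ == "__main__":
    x = [2.0, -4.0]
    zeros_2x2 = [[0.0, 0.0], [0.0, 0.0]]
    zeros_2x3 = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    # With zero weights and biases every gate equals sigmoid(0) = 0.5.
    print(context_gate(x, zeros_2x2, [0.0, 0.0]))
    gated, attention = element_wise_attention_gate(
        x, [0.0, 0.0, 0.0], zeros_2x2, zeros_2x3, [0.0, 0.0]
    )
    print(gated, attention)
```

In both cases the gate has the same dimension as the vector it scales, which is what lets it weight individual elements rather than the vector as a whole. In a full model the weights would be learned jointly with the rest of the network.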
Pages: 147-157
Number of pages: 11