Sound event localization and detection using element-wise attention gate and asymmetric convolutional recurrent neural networks

Authors
Yan, Lean [1]
Guo, Min [1]
Li, Zhiqiang [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Comp Sci, Minist Educ, Key Lab Modern Teaching Technol, Xian 710119, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sound event localization and detection; asymmetric convolution; context gating; squeeze excitation; element-wise attention gate;
DOI
10.3233/AIC-220125
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In sound event localization and detection (SELD), the standard square convolution kernel has limited representation ability, and recurrent neural networks usually ignore the differing importance of the elements within an input vector. This paper proposes an element-wise attention gate-asymmetric convolutional recurrent neural network (EleAttG-ACRNN) to improve SELD performance. First, a convolutional neural network with context gating and an asymmetric squeeze-excitation residual structure is constructed: asymmetric convolution strengthens the square convolution kernel, squeeze excitation models the interdependence between channels, and context gating weights important features and suppresses irrelevant ones. Next, to improve the expressiveness of the model, the element-wise attention gate is integrated into a bidirectional gated recurrent network, which highlights the importance of different elements within an input vector and further learns temporal context information. Evaluation on the TAU Spatial Sound Events 2019-Ambisonic dataset shows the effectiveness of the proposed method: compared with a CRNN method, it improves SELD performance by up to 0.05 in error rate, 1.7% in F-score, 0.7 degrees in DOA error, and 4.5% in frame recall.
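The two gating ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly used formulations, context gating as y = sigmoid(W x + b) * x and the element-wise attention gate as a_t = sigmoid(W_x x_t + W_h h_{t-1} + b) followed by a_t * x_t, and every function and parameter name below is hypothetical. Plain Python lists stand in for tensors so the block runs without external libraries.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Return W @ v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def context_gate(x, W, b):
    """Assumed context gating: y = sigmoid(W x + b) * x, element-wise.

    W must be square (len(x) by len(x)) so the gate matches x in size.
    """
    gate = [sigmoid(z) for z in affine(W, x, b)]
    return [g * xi for g, xi in zip(gate, x)]


def element_wise_attention_gate(x_t, h_prev, W_x, W_h, b):
    """Assumed EleAttG: a_t = sigmoid(W_x x_t + W_h h_prev + b).

    Returns the modulated input a_t * x_t and the attention vector a_t.
    In a recurrent layer, the modulated input would replace x_t in the
    usual GRU update; that update is omitted here.
    """
    from_input = affine(W_x, x_t, b)
    from_state = affine(W_h, h_prev, [0.0] * len(b))
    a_t = [sigmoid(u + v) for u, v in zip(from_input, from_state)]
    gated = [a * xi for a, xi in zip(a_t, x_t)]
    return gated, a_t
```

With all weights and biases set to zero, both gates reduce to a constant 0.5 per element, so each input is simply halved; learned weights let the gate emphasize some elements and suppress others, which is the behavior the abstract attributes to these components.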
Pages: 147-157
Number of pages: 11