Sound Event Localization and Detection Using Parallel Multi-attention Enhancement

Cited by: 1
Authors
Chen, Zhengyu [1]
Huang, Qinghua [1]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sound event localization and detection; Parallel multi-attention; Global information; Feature fusion; Deep neural networks; Recognition
DOI
10.1007/s00034-023-02489-x
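Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809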
Abstract
Sound event localization and detection (SELD) is the joint task of sound event detection and direction-of-arrival estimation. It is an emerging audio signal processing task that is widely applied in many areas. A popular convolutional recurrent neural network (CRNN)-based method uses a convolutional neural network (CNN) to extract high-level spatial features from manually designed features and a recurrent neural network to model sequence context information. Some studies have shown that a standard CNN is not robust in challenging acoustic environments such as those with overlapping, moving, and discontinuous sources. To improve SELD performance in more complex acoustic scenes, this paper proposes parallel multi-attention enhancement (PMAE), a convolution enhancement method that boosts the representation ability of the CNN. PMAE consists of an attention feature enhancement (AFE) block and a parallel multi-attention (PMA) block. PMA, embedded into AFE, extracts enhanced global-local features using efficient attention modules along different dimensions. AFE, as a feature fusion strategy, fuses multi-scale enhanced features to improve feature representation. AFE performs well on overlapping sources. PMA adequately extracts the characteristic information of different sound events and, when combined with AFE, performs better on moving and discontinuous sources. With this framework, the SELD system remains robust when the target sources are moving and overlap with unknown interference classes. Simulations show that the proposed PMAE substantially improves SELD performance without other techniques such as data augmentation and post-processing.
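The abstract names two ideas, attention applied in parallel along different dimensions and fusion of multi-scale enhanced features, but it does not specify the attention modules or the fusion rule. The sketch below is therefore only an illustration under stated assumptions, not the authors' PMAE: it assumes parameter-free sigmoid gating from global average pooling along each axis, averaging of the parallel branches, and additive fusion after nearest-neighbor upsampling. All function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def axis_attention(features, axis):
    """Gate `features` along one axis using weights from global average pooling.

    Assumption: a parameter-free gate stands in for whatever learned
    attention module the paper actually uses.
    """
    other_axes = tuple(i for i in range(features.ndim) if i != axis)
    # Global context: one descriptor per position along `axis`.
    descriptor = features.mean(axis=other_axes, keepdims=True)
    weights = sigmoid(descriptor)  # values in (0, 1), broadcast over other axes
    return features * weights


def parallel_multi_attention(features):
    """Apply one gate per axis in parallel and average the branches (assumed)."""
    branches = [axis_attention(features, axis) for axis in range(features.ndim)]
    return sum(branches) / len(branches)


def fuse_scales(fine, coarse):
    """Fuse two scales: upsample the coarse map by repetition, then add (assumed)."""
    factor_t = fine.shape[1] // coarse.shape[1]
    factor_f = fine.shape[2] // coarse.shape[2]
    upsampled = coarse.repeat(factor_t, axis=1).repeat(factor_f, axis=2)
    return fine + upsampled


rng = np.random.default_rng(0)
fine = rng.standard_normal((4, 8, 16))   # (channels, time frames, frequency bins)
coarse = rng.standard_normal((4, 4, 8))  # same channels at half resolution

enhanced = fuse_scales(parallel_multi_attention(fine), parallel_multi_attention(coarse))
print(enhanced.shape)
```

Running the branches in parallel rather than in sequence means each gate sees the unmodified input, so no single dimension's weighting suppresses information before the others are computed. Whether the paper combines its branches this way is not stated in the abstract.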
Pages: 545-567
Number of pages: 23