Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection

Cited by: 6
Authors
Shen, Yu-Han [1]
He, Ke-Xin [1]
Zhang, Wei-Qiang [1]
Affiliations
[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
sound event detection; convolutional neural network; recurrent neural network; attention model; temporal-frequential attention;
DOI
10.21437/Interspeech.2019-2045
Chinese Library Classification (CLC) number
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104 ; 100213 ;
Abstract
In this paper, we propose a temporal-frequential attention model for sound event detection (SED). Our network learns how to listen with two attention models: a temporal attention model and a frequential attention model. The proposed system learns when to listen using the temporal attention model, while it learns where to listen on the frequency axis using the frequential attention model. With these two models, we attempt to make our system pay more attention to important frames or segments and important frequency components for sound event detection. Our proposed method is demonstrated on Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 Challenge and outperforms state-of-the-art methods.
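The idea of weighting frequency bins within each frame and then weighting frames over time can be illustrated with a minimal NumPy sketch. This is not the authors' network: the abstract does not specify the architecture, so the projection parameters `w_freq` and `w_time`, the input shape, and the softmax-based weighting are all assumptions chosen only to show the general mechanism.

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def frequential_attention(spec, w_freq):
    """Weight frequency bins within each frame ("where to listen").

    spec:   (T, F) spectrogram-like features, T frames by F frequency bins.
    w_freq: (F, F) hypothetical projection, assumed here for illustration.
    """
    scores = spec @ w_freq                # (T, F) score per bin per frame
    weights = softmax(scores, axis=1)     # each frame's weights sum to 1 over bins
    return spec * weights, weights

def temporal_attention(frames, w_time):
    """Weight frames over time ("when to listen") and pool them.

    frames: (T, D) per-frame features.
    w_time: (D,) hypothetical scoring vector, assumed here for illustration.
    """
    scores = frames @ w_time              # (T,) one score per frame
    weights = softmax(scores, axis=0)     # weights sum to 1 over frames
    pooled = weights @ frames             # (D,) attention-weighted average
    return pooled, weights

# Random placeholder data and parameters; a real system would learn these.
rng = np.random.default_rng(0)
T, F = 100, 40
spec = rng.standard_normal((T, F))
w_freq = rng.standard_normal((F, F)) * 0.1
w_time = rng.standard_normal(F) * 0.1

weighted, freq_w = frequential_attention(spec, w_freq)
pooled, time_w = temporal_attention(weighted, w_time)
```

In this sketch `freq_w` has one weight per frequency bin in every frame and `time_w` has one weight per frame, so inspecting them shows which bins and frames dominate the pooled representation.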
Pages: 2563-2567
Page count: 5
Related papers
50 in total
  • [1] Polyphonic Sound Event Detection Using Temporal-Frequency Attention and Feature Space Attention
    Jin, Ye
    Wang, Mei
    Luo, Liyan
    Zhao, Dinghao
    Liu, Zhanqi
    SENSORS, 2022, 22 (18)
  • [2] Event Specific Attention for Polyphonic Sound Event Detection
    Sundar, Harshavardhan
    Sun, Ming
    Wang, Chao
    INTERSPEECH 2021, 2021, : 566 - 570
  • [3] Medical education of attention: A qualitative study of learning to listen to sound
    Harris, Anna
    Flynn, Eleanor
    MEDICAL TEACHER, 2017, 39 (01) : 79 - 84
  • [4] Decoupling Temporal Convolutional Networks Model in Sound Event Detection and Localization
    Song, Shen
    Zhang, Cong
    You, Xinyuan
    JOURNAL OF INTERNET TECHNOLOGY, 2023, 24 (01) : 89 - 99
  • [5] A Progressive Learning Approach for Sound Event Detection with Temporal and Spectral Features Fusion
    Zhong, Yilin
    Fang, Zhaoer
    Wang, Jie
    Fan, Bo
    Peng, BangHuang
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT V, ICIC 2024, 2024, 14866 : 207 - 218
  • [6] Active Learning for Sound Event Detection
    Zhao, Shuyang
    Heittola, Toni
    Virtanen, Tuomas
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2895 - 2905
  • [7] Sound Event Localization and Detection Based on Dual Attention
    Xu, Chundong
    Liu, Hao
    Min, Yuan
    Zhen, Yadi
    Computer Engineering and Applications, 2023, 59 (19) : 99 - 105
  • [8] INCREMENTAL LEARNING ALGORITHM FOR SOUND EVENT DETECTION
    Koh, Eunjeong
    Saki, Fatemeh
    Guo, Yinyi
    Hung, Cheng-Yu
    Visser, Erik
    2020 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2020,
  • [9] Joint Spatio-Temporal-Frequency Representation Learning for Improved Sound Event Localization and Detection
    Chen, Baoqing
    Wang, Mei
    Gu, Yu
    SENSORS, 2024, 24 (18)
  • [10] CONNECTIONIST TEMPORAL LOCALIZATION FOR SOUND EVENT DETECTION WITH SEQUENTIAL LABELING
    Wang, Yun
    Metze, Florian
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 745 - 749