Learning How to Listen: A Temporal-Frequential Attention Model for Sound Event Detection

Cited by: 6

Authors:

Shen, Yu-Han [1]

He, Ke-Xin [1]

Zhang, Wei-Qiang [1]

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China

Source:

INTERSPEECH 2019 | 2019

Funding:

National Natural Science Foundation of China

Keywords:

sound event detection; convolutional neural network; recurrent neural network; attention model; temporal-frequential attention;

DOI:

10.21437/Interspeech.2019-2045

Chinese Library Classification number:

R36 [Pathology]; R76 [Otorhinolaryngology]

Discipline classification code:

100104 ; 100213 ;

Abstract:

In this paper, we propose a temporal-frequential attention model for sound event detection (SED). Our network learns how to listen with two attention models: a temporal attention model and a frequential attention model. The proposed system learns when to listen using the temporal attention model, and it learns where to listen on the frequency axis using the frequential attention model. With these two models, we attempt to make our system pay more attention to the frames or segments and the frequency components that are important for sound event detection. Our proposed method is demonstrated on Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2017 Challenge and outperforms state-of-the-art methods.
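
The abstract does not give the attention equations, so the following is only an illustrative sketch of the general idea, not the authors' network. It assumes softmax-normalized weights, a log-mel-like input of shape (frames, frequency bins), and randomly initialized parameters; the function names and the parameters `w_f` and `v_t` are hypothetical. Frequential attention reweights the frequency bins within each frame ("where to listen"), and temporal attention reweights whole frames ("when to listen").

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def frequential_attention(spec, w_f):
    """Weight frequency bins inside every frame.

    spec: (T, F) time-frequency features.
    w_f:  (F, F) hypothetical learnable projection (an assumption).
    """
    scores = spec @ w_f                 # (T, F) one score per bin per frame
    weights = softmax(scores, axis=1)   # each frame's weights sum to 1
    return spec * weights, weights

def temporal_attention(feats, v_t):
    """Weight whole frames along the time axis.

    feats: (T, D) frame-level features.
    v_t:   (D,) hypothetical learnable scoring vector (an assumption).
    """
    scores = feats @ v_t                # (T,) one score per frame
    weights = softmax(scores, axis=0)   # weights over all frames sum to 1
    return feats * weights[:, None], weights

# Toy input: 100 frames, 64 frequency bins of random data.
rng = np.random.default_rng(0)
T, F = 100, 64
spec = rng.standard_normal((T, F))
w_f = 0.1 * rng.standard_normal((F, F))
v_t = 0.1 * rng.standard_normal(F)

freq_out, freq_w = frequential_attention(spec, w_f)
time_out, time_w = temporal_attention(freq_out, v_t)
```

In a trained system the parameters would be learned jointly with the detector, and the attention would typically operate on intermediate convolutional or recurrent features rather than on the raw input as in this toy example.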

Pages: 2563-2567

Page count: 5

