Relational recurrent neural networks for polyphonic sound event detection

Cited by: 0
Authors
Junbo Ma
Ruili Wang
Wanting Ji
Hao Zheng
En Zhu
Jianping Yin
Affiliations
[1] Massey University, School of Computer
[2] National University of Defense Technology, College of Information Engineering
[3] Zhejiang Gongshang University, School of Computer
[4] Nanjing Xiaozhuang University
[5] National University of Defense Technology
[6] Dongguan University of Technology
Keywords
Internet of Things; smart environment; deep neural networks; recurrent neural networks; sound event detection;
DOI
Not available
Abstract
A smart environment is one of the application scenarios of the Internet of Things (IoT). To provide a ubiquitous smart environment for humans, a variety of technologies have been developed. In a smart environment system, sound event detection is one of the fundamental technologies: it automatically senses sound changes in the environment and detects the sound events that cause them. In this paper, we propose a Relational Recurrent Neural Network (RRNN) based method for polyphonic sound event detection, called RRNN-SED, which exploits the strength of RRNNs in long-term temporal context extraction and relational reasoning across a polyphonic sound signal. Different from previous sound event detection methods, which rely heavily on convolutional neural networks or recurrent neural networks, the proposed RRNN-SED method can handle long-lasting and overlapping sound events in polyphonic sound event detection. Specifically, since the pieces of historical information memorized inside an RRNN can interact with each other across a polyphonic sound signal, the proposed RRNN-SED method is effective and efficient in extracting temporal context information and in reasoning about the unique relational characteristics of the target sound events. Experimental results on two public datasets show that the proposed method achieves better sound event detection results in terms of segment-based F-score and segment-based error rate.
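The abstract describes memory contents that interact with each other through relational reasoning and frame-wise multi-label outputs that allow overlapping events. The sketch below illustrates one way such a relational recurrent block could be wired for polyphonic detection, in the spirit of the relational memory core of Santoro et al. (reference [26] below). It assumes PyTorch; all module names, slot/head counts, and the gating scheme are illustrative assumptions, not the authors' exact RRNN-SED configuration.

```python
# Minimal sketch (PyTorch) of a relational-memory-style recurrent block for
# polyphonic SED. Hyperparameters and structure are illustrative assumptions.
import torch
import torch.nn as nn


class RelationalMemoryCell(nn.Module):
    """Memory slots interact through multi-head self-attention at each time step."""

    def __init__(self, input_dim, mem_slots=4, mem_dim=64, num_heads=4):
        super().__init__()
        self.mem_slots, self.mem_dim = mem_slots, mem_dim
        self.input_proj = nn.Linear(input_dim, mem_dim)
        self.attn = nn.MultiheadAttention(mem_dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(mem_dim, mem_dim), nn.ReLU(),
                                 nn.Linear(mem_dim, mem_dim))
        self.norm1 = nn.LayerNorm(mem_dim)
        self.norm2 = nn.LayerNorm(mem_dim)
        # Gates control how much of the attended update enters the memory.
        self.gate = nn.Linear(2 * mem_dim, 2 * mem_dim)

    def init_memory(self, batch_size):
        return torch.zeros(batch_size, self.mem_slots, self.mem_dim)

    def forward(self, x_t, memory):
        # Append the projected input frame as an extra "slot" the memory can
        # attend to, so the stored history interacts with the new frame.
        inp = self.input_proj(x_t).unsqueeze(1)              # (B, 1, D)
        keys = torch.cat([memory, inp], dim=1)               # (B, S+1, D)
        attended, _ = self.attn(memory, keys, keys)          # (B, S, D)
        update = self.norm1(memory + attended)
        update = self.norm2(update + self.mlp(update))
        # Simple input/forget gating of the memory update.
        gates = self.gate(torch.cat([update, memory], dim=-1))
        in_gate, forget_gate = torch.sigmoid(gates).chunk(2, dim=-1)
        return forget_gate * memory + in_gate * torch.tanh(update)


class PolyphonicSED(nn.Module):
    """Frame-wise multi-label (sigmoid) outputs allow overlapping events."""

    def __init__(self, n_features=40, n_classes=6, **cell_kwargs):
        super().__init__()
        self.cell = RelationalMemoryCell(n_features, **cell_kwargs)
        self.head = nn.Linear(self.cell.mem_slots * self.cell.mem_dim, n_classes)

    def forward(self, feats):                                 # (B, T, n_features)
        B, T, _ = feats.shape
        memory, outputs = self.cell.init_memory(B), []
        for t in range(T):
            memory = self.cell(feats[:, t], memory)
            outputs.append(self.head(memory.flatten(1)))
        return torch.sigmoid(torch.stack(outputs, dim=1))     # (B, T, n_classes)


if __name__ == "__main__":
    model = PolyphonicSED()
    frames = torch.randn(2, 100, 40)   # e.g. 100 frames of 40 log-mel bands
    probs = model(frames)              # per-frame event probabilities
    print(probs.shape)                 # torch.Size([2, 100, 6])
```

Thresholding the per-frame probabilities (e.g. at 0.5) yields binary event activity per class, from which segment-based F-score and error rate can be computed as in the paper's evaluation; the threshold value here is an assumption, not taken from the paper.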
Pages: 29509 - 29527
Number of pages: 18
Related Papers
50 in total
  • [21] PEER COLLABORATIVE LEARNING FOR POLYPHONIC SOUND EVENT DETECTION
    Endo, Hayato
    Nishizaki, Hiromitsu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 826 - 830
  • [22] Heart Sound Segmentation-An Event Detection Approach Using Deep Recurrent Neural Networks
    Messner, Elmar
    Zoehrer, Matthias
    Pernkopf, Franz
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2018, 65 (09) : 1964 - 1974
  • [23] Sound Event Localization and Detection Using Convolutional Recurrent Neural Networks and Gated Linear Units
    Komatsu, Tatsuya
    Togami, Masahito
    Takahashi, Tsubasa
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 41 - 45
  • [24] Parallel Capsule Neural Networks for Sound Event Detection
    Liang, Kai-Wen
    Tseng, Yu-Hao
    Chang, Pao-Chi
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1933 - 1936
  • [25] Sound Event Detection with Perturbed Residual Recurrent Neural Network
    Yuan, Shuang
    Yang, Lidong
    Guo, Yong
    ELECTRONICS, 2023, 12 (18)
  • [26] Relational recurrent neural networks
    Santoro, Adam
    Faulkner, Ryan
    Raposo, David
    Rae, Jack
    Chrzanowski, Mike
    Weber, Theophane
    Wierstra, Daan
    Vinyals, Oriol
    Pascanu, Razvan
    Lillicrap, Timothy
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [27] Bidirectional recurrent neural networks for seismic event detection
    Birnie, Claire
    Hansteen, Fredrik
    GEOPHYSICS, 2022, 87 (03) : KS97 - KS111
  • [28] Prediction of Polyphonic Alarm Sound by Deep Neural Networks
    Kishimoto K.
    Takemura T.
    Sugiyama O.
    Kojima R.
    Yakami M.
    Nambu M.
    Fujii K.
    Kuroda T.
    Transactions of Japanese Society for Medical and Biological Engineering, 2022, 60 (01) : 8 - 15
  • [29] Technical Sound Event Classification Applying Recurrent and Convolutional Neural Networks
    Rieder, Constantin
    Germann, Markus
    Mezger, Samuel
    Scherer, Klaus
    PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON DEEP LEARNING THEORY AND APPLICATIONS (DELTA), 2020, : 84 - 88
  • [30] Duration-Controlled LSTM for Polyphonic Sound Event Detection
    Hayashi, Tomoki
    Watanabe, Shinji
    Toda, Tomoki
    Hori, Takaaki
    Le Roux, Jonathan
    Takeda, Kazuya
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2059 - 2070