Relational recurrent neural networks for polyphonic sound event detection

Cited by: 0
Authors
Junbo Ma
Ruili Wang
Wanting Ji
Hao Zheng
En Zhu
Jianping Yin
Affiliations
[1] Massey University, School of Computer
[2] National University of Defense Technology, College of Information Engineering
[3] Zhejiang Gongshang University, School of Computer
[4] Nanjing Xiaozhuang University
[5] National University of Defense Technology
[6] Dongguan University of Technology
Keywords
Internet of Things; smart environment; deep neural networks; recurrent neural networks; sound event detection;
DOI
Not available
Abstract
A smart environment is one of the application scenarios of the Internet of Things (IoT). To provide a ubiquitous smart environment for humans, a variety of technologies have been developed. In a smart environment system, sound event detection is one of the fundamental technologies: it automatically senses sound changes in the environment and detects the sound events that cause them. In this paper, we propose a Relational Recurrent Neural Network (RRNN) based method for polyphonic sound event detection, called RRNN-SED, which exploits the strength of RRNNs in long-term temporal context extraction and relational reasoning across a polyphonic sound signal. Different from previous sound event detection methods, which rely heavily on convolutional neural networks or recurrent neural networks, the proposed RRNN-SED method can handle long-lasting and overlapping sound events in polyphonic sound event detection. Specifically, since the pieces of historical information memorized inside an RRNN can interact with each other across a polyphonic sound signal, the proposed RRNN-SED method is effective and efficient in extracting temporal context information and in reasoning about the unique relational characteristics of the target sound events. Experimental results on two public datasets show that the proposed method achieves better sound event detection results in terms of segment-based F-score and segment-based error rate.
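The abstract describes memory contents that interact with each other through relational reasoning and frame-wise multi-label outputs that allow overlapping events. The sketch below illustrates one way such a relational recurrent block could be wired for polyphonic detection, in the spirit of the relational memory core of Santoro et al. (reference [26] below). It assumes PyTorch; all module names, slot/head counts, and the gating scheme are illustrative assumptions, not the authors' exact RRNN-SED configuration.

```python
# Minimal sketch (PyTorch) of a relational-memory-style recurrent block for
# polyphonic SED. Hyperparameters and structure are illustrative assumptions.
import torch
import torch.nn as nn


class RelationalMemoryCell(nn.Module):
    """Memory slots interact through multi-head self-attention at each time step."""

    def __init__(self, input_dim, mem_slots=4, mem_dim=64, num_heads=4):
        super().__init__()
        self.mem_slots, self.mem_dim = mem_slots, mem_dim
        self.input_proj = nn.Linear(input_dim, mem_dim)
        self.attn = nn.MultiheadAttention(mem_dim, num_heads, batch_first=True)
        self.mlp = nn.Sequential(nn.Linear(mem_dim, mem_dim), nn.ReLU(),
                                 nn.Linear(mem_dim, mem_dim))
        self.norm1 = nn.LayerNorm(mem_dim)
        self.norm2 = nn.LayerNorm(mem_dim)
        # Gates control how much of the attended update enters the memory.
        self.gate = nn.Linear(2 * mem_dim, 2 * mem_dim)

    def init_memory(self, batch_size):
        return torch.zeros(batch_size, self.mem_slots, self.mem_dim)

    def forward(self, x_t, memory):
        # Append the projected input frame as an extra "slot" the memory can
        # attend to, so the stored history interacts with the new frame.
        inp = self.input_proj(x_t).unsqueeze(1)              # (B, 1, D)
        keys = torch.cat([memory, inp], dim=1)               # (B, S+1, D)
        attended, _ = self.attn(memory, keys, keys)          # (B, S, D)
        update = self.norm1(memory + attended)
        update = self.norm2(update + self.mlp(update))
        # Simple input/forget gating of the memory update.
        gates = self.gate(torch.cat([update, memory], dim=-1))
        in_gate, forget_gate = torch.sigmoid(gates).chunk(2, dim=-1)
        return forget_gate * memory + in_gate * torch.tanh(update)


class PolyphonicSED(nn.Module):
    """Frame-wise multi-label (sigmoid) outputs allow overlapping events."""

    def __init__(self, n_features=40, n_classes=6, **cell_kwargs):
        super().__init__()
        self.cell = RelationalMemoryCell(n_features, **cell_kwargs)
        self.head = nn.Linear(self.cell.mem_slots * self.cell.mem_dim, n_classes)

    def forward(self, feats):                                 # (B, T, n_features)
        B, T, _ = feats.shape
        memory, outputs = self.cell.init_memory(B), []
        for t in range(T):
            memory = self.cell(feats[:, t], memory)
            outputs.append(self.head(memory.flatten(1)))
        return torch.sigmoid(torch.stack(outputs, dim=1))     # (B, T, n_classes)


if __name__ == "__main__":
    model = PolyphonicSED()
    frames = torch.randn(2, 100, 40)   # e.g. 100 frames of 40 log-mel bands
    probs = model(frames)              # per-frame event probabilities
    print(probs.shape)                 # torch.Size([2, 100, 6])
```

Thresholding the per-frame probabilities (e.g. at 0.5) yields binary event activity per class, from which segment-based F-score and error rate can be computed as in the paper's evaluation; the threshold value here is an assumption, not taken from the paper.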
Pages: 29509 - 29527
Number of pages: 18
Related Papers
50 in total
  • [21] PEER COLLABORATIVE LEARNING FOR POLYPHONIC SOUND EVENT DETECTION
    Endo, Hayato
    Nishizaki, Hiromitsu
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 826 - 830
  • [22] Heart Sound Segmentation-An Event Detection Approach Using Deep Recurrent Neural Networks
    Messner, Elmar
    Zoehrer, Matthias
    Pernkopf, Franz
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2018, 65 (09) : 1964 - 1974
  • [23] Sound Event Localization and Detection Using Convolutional Recurrent Neural Networks and Gated Linear Units
    Komatsu, Tatsuya
    Togami, Masahito
    Takahashi, Tsubasa
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 41 - 45
  • [24] Parallel Capsule Neural Networks for Sound Event Detection
    Liang, Kai-Wen
    Tseng, Yu-Hao
    Chang, Pao-Chi
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1933 - 1936
  • [25] Sound Event Detection with Perturbed Residual Recurrent Neural Network
    Yuan, Shuang
    Yang, Lidong
    Guo, Yong
    ELECTRONICS, 2023, 12 (18)
  • [26] Relational recurrent neural networks
    Santoro, Adam
    Faulkner, Ryan
    Raposo, David
    Rae, Jack
    Chrzanowski, Mike
    Weber, Theophane
    Wierstra, Daan
    Vinyals, Oriol
    Pascanu, Razvan
    Lillicrap, Timothy
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [27] Bidirectional recurrent neural networks for seismic event detection
    Birnie, Claire
    Hansteen, Fredrik
    GEOPHYSICS, 2022, 87 (03) : KS97 - KS111
  • [28] Prediction of Polyphonic Alarm Sound by Deep Neural Networks
    Kishimoto K.
    Takemura T.
    Sugiyama O.
    Kojima R.
    Yakami M.
    Nambu M.
    Fujii K.
    Kuroda T.
    Transactions of Japanese Society for Medical and Biological Engineering, 2022, 60 (01) : 8 - 15
  • [29] Technical Sound Event Classification Applying Recurrent and Convolutional Neural Networks
    Rieder, Constantin
    Germann, Markus
    Mezger, Samuel
    Scherer, Klaus
    PROCEEDINGS OF THE 1ST INTERNATIONAL CONFERENCE ON DEEP LEARNING THEORY AND APPLICATIONS (DELTA), 2020, : 84 - 88
  • [30] Duration-Controlled LSTM for Polyphonic Sound Event Detection
    Hayashi, Tomoki
    Watanabe, Shinji
    Toda, Tomoki
    Hori, Takaaki
    Le Roux, Jonathan
    Takeda, Kazuya
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (11) : 2059 - 2070