Polyphonic Sound Event Detection Using Modified Recurrent Temporal Pyramid Neural Network

Cited by: 0
Authors
Venkatesh, Spoorthy [1]
Koolagudi, Shashidhar G. [1]
Affiliations
[1] Natl Inst Technol Karnataka, Surathkal 575025, India
Keywords
Polyphonic Sound Event Detection (SED); Constant Q-Transform (CQT); Deep learning; Modified Recurrent Temporal Pyramid Network; CLASSIFICATION; SCENES;
DOI
10.1007/978-3-031-58181-6_47
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a novel approach to polyphonic Sound Event Detection (SED). It introduces a new deep learning architecture, the "Modified Recurrent Temporal Pyramid Neural Network (MR-TPNN)". The network takes as input spectrograms generated with the Constant Q-Transform (CQT). CQT spectrograms captured sound event information in the audio recordings better than the Short-Time Fourier Transform (STFT) and Fast Fourier Transform (FFT) methods. Temporal information is essential for detecting the onset and offset of events in an audio recording. The architecture captures it by fusing temporal pyramids with bidirectional Long Short-Term Memory (LSTM) recurrent layers. Extensive experiments on three benchmark datasets show that the proposed method outperforms existing polyphonic SED systems.
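For readers unfamiliar with the CQT, the following minimal sketch illustrates the general difference between CQT and STFT frequency grids: CQT bins are geometrically spaced with a constant ratio of center frequency to bandwidth (the quality factor Q), while STFT bins are linearly spaced. It is not taken from the paper; the function names and the parameter values (a 32.70 Hz lowest bin, 84 bins, 12 bins per octave, 44.1 kHz sampling rate, 2048-point FFT) are illustrative assumptions only.

```python
import numpy as np


def cqt_center_frequencies(f_min, n_bins, bins_per_octave):
    """Geometrically spaced CQT center frequencies: f_k = f_min * 2^(k / B)."""
    k = np.arange(n_bins)
    return f_min * 2.0 ** (k / bins_per_octave)


def cqt_quality_factor(bins_per_octave):
    """Constant ratio of center frequency to bandwidth: Q = 1 / (2^(1/B) - 1)."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def stft_center_frequencies(sample_rate, n_fft):
    """Linearly spaced STFT bin frequencies from 0 Hz up to the Nyquist frequency."""
    return np.arange(n_fft // 2 + 1) * sample_rate / n_fft


# Illustrative values only (assumptions, not the paper's configuration).
cqt_freqs = cqt_center_frequencies(f_min=32.70, n_bins=84, bins_per_octave=12)
stft_freqs = stft_center_frequencies(sample_rate=44100, n_fft=2048)
q = cqt_quality_factor(bins_per_octave=12)
```

With these assumed settings, neighbouring CQT bins always differ by the same frequency ratio, so low frequencies get finer resolution in Hz than high ones, whereas neighbouring STFT bins always differ by the same number of Hz.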
Pages: 554-564
Page count: 11