POLYPHONIC SOUND EVENT DETECTION USING TRANSPOSED CONVOLUTIONAL RECURRENT NEURAL NETWORK

被引:0
|
作者
Chatterjee, Chandra Churh [1 ]
Mulimani, Manjunath [2 ]
Koolagudi, Shashidhar G. [3 ]
机构
[1] Jalpaiguri Govt Engn Coll, Dept CSE, Jalpaiguri 735102, India
[2] Manipal Acad Higher Educ, Manipal Inst Technol, Dept CSE, Manipal 576104, India
[3] Natl Inst Technol Karnataka, Dept CSE, Surathkal 575025, India
关键词
Sound Event Detection (SED); Deep Neural Networks (DNN); Convolution Neural Networks (CNN); Recurrent Neural Networks (RNN); Transposed CNN (TCNN); Instantaneous Frequency spectrogram (IFgram);
D O I
10.1109/icassp40776.2020.9054628
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
In this paper we propose a Transposed Convolutional Recurrent Neural Network (TCRNN) architecture for polyphonic sound event recognition. Transposed convolution layer, which caries out a regular convolution operation but reverts the spatial transformation and it is combined with a bidirectional Recurrent Neural Network (RNN) to get TCRNN. Instead of the traditional mel spectrogram features, the proposed methodology incorporates mel-IFgram (Instantaneous Frequency spectrogram) features. The performance of the proposed approach is evaluated on sound events of publicly available TUT-SED 2016 and Joint sound scene and polyphonic sound event recognition datasets. Results show that the proposed approach outperforms state-of-the-art methods.
引用
收藏
页码:661 / 665
页数:5
相关论文
共 50 条
  • [1] Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
    Cakir, Emre
    Parascandolo, Giambattista
    Heittola, Toni
    Huttunen, Heikki
    Virtanen, Tuomas
    [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (06) : 1291 - 1303
  • [2] SOUND EVENT DETECTION USING SPATIAL FEATURES AND CONVOLUTIONAL RECURRENT NEURAL NETWORK
    Adavanne, Sharath
    Pertila, Pasi
    Virtanen, Tuomas
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 771 - 775
  • [3] Polyphonic Sound Event Detection Based on Residual Convolutional Recurrent Neural Network With Semi-Supervised Loss Function
    Kim, Nam Kyun
    Kim, Hong Kook
    [J]. IEEE ACCESS, 2021, 9 (09): : 7564 - 7575
  • [4] Relational recurrent neural networks for polyphonic sound event detection
    Ma, Junbo
    Wang, Ruili
    Ji, Wanting
    Zheng, Hao
    Zhu, En
    Yin, Jianping
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (20) : 29509 - 29527
  • [5] Relational recurrent neural networks for polyphonic sound event detection
    Junbo Ma
    Ruili Wang
    Wanting Ji
    Hao Zheng
    En Zhu
    Jianping Yin
    [J]. Multimedia Tools and Applications, 2019, 78 : 29509 - 29527
  • [6] Polyphonic Sound Event Detection by Using Capsule Neural Networks
    Vesperini, Fabio
    Gabrielli, Leonardo
    Principi, Emanuele
    Squartini, Stefano
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2019, 13 (02) : 310 - 322
  • [7] RECURRENT NEURAL NETWORKS FOR POLYPHONIC SOUND EVENT DETECTION IN REAL LIFE RECORDINGS
    Parascandolo, Giambattista
    Huttunen, Heikki
    Virtanen, Tuomas
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 6440 - 6444
  • [8] Convolutional Neural Networks with Multi-task Loss for Polyphonic Sound Event Detection
    Liu, Huang
    Wang, Xiu
    Guan, Fa-Qian
    Hu, Jin-Sen
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND APPLICATION ENGINEERING (CSAE2018), 2018,
  • [9] Filterbank Learning for Deep Neural Network Based Polyphonic Sound Event Detection
    Cakir, Emre
    Ozan, Ezgi Can
    Virtanen, Tuomas
    [J]. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 3399 - 3406
  • [10] Sound Event Detection in Cowshed using Synthetic Data and Convolutional Neural Network
    Pandeya, Yagya Raj
    Bhattarai, Bhuwan
    Lee, Joonwhoan
    [J]. 11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 273 - 276