Acoustic Scene Recognition Based on Convolutional Neural Networks

Cited by: 0
Authors
Sun, Fengjiao [1 ]
Wang, Mingjiang [1 ]
Xu, Qihang [1 ]
Xuan, Xiaogung [1 ]
Zhang, Xin [1 ]
Affiliations
[1] Harbin Inst Technol, Elect & Informat Engn Coll, Shenzhen, Peoples R China
Keywords
Audio scene recognition; Log-mel spectrum; Convolutional neural network; Softmax
DOI
10.1109/siprocess.2019.8868402
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Audio scene recognition is the process of automatically determining the scene around a device by extracting features from scene audio signals. It is chiefly concerned with the perception and understanding of non-speech signals, and it can guide machines toward more intelligent decisions. To address this problem, this paper proposes an audio scene recognition method based on a convolutional neural network (CNN). First, the short-time Fourier transform and a Mel filter bank are used to transform the audio signal into a log-mel spectrum. Then, a CNN is trained on log-mel fragments to extract features. Finally, softmax is used to classify the CNN features. The method is tested on the IEEE DCASE 2018 data set. Experimental results show that the method achieves a high recognition rate.
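
The front end and the final classification step named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sample rate, FFT size, hop length, number of Mel bands, the HTK-style Mel formula, and all function names are assumptions chosen for illustration, and the CNN between the two steps is omitted.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style Mel scale (an assumption; the record does not specify one)
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(sr, n_fft, n_mels):
    # Triangular filters with centers spaced evenly on the Mel scale
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Short-time Fourier transform: Hann-windowed frames, then a real FFT
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop: t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Project the power spectrum onto the Mel filters, then take the log
    mel_energy = power @ mel_filter_bank(sr, n_fft, n_mels).T
    return np.log(mel_energy + 1e-10)  # small offset avoids log(0)

def softmax(logits):
    # Numerically stable softmax over the last axis
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)
```

For a one-second signal at 16 kHz, `log_mel_spectrogram` returns an array with one row per frame and one column per Mel band; fixed-length slices of that array would play the role of the log-mel fragments fed to the CNN, and `softmax` would turn the network's output scores into class probabilities.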
Pages: 122-126
Number of pages: 5
Related papers
50 in total
  • [11] Learning Scene Gist with Convolutional Neural Networks to Improve Object Recognition
    Wu, Kevin
    Wu, Eric
    Kreiman, Gabriel
    2018 52ND ANNUAL CONFERENCE ON INFORMATION SCIENCES AND SYSTEMS (CISS), 2018,
  • [12] Object-Scene Convolutional Neural Networks for Event Recognition in Images
    Wang, Limin
    Wang, Zhe
    Du, Wenbin
    Qiao, Yu
    2015 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2015,
  • [13] Acoustic-based LEGO recognition using attention-based convolutional neural networks
    Tran, Van-Thuan
    Wu, Chia-Yang
    Tsai, Wei-Ho
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (01)
  • [14] Acoustic-based LEGO recognition using attention-based convolutional neural networks
    Van-Thuan Tran
    Chia-Yang Wu
    Wei-Ho Tsai
    Artificial Intelligence Review, 2024, 57
  • [15] Acoustic Scene Classification Using Spatial Pyramid Pooling With Convolutional Neural Networks
    Basbug, Ahmet Melih
    Sert, Mustafa
    2019 13TH IEEE INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC), 2019, : 128 - 131
  • [16] The Receptive Field as a Regularizer in Deep Convolutional Neural Networks for Acoustic Scene Classification
    Koutini, Khaled
    Eghbal-zadeh, Hamid
    Dorfer, Matthias
    Widmer, Gerhard
    2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,
  • [17] GAIT RECOGNITION BASED ON CONVOLUTIONAL NEURAL NETWORKS
    Sokolova, A.
    Konushin, A.
    INTERNATIONAL WORKSHOP PHOTOGRAMMETRIC AND COMPUTER VISION TECHNIQUES FOR VIDEO SURVEILLANCE, BIOMETRICS AND BIOMEDICINE, 2017, 42-2 (W4): : 207 - 212
  • [18] Speech Recognition Based on Convolutional Neural Networks
    Du Guiming
    Wang Xia
    Wang Guangyan
    Zhang Yan
    Li Dan
    2016 IEEE INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP), 2016, : 708 - 711
  • [19] Dance Art Scene Classification Based on Convolutional Neural Networks
    Li, Le
    SCIENTIFIC PROGRAMMING, 2022, 2022
  • [20] SAR Target Recognition in Large Scene Images via Region-Based Convolutional Neural Networks
    Cui, Zongyong
    Dang, Sihang
    Cao, Zongjie
    Wang, Sifei
    Liu, Nengyuan
    REMOTE SENSING, 2018, 10 (05)