Audio Event Recognition Based On DBN Features From Multiple Filter-bank Representations

Authors
Guo, Feng [1]
Chen, Xiaoou [1]
Yang, Deshun [1]
Affiliation
[1] Peking Univ, Inst Comp Sci & Technol, 128 Zhongguancun North St, Beijing 100871, Peoples R China
DOI
Not available
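Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract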
In audio event classification and detection, the representation of the audio itself is important. Many researchers have applied Deep Belief Networks (DBNs) to learn new audio representations. The mel filter-bank feature, which is computed on the mel scale, is commonly used as the low-level audio representation in DBN pre-processing. However, the mel bands alone may not be sufficient to represent the diverse audio events found in the real world, which makes it difficult for a DBN to learn good audio features. In this paper, we take two steps to explore and tackle this problem. First, we compare how different arrangements of frequency bands (mel bands, bark bands, linear bands and pyramid bands) affect DBN feature learning for audio event recognition. Second, to exploit the different classification capabilities of the resulting DBN features on different audio events, we adopt the Adaboost algorithm to fuse them. We conduct experiments on real datasets collected from the findsound website, and the results verify that our proposed audio event classification system, which uses diverse features selected by Adaboost from all sets of DBN features, outperforms a system that uses only one kind of DBN feature set.
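To make the idea of different frequency-band arrangements concrete, the following is a minimal sketch, not the authors' implementation. It computes band-edge frequencies that are equally spaced on a linear, mel, or bark scale. The mel conversion uses the common 2595·log10(1 + f/700) formula and the bark conversion uses Traunmüller's approximation; both are assumptions, since the abstract does not say which formulas the paper uses. The function names, the band count, and the frequency range are hypothetical, and pyramid bands are omitted because the abstract does not define them.

```python
import math

# Assumed conversion formulas (not stated in the abstract).
def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def hz_to_bark(f):
    # Traunmueller's approximation of the bark scale.
    return 26.81 * f / (1960.0 + f) - 0.53

def bark_to_hz(z):
    # Algebraic inverse of hz_to_bark.
    return 1960.0 * (z + 0.53) / (26.28 - z)

# Each scale is a (forward, inverse) pair of warping functions.
SCALES = {
    "linear": (lambda f: f, lambda x: x),
    "mel": (hz_to_mel, mel_to_hz),
    "bark": (hz_to_bark, bark_to_hz),
}

def band_edges(scale, n_bands, f_min, f_max):
    """Return n_bands + 2 edge frequencies in Hz, equally spaced on the
    chosen scale. With triangular filters, band i would span
    edges[i] .. edges[i + 2]."""
    forward, inverse = SCALES[scale]
    lo, hi = forward(f_min), forward(f_max)
    step = (hi - lo) / (n_bands + 1)
    return [inverse(lo + i * step) for i in range(n_bands + 2)]

if __name__ == "__main__":
    # Hypothetical settings: 8 bands between 0 Hz and 8 kHz.
    for name in SCALES:
        edges = band_edges(name, 8, 0.0, 8000.0)
        print(name, [round(e) for e in edges])
```

Running the script prints one list of edges per scale. The mel and bark edges are packed more densely at low frequencies than the linear ones, which is the kind of difference in spectral resolution that the comparison in the abstract is concerned with.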
Pages: 6