Audio Event Recognition Based On DBN Features From Multiple Filter-bank Representations

Cited by: 0

Authors:

Guo, Feng ^{[1]}

Chen, Xiaoou ^{[1]}

Yang, Deshun ^{[1]}

Affiliation:

[1] Peking Univ, Inst Comp Sci & Technol, 128 Zhongguancun North St, Beijing 100871, Peoples R China

Source:

2015 IEEE 17TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2015

Keywords:

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

In audio event classification and detection, the representation of the audio signal is important. Many researchers have applied Deep Belief Networks (DBNs) to learn new audio representations. The mel filter-bank feature, which is computed on the mel scale, is commonly used as the low-level audio representation in DBN pre-processing. However, the mel bands alone may not be sufficient to represent the diverse audio events found in the real world, which in turn makes it difficult for a DBN to learn good audio features. In this paper, we take two steps to explore and tackle this problem. First, we compare the effects of different frequency-band arrangements on DBN feature learning for audio event recognition; the arrangements considered are mel bands, bark bands, linear bands and pyramid bands. Second, to exploit the different classification capabilities of the resulting DBN features on different audio events, we fuse them with the Adaboost algorithm. We conduct experiments on real datasets collected from the findsound website, and the results verify that our proposed audio event classification system, which uses diverse features selected by Adaboost from all sets of DBN features, outperforms a system that uses only one kind of DBN feature set.
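
To make the idea of "different arrangements of frequency bands" concrete, the following is a minimal Python sketch, not the authors' implementation. It builds triangular filter banks whose band edges are spaced uniformly on either the mel scale or the linear scale, then computes log band energies of the kind typically fed to a DBN. The function names, the mel formula (the common 2595 * log10(1 + f/700) variant), the number of bands and the sample rate are all assumptions made for illustration; the abstract does not state these settings, and the bark and pyramid arrangements are not covered here.

```python
import math


def hz_to_mel(f_hz):
    # Common mel-scale formula; assumed here, the paper's exact variant is not stated.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def band_edges(n_bands, f_min, f_max, scale):
    """Return n_bands + 2 edge frequencies (Hz), uniformly spaced on the given scale."""
    if scale == "mel":
        lo, hi, to_hz = hz_to_mel(f_min), hz_to_mel(f_max), mel_to_hz
    elif scale == "linear":
        lo, hi, to_hz = f_min, f_max, float
    else:
        raise ValueError(f"unsupported scale: {scale}")
    step = (hi - lo) / (n_bands + 1)
    return [to_hz(lo + i * step) for i in range(n_bands + 2)]


def triangular_filter_bank(edges_hz, n_bins, sample_rate):
    """Build one triangular filter per band over n_bins spectrum bins (0 Hz to Nyquist)."""
    nyquist = sample_rate / 2.0
    freqs = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for lo, mid, hi in zip(edges_hz, edges_hz[1:], edges_hz[2:]):
        # Rising slope from lo to mid, falling slope from mid to hi, zero elsewhere.
        row = [max(0.0, min((f - lo) / (mid - lo), (hi - f) / (hi - mid)))
               for f in freqs]
        bank.append(row)
    return bank


def log_band_energies(power_spectrum, bank, eps=1e-10):
    """Weight a power spectrum by each filter and take the log of the band energy."""
    return [math.log(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
            for row in bank]


# Hypothetical settings: 8 bands, 16 kHz audio, 257 spectrum bins (512-point FFT).
mel_edges = band_edges(8, 0.0, 8000.0, "mel")
linear_edges = band_edges(8, 0.0, 8000.0, "linear")
mel_bank = triangular_filter_bank(mel_edges, 257, 16000.0)
linear_bank = triangular_filter_bank(linear_edges, 257, 16000.0)
flat_spectrum = [1.0] * 257
mel_features = log_band_energies(flat_spectrum, mel_bank)
linear_features = log_band_energies(flat_spectrum, linear_bank)
```

With these settings the mel arrangement places narrower bands at low frequencies than the linear one, so the two banks give different views of the same spectrum. In the setup the abstract describes, each arrangement would yield its own DBN feature set, and a boosting step such as Adaboost would then select among them.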

Pages: 6
