COMPACT BILINEAR DEEP FEATURES FOR ENVIRONMENTAL SOUND RECOGNITION

Cited by: 0
Authors
Demir, Fatih [1 ]
Sengur, Abdulkadir [1 ]
Lu, Hao [2 ]
Amiriparian, Shahin [3 ,4 ]
Cummins, Nicholas [3 ]
Schuller, Bjoern [3 ,5 ]
Affiliations
[1] Firat Univ, Technol Elazig, Elect & Elect Engn Dept, Elazig, Turkey
[2] Huazhong Univ Sci & Technol, Natl Key Lab Sci & Technol Multispectral Informat, Sch Automat, Wuhan 430074, Hubei, Peoples R China
[3] Univ Augsburg, ZD B Chair Embedded Intelligence Hlth Care & Well, Augsburg, Germany
[4] Tech Univ Munich, Machine Intelligence & Signal Proc Grp, Munich, Germany
[5] Imperial Coll London, GLAM, London, England
Funding
EU Horizon 2020;
Keywords
Environmental sound classification; deep spectrum features; convolutional neural networks; compact bilinear pooling; CLASSIFICATION;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Environmental sound recognition (ESR) has a wide range of civilian and military applications. Existing ESR methods generally tackle this problem by employing various signal processing and machine learning methods. Herein, we present an ESR paradigm based on feature extraction from pre-trained deep convolutional neural networks (CNNs) and the derivation of higher-order statistics by compact bilinear pooling and normalisation. In particular, we consider two deep ImageNet architectures for deep feature extraction, and the Random Maclaurin (RM) approach to produce the compact bilinear features. A support vector machine (SVM) with homogeneous mapping is used in the classification stage. Two publicly available environmental sound datasets, namely ESC-50 and ESC-10, are used to verify the efficacy of the approach. We compare the proposed method with various previous state-of-the-art methods. The presented results indicate the suitability of the higher-order statistics of DEEP SPECTRUM representations for ESR tasks.
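
The compact bilinear step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the commonly used Random Maclaurin formulation, in which two random sign matrices project each local CNN descriptor and the projections are multiplied element-wise, and it assumes that "normalisation" means signed square root followed by L2 normalisation. The function names, the 49 x 512 feature-map shape, and the 1024-dimensional output are hypothetical choices for illustration only.

```python
import numpy as np


def random_maclaurin_projections(channels, out_dim, seed=0):
    """Draw two random +/-1 projection matrices (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    w1 = rng.choice([-1.0, 1.0], size=(out_dim, channels))
    w2 = rng.choice([-1.0, 1.0], size=(out_dim, channels))
    return w1, w2


def compact_bilinear_rm(feature_map, w1, w2):
    """Pool a (locations, channels) feature map into one compact descriptor.

    Assumption: signed square root and L2 normalisation stand in for the
    normalisation step, which the abstract does not specify.
    """
    out_dim = w1.shape[0]
    # Random Maclaurin feature for each location: (W1 x) * (W2 x) / sqrt(d)
    projected = (feature_map @ w1.T) * (feature_map @ w2.T) / np.sqrt(out_dim)
    # Sum-pool over all spatial locations
    pooled = projected.sum(axis=0)
    # Signed square root, then L2 normalisation (assumed)
    pooled = np.sign(pooled) * np.sqrt(np.abs(pooled))
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


# Usage with a synthetic feature map standing in for CNN activations
rng = np.random.default_rng(1)
feature_map = rng.standard_normal((49, 512))
w1, w2 = random_maclaurin_projections(channels=512, out_dim=1024)
descriptor = compact_bilinear_rm(feature_map, w1, w2)
```

The resulting fixed-length descriptor is the kind of vector that could then be passed to a classifier such as the SVM named in the abstract; the homogeneous mapping used there is not shown here.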
Pages: 5
Related papers
50 in total
  • [1] Environmental Sound Classification Method Based on Compact Bilinear Attention Network
    Dong, Shaojiang
    Xia, Zhengfu
    Cai, Weiwei
    [J]. Beijing Youdian Daxue Xuebao/Journal of Beijing University of Posts and Telecommunications, 2023, 46 (06): : 102 - 107
  • [2] Environmental sound classification based on improved compact bilinear attention network
    Dong, Shaojiang
    Xia, Zhengfu
    Pan, Xuejiao
    Yu, Tengwei
    [J]. DIGITAL SIGNAL PROCESSING, 2023, 141
  • [3] Environmental sound recognition with CELP-based features
    Tsau, EnShuo
    Kim, Seung-Hwan
    Kuo, C-C Jay
    [J]. 2011 10TH INTERNATIONAL SYMPOSIUM ON SIGNALS, CIRCUITS AND SYSTEMS (ISSCS), 2011,
  • [4] RECURRENCE QUANTIFICATION ANALYSIS FEATURES FOR ENVIRONMENTAL SOUND RECOGNITION
    Roma, Gerard
    Nogueira, Waldo
    Herrera, Perfecto
    [J]. 2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2013,
  • [5] Environmental Sound Recognition With Time-Frequency Audio Features
    Chu, Selina
    Narayanan, Shrikanth
    Kuo, C. -C. Jay
    [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (06): : 1142 - 1158
  • [6] Environmental sound recognition using MP-BASED features
    Chu, Selina
    Narayanan, Shrikanth
    Kuo, C. -C. Jay
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 1 - +
  • [7] Fusion of acoustic and deep features for pig cough sound recognition
    Shen, Weizheng
    Ji, Nan
    Yin, Yanling
    Dai, Baisheng
    Tu, Ding
    Sun, Baihui
    Hou, Handan
    Kou, Shengli
    Zhao, Yize
    [J]. COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2022, 197
  • [8] The Application and Improvement of Deep Neural Networks in Environmental Sound Recognition
    Lin, Yu-Kai
    Su, Mu-Chun
    Hsieh, Yi-Zeng
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (17):
  • [9] SOUND TRANSMISSION IN A CHANNEL WITH BILINEAR SOUND SPEED AND ENVIRONMENTAL VARIATIONS
    BAER, RN
    JACOBSON, MJ
    [J]. JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1973, 54 (01): : 80 - 91
  • [10] DEEP BOTTLENECK FEATURES AND SOUND-DEPENDENT I-VECTORS FOR SIMULTANEOUS RECOGNITION OF SPEECH AND ENVIRONMENTAL SOUNDS
    Sakti, Sakriani
    Kawanishi, Seiji
    Neubig, Graham
    Yoshino, Koichiro
    Nakamura, Satoshi
    [J]. 2016 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2016), 2016, : 35 - 42