Robust Environmental Sound Recognition With Sparse Key-Point Encoding and Efficient Multispike Learning

Cited by: 14
Authors
Yu, Qiang [1]
Yao, Yanli [1]
Wang, Longbiao [1]
Tang, Huajin [2]
Dang, Jianwu [1]
Tan, Kay Chen [3]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin Key Lab Cognit Comp & Applicat, Tianjin 300350, Peoples R China
[2] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 610065, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Encoding; Task analysis; Hidden Markov models; Neurons; Biological neural networks; Mel frequency cepstral coefficient; Biological information theory; Brain-like processing; feature extraction; multispike learning; neuromorphic computing; robust sound recognition; spike encoding; spiking neural networks (SNNs); AUTOMATIC SPEECH RECOGNITION; EVENT CLASSIFICATION; FEATURES; NETWORKS; NEURON;
DOI
10.1109/TNNLS.2020.2978764
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The capability for environmental sound recognition (ESR) can determine the fitness of individuals in a way to avoid dangers or pursue opportunities when critical sound events occur. It still remains mysterious about the fundamental principles of biological systems that result in such a remarkable ability. Additionally, the practical importance of ESR has attracted an increasing amount of research attention, but the chaotic and nonstationary difficulties continue to make it a challenging task. In this article, we propose a spike-based framework from a more brain-like perspective for the ESR task. Our framework is a unifying system with consistent integration of three major functional parts which are sparse encoding, efficient learning, and robust readout. We first introduce a simple sparse encoding, where key points are used for feature representation, and demonstrate its generalization to both spike- and nonspike-based systems. Then, we evaluate the learning properties of different learning rules in detail with our contributions being added for improvements. Our results highlight the advantages of multispike learning, providing a selection reference for various spike-based developments. Finally, we combine the multispike readout with the other parts to form a system for ESR. Experimental results show that our framework performs the best as compared to other baseline approaches. In addition, we show that our spike-based framework has several advantageous characteristics including early decision making, small dataset acquiring, and ongoing dynamic processing. Our framework is the first attempt to apply the multispike characteristic of nervous neurons to ESR. The outstanding performance of our approach would potentially contribute to draw more research efforts to push the boundaries of spike-based paradigm to a new horizon.
Pages: 625-638
Number of pages: 14
Related papers
50 in total
  • [1] Temporal Encoding and Multispike Learning Framework for Efficient Recognition of Visual Patterns
    Yu, Qiang
    Song, Shiming
    Ma, Chenxiang
    Wei, Jianguo
    Chen, Shengyong
    Tan, Kay Chen
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3387 - 3399
  • [2] SCK: A SPARSE CODING BASED KEY-POINT DETECTOR
    Thanh Hong-Phuoc
    He, Yifeng
    Guan, Ling
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 3768 - 3772
  • [3] A Novel Key-Point Detector Based on Sparse Coding
    Thanh Hong-Phuoc
    Guan, Ling
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 747 - 756
  • [4] A Scale and Rotational Invariant Key-point Detector based on Sparse Coding
    Thanh Phuoc Hong
    Guan, Ling
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2021, 12 (03)
  • [5] Key-point Detection based Fast CU Decision for HEVC Intra Encoding
    Xu, Zhe
    Min, Biao
    Cheung, Ray C. C.
    INTERNATIONAL JOURNAL OF ELECTRONICS AND TELECOMMUNICATIONS, 2018, 64 (03) : 321 - 327
  • [6] Spike-based encoding and learning of spectrum features for robust sound recognition
    Xiao, Rong
    Tang, Huajin
    Gu, Pengjie
    Xu, Xiaoliang
    NEUROCOMPUTING, 2018, 313 : 65 - 73
  • [7] Approximated Scale Space for Efficient and Accurate SIFT Key-Point Detection
    Wang, Ying
    Liu, Yiguang
    Xu, Zhenyu
    Zheng, Yunan
    Hong, Weijie
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON SENSING AND IMAGING, 2018, 2019, 606 : 31 - 40
  • [8] Modified SIFT descriptor and key-point matching for fast and robust image mosaic
    何玉青
    王雪
    王思远
    刘明奇
    诸加丹
    金伟其
    Journal of Beijing Institute of Technology, 2016, 25 (04) : 562 - 570
  • [9] Robust descriptor for key-point detection and matching in color images with radial distortion
    Zou, Zesen
    Wang, Rui
    Zou, Jialing
    Huang, Ran
    JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (02)
  • [10] The Advisable Technology of Key-Point Detection and Expression Recognition for an Intelligent Class System
    Zhao, Yusheng
    Yan, Haotian
    Wang, Zhaoqing
    2018 INTERNATIONAL SYMPOSIUM ON POWER ELECTRONICS AND CONTROL ENGINEERING (ISPECE 2018), 2019, 1187