Sparse Representation with Temporal Max-Smoothing for Acoustic Event Detection

Authors
Lu, Xugang [1]
Shen, Peng [1]
Tsao, Yu [2]
Hori, Chiori [1]
Kawai, Hisashi [1]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
[2] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei, Taiwan
Keywords
Feature learning; matching pursuit; temporal max-smoothing; acoustic event detection
DOI
Not available
Chinese Library Classification
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
To incorporate long time-frequency structure into acoustic event detection, we previously proposed a spectral-patch-based learning and representation method. The learned spectral patches were treated as acoustic words and used in sparse encoding for acoustic feature representation and modeling. In that previous study, each spectral patch was encoded independently during the feature encoding stage. Because neighboring spectral patches taken from a time sequence should keep similar representations after encoding, in this study we propose to enhance the temporal correlation of the feature representation with a temporal max-smoothing algorithm. Max-smoothing selects the maximum response within a local time window as the representative feature for the detection task. We tested the new feature on automatic detection of acoustic events selected from lecture audio data. Experimental results showed that temporal max-smoothing significantly improved detection performance.
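
The max-smoothing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `temporal_max_smoothing`, the `half_window` parameter, the symmetric window, the truncation of the window at sequence boundaries, and the assumption that the encoded responses are non-negative (for example, coefficient magnitudes) are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def temporal_max_smoothing(codes, half_window):
    """Replace each frame's code with the element-wise maximum over a local window.

    codes: array of shape (T, K), one K-dimensional sparse code per time step
           (assumed non-negative responses; this is an assumption of the sketch).
    half_window: number of neighboring frames on each side (hypothetical parameter).
    """
    codes = np.asarray(codes, dtype=float)
    if codes.ndim != 2:
        raise ValueError("codes must have shape (T, K)")
    if half_window < 0:
        raise ValueError("half_window must be non-negative")

    num_frames = codes.shape[0]
    smoothed = np.empty_like(codes)
    for t in range(num_frames):
        # The window is truncated at the start and end of the sequence.
        start = max(0, t - half_window)
        stop = min(num_frames, t + half_window + 1)
        smoothed[t] = codes[start:stop].max(axis=0)
    return smoothed


# Toy example: 5 time steps, 2 acoustic words, independently encoded patches.
codes = np.array([
    [0.0, 0.2],
    [0.9, 0.0],
    [0.0, 0.0],
    [0.0, 0.7],
    [0.1, 0.0],
])
smoothed = temporal_max_smoothing(codes, half_window=1)
```

In this toy input, the strong response of the first acoustic word at time step 1 is propagated to its immediate neighbors, so adjacent frames end up with more similar representations than the independently encoded codes had.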
Pages: 1176-1180
Page count: 5