Enhancing Feature Representation for Anomaly Detection via Local-and-Global Temporal Relations and a Multi-stage Memory

Authors
Li, Xuan [1]
Ma, Ding [1]
Wu, Xiangqian [1]
Affiliations
[1] Harbin Inst Technol, Fac Comp, Harbin, Peoples R China
Funding
Natural Science Foundation of Heilongjiang Province
Keywords
Video anomaly detection; Weak supervision; Feature representation enhancing; Temporal relations; Multi-stage memory
DOI
10.1007/978-981-99-8537-1_10
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly supervised video anomaly detection is a challenging task because frame-level labels are not accessible at training time. Tackling this task effectively requires models to learn discriminative feature representations. To address this challenge, we propose a multi-stage memory-augmented feature discrimination learning (MMFDL) method. The first stage obtains preliminary abnormal probabilities for clip features. In the second stage, an easy normal pattern memory (ENPM) is proposed to store normal patterns with low abnormal probabilities. In the last stage, we pull clip features with high abnormal probabilities in normal videos close to the ENPM and push them away from clip features with high abnormal probabilities in abnormal videos, so that models learn more discriminative features for anomaly detection. Furthermore, we propose a local-and-global temporal relations modeling (LGTRM) module to enhance clip features by aggregating local and global contexts. The LGTRM module consists of two subnetworks: DW-Net and TF-Net. DW-Net integrates the current clip feature with its adjacent clip features to capture local-range temporal dependencies. TF-Net uses the multi-head self-attention mechanism of the transformer to capture global-range temporal dependencies. Experiments on two datasets demonstrate that our method outperforms state-of-the-art approaches. The code is available at https://github.com/xuanli01/PRCV347.
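The local-and-global idea in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation (that is in the repository linked above). The function names `local_aggregate`, `global_attend`, and `enhance` are hypothetical. A uniform moving average stands in for whatever learned local operator DW-Net uses, a single attention head without learned projections stands in for TF-Net's multi-head transformer, and the additive fusion at the end is an assumption rather than something the abstract states.

```python
import math


def local_aggregate(clips, radius=1):
    """Average each clip feature with its neighbours within `radius`.

    Stand-in for a local temporal operator: only adjacent clips contribute.
    """
    num_clips = len(clips)
    out = []
    for t in range(num_clips):
        lo, hi = max(0, t - radius), min(num_clips, t + radius + 1)
        window = clips[lo:hi]
        # Column-wise mean over the clips inside the window.
        out.append([sum(col) / len(window) for col in zip(*window)])
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_attend(clips):
    """Single-head scaled dot-product self-attention with no learned weights.

    Every clip attends to every other clip, so the context is global.
    """
    dim = len(clips[0])
    out = []
    for query in clips:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in clips
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * value[j] for w, value in zip(weights, clips))
             for j in range(dim)]
        )
    return out


def enhance(clips, radius=1):
    """Fuse original, local, and global features by addition (assumed fusion)."""
    local = local_aggregate(clips, radius)
    global_ctx = global_attend(clips)
    return [
        [c + l + g for c, l, g in zip(clip_row, local_row, global_row)]
        for clip_row, local_row, global_row in zip(clips, local, global_ctx)
    ]


if __name__ == "__main__":
    # Three clips with 2-dimensional features (toy values).
    toy_clips = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in enhance(toy_clips):
        print(row)
```

In this sketch the local branch sees at most `2 * radius + 1` clips, while the attention branch weighs all clips in the video, which mirrors the local-range versus global-range split described for DW-Net and TF-Net.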
Pages: 121-133
Page count: 13