Decouple and Resolve: Transformer-Based Models for Online Anomaly Detection From Weakly Labeled Videos

Cited by: 10
Authors
Liu, Tianshan [1 ]
Zhang, Cong [1 ]
Lam, Kin-Man [1 ,2 ]
Kong, Jun [3 ]
Affiliations
[1] Hong Kong Polytech Univ, Dept Elect & Informat Engn, Hong Kong, Peoples R China
[2] Ctr Adv Reliabil & Safety, Hong Kong, Peoples R China
[3] Jiangnan Univ, Minist Educ, Key Lab Adv Proc Control Light Ind, Wuxi 214122, Peoples R China
Keywords
Videos; Task analysis; Transformers; Anomaly detection; Proposals; Training; Annotations; Online video anomaly detection; weakly supervised learning; multi-task learning; long-short-term context
DOI
10.1109/TIFS.2022.3216479
CLC Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
As one of the vital topics in intelligent surveillance, weakly supervised online video anomaly detection (WS-OVAD) aims to identify ongoing anomalous events moment by moment in streaming videos, using models trained with only video-level annotations. Previous studies tended to adopt a unified single-stage framework, which struggles to address the online constraint and the weakly supervised setting simultaneously. To resolve this dilemma, this paper proposes a two-stage framework, namely "decouple and resolve" (DAR), which consists of two modules: a temporal proposal producer (TPP) and an online anomaly localizer (OAL). Supervised by video-level binary labels, the TPP module aims to fully exploit hierarchical temporal relations among snippets to generate precise snippet-level pseudo-labels. Then, given the fine-grained supervisory signals produced by the TPP, the Transformer-based OAL module is trained to aggregate both useful cues retrieved from historical observations and anticipated future semantics to make predictions at the current time step. The TPP and OAL modules are jointly trained to share beneficial knowledge in a multi-task learning paradigm. Extensive experimental results on three public datasets validate the superior performance of the proposed DAR framework over competing methods.
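To make the two-stage design concrete, below is a minimal PyTorch-style sketch of a DAR-like pipeline; it is not the authors' implementation. The snippet feature dimension, the Transformer configurations, the top-k pooling used to obtain a video-level score, the causal-window form of the OAL, and the use of TPP scores as fixed pseudo-label targets are all illustrative assumptions, and the future-anticipation branch described in the abstract is omitted for brevity.

# Illustrative sketch only (assumptions noted above); assumes pre-extracted
# snippet features, e.g., from an I3D backbone.
import torch
import torch.nn as nn


class TPP(nn.Module):
    """Temporal proposal producer: scores all snippets of a full video,
    supervised only by the video-level binary label (weak supervision)."""

    def __init__(self, dim: int):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.scorer = nn.Linear(dim, 1)

    def forward(self, feats):                     # feats: (B, T, D)
        h = self.encoder(feats)                   # temporal relations among snippets
        scores = torch.sigmoid(self.scorer(h)).squeeze(-1)   # (B, T) snippet scores
        k = max(1, scores.size(1) // 8)           # top-k mean pooling -> video-level score
        video_score = scores.topk(k, dim=1).values.mean(dim=1)
        return scores, video_score                # scores act as snippet-level pseudo-labels


class OAL(nn.Module):
    """Online anomaly localizer: predicts the score of the current snippet
    from a short causal window of past observations (streaming setting)."""

    def __init__(self, dim: int):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 1)

    def forward(self, window_feats):              # (B, W, D): the past W snippets only
        h = self.encoder(window_feats)
        return torch.sigmoid(self.head(h[:, -1])).squeeze(-1)   # score at current step


if __name__ == "__main__":
    B, T, D, W = 2, 64, 512, 16
    feats = torch.randn(B, T, D)                  # pre-extracted snippet features
    video_labels = torch.tensor([1.0, 0.0])       # weak labels: anomalous / normal video

    tpp, oal = TPP(D), OAL(D)
    opt = torch.optim.Adam(list(tpp.parameters()) + list(oal.parameters()), lr=1e-4)
    bce = nn.BCELoss()

    opt.zero_grad()
    pseudo, video_score = tpp(feats)
    loss_tpp = bce(video_score, video_labels)     # stage 1: video-level (weak) supervision

    # Stage 2: the OAL is supervised by TPP snippet-level pseudo-labels on causal windows.
    t = torch.randint(W, T, (1,)).item()          # a random "current" time step
    pred = oal(feats[:, t - W:t])                 # uses only past context
    loss_oal = bce(pred, pseudo[:, t - 1].detach())   # pseudo-label treated as a fixed target

    (loss_tpp + loss_oal).backward()              # joint multi-task update
    opt.step()

In this sketch the joint update is a single summed loss over both modules; the actual multi-task training schedule, losses, and the hierarchical temporal modeling of the TPP are described in the paper itself.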
Pages: 15 - 28
Page count: 14