Enhancing Video Anomaly Detection Using a Transformer Spatiotemporal Attention Unsupervised Framework for Large Datasets

Cited by: 2
Authors
Habeb, Mohamed H. [1]
Salama, May [1]
Elrefaei, Lamiaa A. [1]
Affiliations
[1] Benha Univ, Fac Engn Shoubra, Elect Engn Dept, Cairo 11629, Egypt
Keywords
video anomaly detection; unsupervised learning; spatiotemporal modeling; large datasets; LOCALIZATION; RECOGNITION; HISTOGRAMS; EXTRACTION
DOI
10.3390/a17070286
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work introduces an unsupervised framework for video anomaly detection built on a hybrid deep learning model that combines a vision transformer (ViT) with a convolutional spatiotemporal relationship (STR) attention block. The proposed model addresses the challenges of anomaly detection in video surveillance by capturing both local and global relationships within video frames, a task that traditional convolutional neural networks (CNNs) often struggle with because of their localized field of view. A pre-trained ViT serves as the encoder for feature extraction, and its output is processed by the STR attention block to enhance the detection of spatiotemporal relationships among objects in videos. The novelty of this work lies in combining the ViT with STR attention to detect video anomalies effectively in large and heterogeneous datasets, an important capability given the diverse environments and scenarios encountered in real-world surveillance. The framework was evaluated on three benchmark datasets: UCSD-Ped2, CUHK Avenue, and ShanghaiTech. It achieved area under the receiver operating characteristic curve (AUC ROC) values of 95.6, 86.8, and 82.1, respectively, outperforming state-of-the-art methods and showing its potential to significantly enhance automated video surveillance systems. To show the effectiveness of the proposed framework on extra-large datasets, the model was trained on a subset of the large contemporary CHAD dataset that contains over 1 million frames, achieving AUC ROC values of 71.8 and 64.2 for CHAD-Cam 1 and CHAD-Cam 2, respectively, which outperforms state-of-the-art techniques.
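The AUC ROC figures quoted in the abstract summarize how well per-frame anomaly scores separate anomalous frames from normal ones. The sketch below illustrates the metric only, not the authors' evaluation code: the function name `roc_auc`, the labels, and the scores are all hypothetical, and the abstract does not describe how the framework derives its scores.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for an anomalous frame, 0 for a normal frame.
    scores: anomaly score per frame; higher means more anomalous.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous frame")

    # AUC equals the probability that a random anomalous frame scores
    # higher than a random normal frame; ties count as half a win.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and scores for eight frames (illustration only).
labels = [0, 0, 0, 1, 1, 0, 1, 0]
scores = [0.10, 0.20, 0.15, 0.80, 0.45, 0.50, 0.90, 0.05]

auc = roc_auc(labels, scores)
print(f"frame-level AUC ROC: {100 * auc:.1f}")
```

The pairwise form is quadratic in the number of frames, so it suits a small illustration; for datasets of the size discussed here, a sort-based or library implementation would be the practical choice.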
Pages: 31
Related papers
50 in total
  • [41] MTAD: Multiobjective Transformer Network for Unsupervised Multisensor Anomaly Detection
    Belay, Mohammed Ayalew
    Rasheed, Adil
    Rossi, Pierluigi Salvo
    IEEE SENSORS JOURNAL, 2024, 24 (12) : 20254 - 20265
  • [42] Cluster Attention Contrast for Video Anomaly Detection
    Wang, Ziming
    Zou, Yuexian
    Zhang, Zeming
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2463 - 2471
  • [43] Video Anomaly Detection Based on Attention Mechanism
    Zhang, Qianqian
    Wei, Hongyang
    Chen, Jiaying
    Du, Xusheng
    Yu, Jiong
    SYMMETRY-BASEL, 2023, 15 (02):
  • [44] Video anomaly detection using deep residual-spatiotemporal translation network
    Ganokratanaa, Thittaporn
    Aramvith, Supavadee
    Sebe, Nicu
    PATTERN RECOGNITION LETTERS, 2022, 155 : 143 - 150
  • [45] SwinAnomaly: Real-Time Video Anomaly Detection Using Video Swin Transformer and SORT
    Bajgoti, Arpit
    Gupta, Rishik
    Balaji, Prasanalakshmi
    Dwivedi, Rinky
    Siwach, Meena
    Gupta, Deepak
    IEEE ACCESS, 2023, 11 : 111093 - 111105
  • [46] On Novel System for Detection Video Impairments Using Unsupervised Machine Learning Anomaly Detection Technique
    Goran, Nermin
    Begovic, Alen
    Colakovic, Alem
    TEM JOURNAL-TECHNOLOGY EDUCATION MANAGEMENT INFORMATICS, 2023, 12 (04): : 1995 - 2005
  • [48] Predicting skeleton trajectories using a Skeleton-Transformer for video anomaly detection
    Pang, Wenfeng
    He, Qianhua
    Li, Yanxiong
    MULTIMEDIA SYSTEMS, 2022, 28 (04) : 1481 - 1494
  • [49] ENAD: An Ensemble Framework for Unsupervised Network Anomaly Detection
    Liao, Jingyi
    Teo, Sin G.
    Kundu, Partha Pratim
    Tram Truong-Huu
    PROCEEDINGS OF THE 2021 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE (IEEE CSR), 2021, : 81 - 88
  • [50] ADSAD: An unsupervised attention-based discrete sequence anomaly detection framework for network security analysis
    Qin, Zhi-Quan
    Ma, Xing-Kong
    Wang, Yong-Jun
    COMPUTERS & SECURITY, 2020, 99