Enhancing Video Anomaly Detection Using a Transformer Spatiotemporal Attention Unsupervised Framework for Large Datasets

Cited by: 2
Authors
Habeb, Mohamed H. [1]
Salama, May [1]
Elrefaei, Lamiaa A. [1]
Affiliations
[1] Benha Univ, Fac Engn Shoubra, Elect Engn Dept, Cairo 11629, Egypt
Keywords
video anomaly detection; unsupervised learning; spatiotemporal modeling; large datasets; LOCALIZATION; RECOGNITION; HISTOGRAMS; EXTRACTION
DOI
10.3390/a17070286
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This work introduces an unsupervised framework for video anomaly detection that leverages a hybrid deep learning model combining a vision transformer (ViT) with a convolutional spatiotemporal relationship (STR) attention block. The proposed model addresses the challenges of anomaly detection in video surveillance by capturing both local and global relationships within video frames, a task that traditional convolutional neural networks (CNNs) often struggle with because of their localized field of view. A pre-trained ViT serves as the encoder for feature extraction, and its output features are processed by the STR attention block to enhance the detection of spatiotemporal relationships among objects in videos. The novelty of this work lies in using the ViT with the STR attention to detect video anomalies effectively in large and heterogeneous datasets, an important capability given the diverse environments and scenarios encountered in real-world surveillance. The framework was evaluated on three benchmark datasets: UCSD-Ped2, CUHK Avenue, and ShanghaiTech. It achieved area under the receiver operating characteristic curve (AUC ROC) values of 95.6, 86.8, and 82.1, respectively, outperforming state-of-the-art methods and showing its potential to significantly enhance automated video surveillance systems. To show the effectiveness of the proposed framework in detecting anomalies in extra-large datasets, we trained the model on a subset of the large, contemporary CHAD dataset that contains over 1 million frames, achieving AUC ROC values of 71.8 and 64.2 for CHAD-Cam 1 and CHAD-Cam 2, respectively, which outperforms state-of-the-art techniques.
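The abstract reports every result as an AUC ROC value. As a minimal sketch of how that metric can be computed from per-frame anomaly scores, the Python below uses the rank-sum (Mann-Whitney) identity: the AUC equals the probability that a randomly chosen anomalous frame scores higher than a randomly chosen normal frame. The function name `auc_roc`, the labels, and the scores are hypothetical illustrations, not data or code from the paper, and the paper's own evaluation protocol may differ.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    labels: 1 for anomalous frames, 0 for normal frames.
    scores: per-frame anomaly scores, higher means more anomalous.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous frame")
    # Count (anomalous, normal) pairs where the anomalous frame scores higher;
    # a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical ground-truth labels and anomaly scores for eight frames.
labels = [0, 0, 0, 1, 1, 0, 1, 0]
scores = [0.10, 0.20, 0.15, 0.90, 0.60, 0.70, 0.80, 0.05]

# Scaled by 100 to match the way the abstract states its values.
print(f"AUC ROC = {100 * auc_roc(labels, scores):.1f}")
```

The pairwise form is quadratic in the number of frames, so it suits small illustrations only; for datasets with millions of frames a sort-based implementation would be the practical choice.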
Pages: 31