Predicting skeleton trajectories using a Skeleton-Transformer for video anomaly detection

Authors
Wenfeng Pang
Qianhua He
Yanxiong Li
Affiliation
[1] South China University of Technology, School of Electronic and Information Engineering
Source
Multimedia Systems | 2022 / Vol. 28
Keywords
Anomaly detection; Skeleton trajectory prediction; Skeleton-Transformer; Multi-head self-attention; Temporal convolutional layer
DOI
Not available
Abstract
Video anomaly detection identifies video content that does not conform to the normal patterns present in the training set. Most existing work relies on appearance-based features, which are susceptible to background interference. This paper instead proposes a novel Skeleton-Transformer (SkT) that predicts future pose components in video frames and takes the errors between the predicted pose components and their expected values as anomaly scores. In the SkT, a multi-head self-attention (MSA) module and a temporal convolutional layer (TCL) are combined into a skeleton attention (SkA) block; the two are complementary because they process information from different viewpoints. The MSA module captures long-range dependencies between arbitrary pairs of pose components along the spatial and temporal dimensions from multiple perspectives, while the TCL concentrates on local temporal information. Multiple SkA blocks are then stacked to form the main body of the SkT. To the best of our knowledge, this is the first work to apply the Transformer framework to anomaly detection based on pose components, and we conduct experiments to determine the optimal structure. The proposed method achieves a frame-level AUC of 77.65% on the HR-ShanghaiTech dataset, exceeding state-of-the-art methods. Moreover, ablation studies validate the effectiveness of each module in the SkT, further indicating that Transformer-based methods are promising for anomaly detection.
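The scoring idea in the abstract (prediction error used as an anomaly score, evaluated by frame-level AUC) can be illustrated with a minimal standard-library Python sketch. It is not the paper's implementation: the abstract does not specify the error measure or how per-person errors are combined, so the mean squared joint distance in `pose_error`, the per-frame maximum in `frame_scores`, and all function names here are assumptions chosen for illustration. The predicted poses would come from a model such as the SkT, which is not reproduced here.

```python
from typing import List, Sequence, Tuple

# One skeleton: an (x, y) coordinate per joint (hypothetical representation).
Pose = Sequence[Tuple[float, float]]


def pose_error(predicted: Pose, observed: Pose) -> float:
    """Mean squared distance between matching joints (assumed error measure)."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("poses must be non-empty and have the same joint count")
    total = 0.0
    for (px, py), (ox, oy) in zip(predicted, observed):
        total += (px - ox) ** 2 + (py - oy) ** 2
    return total / len(predicted)


def frame_scores(
    predicted_frames: Sequence[Sequence[Pose]],
    observed_frames: Sequence[Sequence[Pose]],
) -> List[float]:
    """One score per frame: the largest pose error among the people in it.

    Taking the maximum is an assumption; a frame with no people scores 0.0.
    """
    scores = []
    for pred_people, obs_people in zip(predicted_frames, observed_frames):
        errors = [pose_error(p, o) for p, o in zip(pred_people, obs_people)]
        scores.append(max(errors) if errors else 0.0)
    return scores


def frame_level_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that an anomalous frame outscores a normal one.

    Labels: 1 = anomalous frame, 0 = normal frame. Ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomalous and one normal frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

With this formulation, a higher score means a frame is more likely anomalous, and an AUC of 0.5 corresponds to scores that rank anomalous frames no better than chance.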
Pages: 1481-1494
Page count: 13