Predicting skeleton trajectories using a Skeleton-Transformer for video anomaly detection

Cited by: 0

Authors:

Wenfeng Pang

Qianhua He

Yanxiong Li

Affiliations:

[1] South China University of Technology, School of Electronic and Information Engineering

Source:

Multimedia Systems | 2022 / Vol. 28

Keywords:

Anomaly detection; Skeleton trajectory prediction; Skeleton-Transformer; Multi-head self-attention; Temporal convolutional layer

DOI:

Not available

Abstract:

Video anomaly detection identifies video content that does not conform to the normal patterns represented in the training set. Appearance-based features are susceptible to background interference, so, unlike most prior work, which relies on appearance-based methods, this paper proposes a novel Skeleton-Transformer (SkT) that predicts future pose components in video frames and takes the errors between the predicted pose components and their corresponding expected values as anomaly scores. In SkT, a multi-head self-attention (MSA) module and a temporal convolutional layer (TCL) are combined into a skeleton attention (SkA) block; the two are complementary because they process information from different viewpoints. The MSA module captures long-range dependencies between arbitrary pairs of pose components along the spatial and temporal dimensions from different perspectives, while the TCL concentrates on local temporal information. Multiple SkA blocks are then stacked to form the main body of the SkT. To the best of our knowledge, the proposed approach is the first to apply the Transformer framework to anomaly detection based on pose components, and we conduct experiments to determine the optimal structure. The proposed method achieves a frame-level AUC of 77.65% on the HR-ShanghaiTech dataset, exceeding state-of-the-art methods. Moreover, ablation studies validate the effectiveness of each module in the SkT, further indicating that the Transformer-based method is promising for anomaly detection.
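
The scoring idea in the abstract (prediction error used as the anomaly score, evaluated by frame-level AUC) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not state the exact error measure or how per-skeleton errors are combined into a frame score, so the mean squared joint distance in `pose_error` and the per-frame maximum in `frame_scores` are assumptions, and all function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, float]   # (x, y) coordinates of one joint
Pose = Sequence[Joint]        # one skeleton in one frame


def pose_error(predicted: Pose, observed: Pose) -> float:
    """Mean squared distance between predicted and observed joints (assumed metric)."""
    total = 0.0
    for (px, py), (ox, oy) in zip(predicted, observed):
        total += (px - ox) ** 2 + (py - oy) ** 2
    return total / len(observed)


def frame_scores(pred_by_frame: Sequence[Sequence[Pose]],
                 obs_by_frame: Sequence[Sequence[Pose]]) -> List[float]:
    """Frame score = largest pose error among the skeletons in that frame (assumed rule)."""
    scores = []
    for preds, obs in zip(pred_by_frame, obs_by_frame):
        errors = [pose_error(p, o) for p, o in zip(preds, obs)]
        # A frame without any skeleton gets score 0.0 in this sketch.
        scores.append(max(errors) if errors else 0.0)
    return scores


def frame_level_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that an anomalous frame (label 1) outscores a normal one (label 0).

    Ties count as one half, which matches the usual ROC AUC definition.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous frame")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In this sketch a higher score means the observed skeleton deviated more from the predicted trajectory, so a frame-level AUC close to 1 indicates that anomalous frames are consistently ranked above normal ones.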

Pages: 1481-1494

Number of pages: 13
