ViolenceNet: Dense Multi-Head Self-Attention with Bidirectional Convolutional LSTM for Detecting Violence

Cited by: 34
Authors
Rendon-Segador, Fernando J. [1 ]
Alvarez-Garcia, Juan A. [1 ]
Enriquez, Fernando [1 ]
Deniz, Oscar [2 ]
Affiliations
[1] Univ Seville, Dept Lenguajes Sist Informat, Seville 41012, Spain
[2] Univ Castilla La Mancha, VISILAB ETSII, Ciudad Real 13071, Spain
Keywords
violence detection; fight detection; deep learning; dense net; bidirectional ConvLSTM; VIDEO; SURVEILLANCE; RECOGNITION; FRAMEWORK;
DOI
10.3390/electronics10131601
CLC Classification Number
TP [automation technology, computer technology];
Discipline Classification Code
0812 ;
Abstract
Introducing efficient automatic violence detection into video surveillance or audiovisual content monitoring systems would greatly facilitate the work of closed-circuit television (CCTV) operators, rating agencies, or those in charge of monitoring social network content. In this paper we present a new deep learning architecture that combines an adapted three-dimensional version of DenseNet, a multi-head self-attention layer, and a bidirectional convolutional long short-term memory (ConvLSTM) module to encode relevant spatio-temporal features and determine whether a video is violent or not. Furthermore, an ablation study of the input frames, comparing dense optical flow with adjacent-frame subtraction and assessing the influence of the attention layer, shows that the combination of optical flow and the attention mechanism improves results by up to 4.4%. Experiments conducted on four of the most widely used datasets for this problem match or, in some cases, exceed state-of-the-art results while reducing the number of network parameters required (4.5 million) and improving efficiency in both test accuracy (from 95.6% on the most complex dataset to 100% on the simplest one) and inference time (less than 0.3 s for the longest clips). Finally, to check whether the generated model is able to generalize violence, a cross-dataset analysis is performed, which shows the complexity of this approach: training on three datasets and testing on the remaining one, accuracy drops to 70.08% in the worst case and 81.51% in the best case, which points to future work oriented towards anomaly detection on new datasets.
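The pipeline described in the abstract (3D dense feature extraction over optical-flow frames, multi-head self-attention, a bidirectional recurrent temporal module, and a binary violent/non-violent output) can be illustrated with a minimal sketch. This is not the paper's implementation: the layer sizes are invented, the 3D DenseNet is stood in for by a single `Conv3d`, and the bidirectional ConvLSTM is approximated with a plain bidirectional `nn.LSTM`, since PyTorch has no built-in ConvLSTM.

```python
import torch
import torch.nn as nn

class ViolenceNetSketch(nn.Module):
    """Illustrative sketch of the abstract's pipeline; all sizes are
    placeholders, not the paper's configuration."""
    def __init__(self, channels: int = 32, heads: int = 4):
        super().__init__()
        # stand-in for the adapted three-dimensional DenseNet backbone
        self.features = nn.Conv3d(2, channels, kernel_size=3, padding=1)
        # collapse spatial dims, keep the temporal axis
        self.pool = nn.AdaptiveAvgPool3d((None, 1, 1))
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        # stand-in for the bidirectional ConvLSTM module
        self.lstm = nn.LSTM(channels, channels,
                            bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * channels, 1)

    def forward(self, x):
        # x: (batch, 2, frames, H, W) -- two-channel dense optical flow
        f = self.pool(self.features(x)).squeeze(-1).squeeze(-1)  # (B, C, T)
        f = f.transpose(1, 2)                                    # (B, T, C)
        f, _ = self.attn(f, f, f)   # self-attention over the time axis
        f, _ = self.lstm(f)         # bidirectional temporal encoding
        return torch.sigmoid(self.head(f[:, -1]))  # violence probability

clip = torch.randn(1, 2, 16, 64, 64)  # one clip: 16 flow frames of 64x64
prob = ViolenceNetSketch()(clip)      # shape (1, 1), value in [0, 1]
```

The two-channel input reflects the abstract's finding that dense optical flow (horizontal and vertical flow components) combined with the attention mechanism outperforms adjacent-frame subtraction.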
Pages: 16
Related Papers
50 records
  • [31] Enlivening Redundant Heads in Multi-head Self-attention for Machine Translation
    Zhang, Tianfu
    Huang, Heyan
    Feng, Chong
    Cao, Longbing
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 3238 - 3248
  • [32] LSTM-MH-SA landslide displacement prediction model based on multi-head self-attention mechanism
    Zhang, Zhen-kung
    Zhang, Dong-mei
    Li, Jiang
    Wu, Yi-ping
    ROCK AND SOIL MECHANICS, 2022, 43 : 477 - +
  • [33] Lane Detection Method Based on Improved Multi-Head Self-Attention
    Ge, Zekun
    Tao, Fazhan
    Fu, Zhumu
    Song, Shuzhong
    Computer Engineering and Applications, 60 (02): 264 - 271
  • [34] MSIN: An Efficient Multi-head Self-attention Framework for Inertial Navigation
    Shi, Gaotao
    Pan, Bingjia
    Ni, Yuzhi
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2023, PT I, 2024, 14487 : 455 - 473
  • [35] Local Multi-Head Channel Self-Attention for Facial Expression Recognition
    Pecoraro, Roberto
    Basile, Valerio
    Bono, Viviana
    INFORMATION, 2022, 13 (09)
  • [36] SQL Injection Detection Based on Lightweight Multi-Head Self-Attention
    Lo, Rui-Teng
    Hwang, Wen-Jyi
    Tai, Tsung-Ming
    APPLIED SCIENCES-BASEL, 2025, 15 (02):
  • [37] MHSAN: Multi-Head Self-Attention Network for Visual Semantic Embedding
    Park, Geondo
    Han, Chihye
    Kim, Daeshik
    Yoon, Wonjun
    2020 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2020, : 1507 - 1515
  • [38] Speech enhancement method based on the multi-head self-attention mechanism
    Chang X.
    Zhang Y.
    Yang L.
    Kou J.
    Wang X.
    Xu D.
    Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2020, 47 (01): 104 - 110
  • [39] Hunt for Unseen Intrusion: Multi-Head Self-Attention Neural Detector
    Seo, Seongyun
    Han, Sungmin
    Park, Janghyeon
    Shim, Shinwoo
    Ryu, Han-Eul
    Cho, Byoungmo
    Lee, Sangkyun
    IEEE ACCESS, 2021, 9 : 129635 - 129647
  • [40] MBGAN: An improved generative adversarial network with multi-head self-attention and bidirectional RNN for time series imputation
    Ni, Qingjian
    Cao, Xuehan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 115