ViolenceNet: Dense Multi-Head Self-Attention with Bidirectional Convolutional LSTM for Detecting Violence

Cited by: 34
Authors
Rendon-Segador, Fernando J. [1]
Alvarez-Garcia, Juan A. [1]
Enriquez, Fernando [1]
Deniz, Oscar [2]
Affiliations
[1] Univ Seville, Dept Lenguajes Sist Informat, Seville 41012, Spain
[2] Univ Castilla La Mancha, VISILAB ETSII, Ciudad Real 13071, Spain
Keywords
violence detection; fight detection; deep learning; dense net; bidirectional ConvLSTM; VIDEO; SURVEILLANCE; RECOGNITION; FRAMEWORK;
DOI
10.3390/electronics10131601
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Introducing efficient automatic violence detection into video surveillance or audiovisual content monitoring systems would greatly facilitate the work of closed-circuit television (CCTV) operators, rating agencies, and those in charge of monitoring social network content. In this paper we present a new deep learning architecture that combines an adapted version of DenseNet for three dimensions, a multi-head self-attention layer, and a bidirectional convolutional long short-term memory (ConvLSTM) module. It encodes relevant spatio-temporal features to determine whether a video is violent or not. Furthermore, an ablation study of the input frames is carried out, comparing dense optical flow with adjacent-frame subtraction and measuring the influence of the attention layer; it shows that combining optical flow with the attention mechanism improves results by up to 4.4%. Experiments were conducted on four of the most widely used datasets for this problem. The model matches or in some cases exceeds state-of-the-art results while reducing the number of network parameters needed (4.5 million) and improving efficiency, with test accuracy ranging from 95.6% on the most complex dataset to 100% on the simplest one and an inference time of less than 0.3 s for the longest clips. Finally, to check whether the generated model is able to generalize violence, a cross-dataset analysis is performed, which shows the difficulty of this approach: when training on three datasets and testing on the remaining one, accuracy drops to 70.08% in the worst case and 81.51% in the best case, which points to future work oriented towards anomaly detection in new datasets.
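The cross-dataset protocol described in the abstract (train on three datasets, test on the remaining one) can be sketched as a leave-one-dataset-out split. This is only an illustrative sketch, not the authors' code: the function name `leave_one_dataset_out` and the dataset names are hypothetical placeholders, since the abstract does not name the four datasets.

```python
from typing import List, Tuple


def leave_one_dataset_out(names: List[str]) -> List[Tuple[List[str], str]]:
    """Return (train_names, test_name) pairs, holding out each dataset once."""
    folds = []
    for held_out in names:
        # Train on every dataset except the one held out for testing.
        train = [n for n in names if n != held_out]
        folds.append((train, held_out))
    return folds


# Hypothetical placeholder names: the abstract does not name the datasets.
DATASETS = ["dataset_a", "dataset_b", "dataset_c", "dataset_d"]

if __name__ == "__main__":
    for train, test in leave_one_dataset_out(DATASETS):
        print(f"train on {train} -> test on {test}")
```

With four datasets this yields four folds, each training on three datasets and testing on the fourth; the 70.08% and 81.51% figures in the abstract are the worst and best accuracies across such folds.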
Pages: 16
Related papers
50 in total
  • [41] Multi-Head Attention Based Bidirectional LSTM for Spelling Error Detection in the Indonesian Language
    Yanfi, Yanfi
    Soeparno, Haryono
    Setiawan, Reina
    Budiharto, Widodo
    IEEE ACCESS, 2024, 12 : 188560 - 188571
  • [42] A novel two-stream multi-head self-attention convolutional neural network for bearing fault diagnosis
    Ren, Hang
    Liu, Shaogang
    Wei, Fengmei
    Qiu, Bo
    Zhao, Dan
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART C-JOURNAL OF MECHANICAL ENGINEERING SCIENCE, 2024, 238 (11) : 5393 - 5405
  • [43] Multi-head self-attention based gated graph convolutional networks for aspect-based sentiment classification
    Luwei Xiao
    Xiaohui Hu
    Yinong Chen
    Yun Xue
    Bingliang Chen
    Donghong Gu
    Bixia Tang
    Multimedia Tools and Applications, 2022, 81 : 19051 - 19070
  • [44] Modality attention fusion model with hybrid multi-head self-attention for video understanding
    Zhuang, Xuqiang
    Liu, Fang'al
    Hou, Jian
    Hao, Jianhua
    Cai, Xiaohong
    PLOS ONE, 2022, 17 (10):
  • [45] Multi-head self-attention based gated graph convolutional networks for aspect-based sentiment classification
    Xiao, Luwei
    Hu, Xiaohui
    Chen, Yinong
    Xue, Yun
    Chen, Bingliang
    Gu, Donghong
    Tang, Bixia
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (14) : 19051 - 19070
  • [46] CPMA: Spatio-Temporal Network Prediction Model Based on Convolutional Parallel Multi-head Self-attention
    Liu, Tiantian
    You, Xin
    Ma, Ming
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT II, ICIC 2024, 2024, 14876 : 113 - 124
  • [47] EEG-Based Emotion Recognition Using Convolutional Recurrent Neural Network with Multi-Head Self-Attention
    Hu, Zhangfang
    Chen, Libujie
    Luo, Yuan
    Zhou, Jingfan
    APPLIED SCIENCES-BASEL, 2022, 12 (21):
  • [48] A novel intelligent fault diagnosis method of bearing based on multi-head self-attention convolutional neural network
    Ren, Hang
    Liu, Shaogang
    Qiu, Bo
    Guo, Hong
    Zhao, Dan
    AI EDAM-ARTIFICIAL INTELLIGENCE FOR ENGINEERING DESIGN ANALYSIS AND MANUFACTURING, 2024, 38
  • [49] Joint extraction of entities and relations based on character graph convolutional network and Multi-Head Self-Attention Mechanism
    Meng, Zhao
    Tian, Shengwei
    Yu, Long
    Lv, Yalong
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2021, 33 (02) : 349 - 362
  • [50] MSA: Jointly Detecting Drug Name and Adverse Drug Reaction Mentioning Tweets with Multi-Head Self-Attention
    Wu, Chuhan
    Wu, Fangzhao
    Yuan, Zhigang
    Liu, Junxin
    Huang, Yongfeng
    Xie, Xing
    PROCEEDINGS OF THE TWELFTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'19), 2019, : 33 - 41