Spatio-Temporal Information for Action Recognition in Thermal Video Using Deep Learning Model

被引:0
|
作者
Srihari, P. [1 ]
Harikiran, J. [1 ]
机构
[1] VIT AP Univ, Sch Comp Sci & Engn, Amaravathi 522237, India
关键词
Action Recognition; Activity Classification; Complex-Valued Deep Fully Convolutional Network; Deep Learning; Deeplabv3+Net; Fall Detection; Mask R-CNN; Thermal Cameras; Violence; FEATURE FUSION;
D O I
暂无
中图分类号
TM [电工技术]; TN [电子技术、通信技术];
学科分类号
0808 ; 0809 ;
摘要
Researchers can evaluate numerous information to ensure automated monitoring due to the widespread use of surveillance cameras in smart cities. For the monitoring of violence or abnormal behaviors in smart cities, schools, hospitals, residences, and other observational domains, an enhanced safety and security system is required to prevent any injuries that might result in ecological, economic and social losses. Automatic detection for prompt actions is vital and may help the respective departments effectively. Based on thermal imaging, several researchers have concentrated on object detection, tracking, and action identification. Few studies have simultaneously extracted spatial-temporal information from a thermal image and utilized it to recognize human actions. This research provides a novelty based on frame-level and spatial and temporal features which combines richer context temporal information to address the issue of poor efficiency and less accuracy in detecting abnormal/violent behavior in thermal monitoring devices. The model can locate (bounded box) video frame areas involving different human activities and recognize (classify) the actions. The dataset on human behavior includes videos captured with infrared cameras in both indoor and outdoor environments. The experimental results using the publicly available benchmark datasets reveal the proposed model's efficiency. Our model achieves 98.5% and 94.85% accuracy on IITR Infrared Action Recognition (IITR-IAR) and Thermal Simulated Fall (TSF) datasets, respectively. In addition, the proposed method may be evaluated in more realistic conditions, such as zooming in and out etc.
引用
收藏
页码:669 / 680
页数:12
相关论文
共 50 条
  • [1] Action Recognition by Learning Deep Multi-Granular Spatio-Temporal Video Representation
    Li, Qing
    Qiu, Zhaofan
    Yao, Ting
    Mei, Tao
    Rui, Yong
    Luo, Jiebo
    [J]. ICMR'16: PROCEEDINGS OF THE 2016 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, 2016, : 159 - 166
  • [2] Learning spatio-temporal features for action recognition from the side of the video
    Pei, Lishen
    Ye, Mao
    Zhao, Xuezhuan
    Xiang, Tao
    Li, Tao
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2016, 10 (01) : 199 - 206
  • [3] Learning spatio-temporal features for action recognition from the side of the video
    Lishen Pei
    Mao Ye
    Xuezhuan Zhao
    Tao Xiang
    Tao Li
    [J]. Signal, Image and Video Processing, 2016, 10 : 199 - 206
  • [4] Deep Learning Based Video Spatio-Temporal Modeling for Emotion Recognition
    Fonnegra, Ruben D.
    Diaz, Gloria M.
    [J]. HUMAN-COMPUTER INTERACTION: THEORIES, METHODS, AND HUMAN ISSUES, HCI INTERNATIONAL 2018, PT I, 2018, 10901 : 397 - 408
  • [5] Deep video action clustering via spatio-temporal feature learning
    Peng, Bo
    Lei, Jianjun
    Fu, Huazhu
    Jia, Yalong
    Zhang, Zongqian
    Li, Yi
    [J]. NEUROCOMPUTING, 2021, 456 : 519 - 527
  • [6] Spatio-temporal information for human action recognition
    Yao, Li
    Liu, Yunjian
    Huang, Shihui
    [J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2016,
  • [7] Spatio-temporal information for human action recognition
    Li Yao
    Yunjian Liu
    Shihui Huang
    [J]. EURASIP Journal on Image and Video Processing, 2016
  • [8] Spatio-temporal Video Autoencoder for Human Action Recognition
    Sousa e Santos, Anderson Carlos
    Pedrini, Helio
    [J]. PROCEEDINGS OF THE 14TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2019, : 114 - 123
  • [9] LEARNING SPATIO-TEMPORAL DEPENDENCIES FOR ACTION RECOGNITION
    Cai, Qiao
    Yin, Yafeng
    Man, Hong
    [J]. 2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 3740 - 3744
  • [10] Interpretable Spatio-temporal Attention for Video Action Recognition
    Meng, Lili
    Zhao, Bo
    Chang, Bo
    Huang, Gao
    Sun, Wei
    Tung, Frederich
    Sigal, Leonid
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1513 - 1522