SWTA: Sparse Weighted Temporal Attention for Drone-Based Activity Recognition

Cited by: 0
Authors
Yadav, Santosh Kumar [1]
Pahwa, Esha [2]
Luthra, Achleshwar [2]
Tiwari, Kamlesh [2]
Pandey, Hari Mohan [3]
Affiliations
[1] Natl Univ Ireland, Coll Sci & Engn, Galway H91 TK33, Ireland
[2] Birla Inst Technol & Sci, Dept CSIS, Pilani 333031, Rajasthan, India
[3] Bournemouth Univ, Comp & Informat, Poole BH12 5BB, Dorset, England
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
Human Activity Recognition; Video Understanding; Drone Action Recognition; CLASSIFICATION
DOI
10.1109/IJCNN54540.2023.10191750
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Drone-camera-based human activity recognition (HAR) has received significant attention from the computer vision research community in the past few years. A robust and efficient HAR system plays a pivotal role in fields such as video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. The task is challenging because of complex poses, varying viewpoints, and the environmental conditions in which the action takes place. To address these complexities, we propose a novel Sparse Weighted Temporal Attention (SWTA) module that uses sparsely sampled video frames to obtain global weighted temporal attention. The proposed SWTA has two components. The first is a temporal segment network that sparsely samples a given set of frames. The second is weighted temporal attention, which fuses attention maps derived from optical flow with raw RGB images. This is followed by a basenet network, which comprises a convolutional neural network (CNN) module along with fully connected layers that perform activity recognition. The SWTA network can be used as a plug-in module for existing deep CNN architectures, optimizing them to learn temporal information and eliminating the need for a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The proposed model achieves accuracies of 72.76%, 92.56%, and 78.86% on the respective datasets, surpassing the previous state-of-the-art performance by margins of 25.26%, 18.56%, and 2.94%, respectively.
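The two components named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the sampling rule or the fusion formula, so the function names (`sparse_sample_indices`, `attention_from_flow`, `fuse`), the peak normalisation, and the blending weight `alpha` are all assumptions chosen for illustration. Frames and flow magnitudes are plain nested lists standing in for grayscale images.

```python
import random


def sparse_sample_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of `num_segments` equal-length segments.

    Assumed sampling rule: segment centre when `rng` is None,
    otherwise a random frame inside each segment.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need 0 < num_segments <= num_frames")
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments
        end = (k + 1) * num_frames // num_segments  # exclusive bound
        if rng is None:
            indices.append((start + end - 1) // 2)
        else:
            indices.append(rng.randrange(start, end))
    return indices


def attention_from_flow(flow_mag):
    """Scale a 2-D optical-flow magnitude map to attention weights in [0, 1]."""
    peak = max(max(row) for row in flow_mag)
    if peak == 0:
        return [[0.0 for _ in row] for row in flow_mag]
    return [[value / peak for value in row] for row in flow_mag]


def fuse(frame, attention, alpha=0.5):
    """Blend a frame with its attention-weighted copy (hypothetical fusion rule).

    Each pixel p becomes p * ((1 - alpha) + alpha * a), so alpha = 0 returns
    the raw frame and alpha = 1 keeps only the attended regions.
    """
    return [
        [p * ((1 - alpha) + alpha * a) for p, a in zip(frame_row, att_row)]
        for frame_row, att_row in zip(frame, attention)
    ]


if __name__ == "__main__":
    print(sparse_sample_indices(12, 3))  # [1, 5, 9]
    print(sparse_sample_indices(12, 3, rng=random.Random(0)))
    attention = attention_from_flow([[0, 2], [4, 0]])
    print(fuse([[10, 20], [30, 40]], attention))
```

In a pipeline of the kind the abstract describes, the fused frames at the sampled indices would then be passed to the CNN backbone. Here the sketch stops at the fused input.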
Pages: 8