Transformer-based deep reverse attention network for multi-sensory human activity recognition

Cited by: 8
Authors
Pramanik, Rishav [1]
Sikdar, Ritodeep [1]
Sarkar, Ram [1]
Affiliations
[1] Jadavpur Univ, Dept Comp Sci & Engn, Kolkata 700032, West Bengal, India
Keywords
Deep learning; Reverse attention; Human activity recognition; Time-series prediction; Sensor data; Ensemble
DOI
10.1016/j.engappai.2023.106150
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In today's era, one of the important applications of Artificial Intelligence (AI) is Human Activity Recognition (HAR). It has a wide range of applicability in health monitoring for patients with chronic diseases, gaming consoles for gesture recognition, etc. Sensor-based HAR systems use signals collected over a period of time to label an activity. To design an efficient sensor-based HAR system, a model must learn an optimal association of spatial and temporal features. In this article, we propose a sensor-based HAR technique using the deep learning approach. We present a deep reverse transformer-based attention mechanism to guide the side residual features. Unlike the conventional bottom-up approaches for feature fusion, we exploit a top-down feature fusion approach. The reverse attention is self-calibrated throughout the course of learning, which regularizes the attention modules and dynamically adjusts the learning rate. The overall framework outperforms several state-of-the-art methods and is shown to be statistically significant against these methods on five publicly available sensor-based HAR datasets, namely, MHEALTH, USC-HAD, WHARF, UTD-MHAD1, and UTD-MHAD2. Further, we conduct an ablation study to showcase the importance of each of the components of the proposed framework. Source code of this work is available at https://github.com/rishavpramanik/RevTransformerAttentionHAR.
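As a rough illustration of how a sensor-based HAR pipeline turns "signals collected over a period of time" into labelled examples, the sketch below cuts a multichannel recording into fixed-length windows and assigns each window the most frequent per-timestep label. It is a generic preprocessing sketch, not the authors' code: the function name `sliding_windows`, the window length, the stride, and the majority-vote labelling are all assumptions made for this example and are not stated in the abstract.

```python
from collections import Counter


def sliding_windows(samples, labels, window, stride):
    """Split a recording into fixed-length windows (hypothetical helper).

    samples: list of per-timestep readings (each a list of channel values)
    labels:  list of per-timestep activity labels, same length as samples
    Returns a list of (window_samples, window_label) pairs, where the
    window label is the most frequent per-timestep label in that window.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")

    segments = []
    # Only full windows are kept; a trailing partial window is dropped.
    for start in range(0, len(samples) - window + 1, stride):
        end = start + window
        majority = Counter(labels[start:end]).most_common(1)[0][0]
        segments.append((samples[start:end], majority))
    return segments


# Toy two-channel recording: 10 timesteps, an activity change halfway.
# Window length 4 and stride 2 are illustrative values only.
readings = [[float(t), float(t) * 0.5] for t in range(10)]
activity = ["walk"] * 5 + ["sit"] * 5
segments = sliding_windows(readings, activity, window=4, stride=2)
```

Each resulting pair can then be fed to a classifier that learns spatial (across-channel) and temporal (within-window) structure; overlapping windows are a common way to increase the number of training examples from a short recording.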
Pages: 12
Related papers
50 in total
  • [31] A transformer-based deep neural network model for SSVEP classification
    Chen, Jianbo
    Zhang, Yangsong
    Pan, Yudong
    Xu, Peng
    Guan, Cuntai
    NEURAL NETWORKS, 2023, 164 : 521 - 534
  • [32] VT-BPAN: vision transformer-based bilinear pooling and attention network fusion of RGB and skeleton features for human action recognition
    Sun Y.
    Xu W.
    Yu X.
    Gao J.
    Multimedia Tools and Applications, 2024, 83 (29) : 73391 - 73405
  • [33] Transformer-based Scene Graph Generation Network With Relational Attention Module
    Yamamoto, Takuma
    Obinata, Yuya
    Nakayama, Osafumi
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2034 - 2041
  • [34] Texture recognition based on multi-sensory integration of proprioceptive and tactile signals
    Rostamian, Behnam
    Koolani, MohammadReza
    Abdollahzade, Pouya
    Lankarany, Milad
    Falotico, Egidio
    Amiri, Mahmood
    Thakor, Nitish V.
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [35] A weakly-supervised transformer-based hybrid network with multi-attention for pavement crack detection
    Wang, Zhenlin
    Leng, Zhufei
    Zhang, Zhixin
    CONSTRUCTION AND BUILDING MATERIALS, 2024, 411
  • [36] TMA-Net: A Transformer-Based Multi-Scale Attention Network for Surgical Instrument Segmentation
    Yang, Lei
    Wang, Hongyong
    Gu, Yuge
    Bian, Guibin
    Liu, Yanhong
    Yu, Hongnian
    IEEE TRANSACTIONS ON MEDICAL ROBOTICS AND BIONICS, 2023, 5 (02) : 323 - 334
  • [37] Dual-attention transformer-based hybrid network for multi-modal medical image segmentation
    Zhang, Menghui
    Zhang, Yuchen
    Liu, Shuaibing
    Han, Yahui
    Cao, Honggang
    Qiao, Bingbing
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [38] Efficient human activity recognition: A deep convolutional transformer-based contrastive self-supervised approach using wearable sensors
    Sun, Yujie
    Xu, Xiaolong
    Tian, Xincheng
    Zhou, Lelai
    Li, Yibin
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 135
  • [39] Transformer-based Human Action Recognition with Dynamic Feature Selection
    Lamghari, Soufiane
    Bilodeau, Guillaume-Alexandre
    Saunier, Nicolas
    2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023, : 129 - 136
  • [40] End to end transformer-based contextual speech recognition based on pointer network
    Lin, Binghuai
    Wang, Liyuan
    INTERSPEECH 2021, 2021, : 2087 - 2091