Multi-GAT: A Graphical Attention-Based Hierarchical Multimodal Representation Learning Approach for Human Activity Recognition

Cited: 52
Authors
Islam, Md Mofijul [1 ]
Iqbal, Tariq [1 ]
Affiliations
[1] Univ Virginia, Sch Engn & Appl Sci, Charlottesville, VA 22903 USA
Keywords
Deep learning for visual perception; gesture, posture and facial expressions; multi-modal perception for HRI; coordination; motion
DOI
10.1109/LRA.2021.3059624
Chinese Library Classification
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
Recognizing human activities is one of the crucial capabilities a robot needs in order to be useful around people. Although modern robots are equipped with various types of sensors, human activity recognition (HAR) remains a challenging problem, particularly in the presence of noisy sensor data. In this work, we introduce a multimodal graphical attention-based HAR approach, called Multi-GAT, which hierarchically learns complementary multimodal features. We develop a multimodal mixture-of-experts model to disentangle and extract salient modality-specific features that enable feature interactions. Additionally, we introduce a novel message-passing-based graphical attention approach that captures cross-modal relations to extract complementary multimodal features. Experimental results on two multimodal human activity datasets suggest that Multi-GAT outperformed state-of-the-art HAR algorithms across all datasets and metrics tested. Finally, experiments with noisy sensor data indicate that Multi-GAT consistently outperforms all evaluated baselines. This robust performance suggests that Multi-GAT can enable seamless human-robot collaboration in noisy human environments.
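The graphical attention idea in the abstract — modality embeddings as graph nodes exchanging attention-weighted messages — can be illustrated with a minimal sketch. This is not the authors' Multi-GAT implementation: the single-head GAT-style update below (shared projection, LeakyReLU-scored pairwise logits, softmax over neighbouring modality nodes), the function name `graph_attention`, and the three example modalities are all illustrative assumptions.

```python
import numpy as np

def graph_attention(H, W, a, adj):
    """Single-head GAT-style attention over modality nodes (illustrative sketch).

    H:   (N, F)  node features, one row per modality embedding
    W:   (F, Fp) shared linear projection
    a:   (2*Fp,) attention vector
    adj: (N, N)  0/1 adjacency (1 = modalities may exchange messages)
    """
    Z = H @ W                                    # project node features
    N = Z.shape[0]
    # pairwise attention logits e_ij = LeakyReLU(a^T [z_i || z_j])
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            e[i, j] = np.concatenate([Z[i], Z[j]]) @ a
    e = np.where(e > 0, e, 0.2 * e)              # LeakyReLU, slope 0.2
    e = np.where(adj > 0, e, -1e9)               # mask non-neighbours
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)    # softmax over neighbours
    return alpha @ Z                             # attention-weighted messages

rng = np.random.default_rng(0)
H = rng.standard_normal((3, 8))                  # e.g. RGB, depth, skeleton
W = rng.standard_normal((8, 4))
a = rng.standard_normal(8)
adj = np.ones((3, 3))                            # fully connected modality graph
out = graph_attention(H, W, a, adj)
print(out.shape)                                 # (3, 4)
```

Each modality node thus receives a fused representation weighted by how relevant the other modalities are to it, which is the intuition behind extracting complementary cross-modal features.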
Pages: 1729-1736
Page count: 8