Multi-GAT: A Graphical Attention-Based Hierarchical Multimodal Representation Learning Approach for Human Activity Recognition

Cited by: 52
Authors
Islam, Md Mofijul [1]
Iqbal, Tariq [1]
Affiliations
[1] Univ Virginia, Sch Engn & Appl Sci, Charlottesville, VA 22903 USA
Keywords
Deep learning for visual perception; gesture, posture and facial expressions; multi-modal perception for HRI; COORDINATION; MOTION
DOI
10.1109/LRA.2021.3059624
Chinese Library Classification (CLC) number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Recognizing human activities is one of the crucial capabilities that a robot needs to have to be useful around people. Although modern robots are equipped with various types of sensors, human activity recognition (HAR) remains a challenging problem, particularly in the presence of noisy sensor data. In this work, we introduce a multimodal graphical attention-based HAR approach, called Multi-GAT, which hierarchically learns complementary multimodal features. We develop a multimodal mixture-of-experts model to disentangle and extract salient modality-specific features that enable feature interactions. Additionally, we introduce a novel message-passing-based graphical attention approach to capture cross-modal relations for extracting complementary multimodal features. The experimental results on two multimodal human activity datasets suggest that Multi-GAT outperforms state-of-the-art HAR algorithms across all datasets and metrics tested. Finally, the experimental results with noisy sensor data indicate that Multi-GAT consistently outperforms all the evaluated baselines. The robust performance suggests that Multi-GAT can enable seamless human-robot collaboration in noisy human environments.
Pages: 1729-1736
Page count: 8