Parallel Self-Attention and Spatial-Attention Fusion for Human Pose Estimation and Running Movement Recognition

Cited by: 0
Authors
Wu, Qingtian [1 ]
Zhang, Yu [2 ,3 ]
Zhang, Liming [1 ]
Yu, Haoyong [4 ]
Affiliations
[1] Univ Macau, Fac Sci & Technol, Dept Comp & Informat Sci, Macau, Peoples R China
[2] Univ Macau, Fac Sci & Technol, Macau, Peoples R China
[3] Shenyang Univ Chem Technol, Comp Sci & Technol Coll, Shenyang 110142, Peoples R China
[4] Natl Univ Singapore, Dept Biomed Engn, Singapore 119077, Singapore
Keywords
Transformers; Semantics; Pose estimation; Feature extraction; Convolutional neural networks; Task analysis; Visualization; Feature fusion; human pose estimation (HPE); running recognition; self-attention; spatial attention
DOI
10.1109/TCDS.2023.3275652
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Human pose estimation (HPE) is a fundamental yet challenging visual recognition problem. Existing popular methods either add local features element-wise (e.g., Hourglass and its variants) or learn the global relationships among different human parts (e.g., vision transformers). However, effectively integrating local and global representations for accurate HPE remains an open problem. In this work, we design four feature fusion strategies on the hierarchical ResNet structure: direct channel concatenation, element-wise addition, and two parallel structures. Both parallel structures adopt a naive self-attention encoder to model global dependencies; they differ in that one adopts the original ResNet BottleNeck while the other employs a spatial-attention module (named SSF) to learn local patterns. Experiments on COCO Keypoint 2017 show that our SSF-based network for HPE (named SSPose) achieves the best average precision with acceptable computational cost among the compared state-of-the-art methods. In addition, we build a lightweight running data set to verify the effectiveness of SSPose. Based solely on the keypoints estimated by SSPose, we propose a regression model that identifies valid running movements without training any additional classifiers. Our source code and running data set are publicly available.
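The four fusion strategies named in the abstract can be sketched at toy scale. This is a minimal illustrative sketch, not the paper's actual SSF/SSPose implementation: the function names, the simplified softmax self-attention (global branch), and the sigmoid channel-statistics gate standing in for spatial attention (local branch) are all assumptions made for illustration.

```python
import numpy as np

def self_attention(x):
    """Naive scaled dot-product self-attention over tokens x of shape (N, C)."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x  # each token becomes a weighted mix of all tokens

def spatial_attention(x):
    """Toy stand-in for a spatial-attention module: a per-position sigmoid
    gate computed from channel statistics, re-weighting local features."""
    gate = 1.0 / (1.0 + np.exp(-x.mean(axis=1, keepdims=True)))
    return gate * x

def fuse(x, strategy="parallel"):
    """The four fusion strategies compared in the paper, at toy scale."""
    if strategy == "concat":    # 1) direct channel concatenation
        return np.concatenate([x, x], axis=1)
    if strategy == "add":       # 2) element-wise addition
        return x + x
    # 3)/4) parallel structures: a global self-attention branch combined
    # with a local branch (here: the toy spatial-attention gate)
    return self_attention(x) + spatial_attention(x)
```

The parallel variant keeps the two branches' outputs at the original channel width so they can be summed, which is one plausible way to read "parallel" fusion; the real modules in the paper operate on hierarchical ResNet feature maps.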
Pages: 358-368 (11 pages)
Related Papers (50 in total)
  • [1] Efficient Spatial-Attention Module for Human Pose Estimation
    Tran, Tien-Dat
    Vo, Xuan-Thuy
    Nguyen, Duy-Linh
    Jo, Kang-Hyun
    [J]. FRONTIERS OF COMPUTER VISION, IW-FCV 2021, 2021, 1405 : 242 - 250
  • [2] Self-Attention Network for Human Pose Estimation
    Xia, Hailun
    Zhang, Tianyang
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (04): 1 - 15
  • [3] Lightweight human pose estimation algorithm based on polarized self-attention
    Liu, Shengjie
    He, Ning
    Wang, Cheng
    Yu, Haigang
    Han, Wenjing
    [J]. MULTIMEDIA SYSTEMS, 2023, 29 (01) : 197 - 210
  • [4] Lightweight human pose estimation algorithm based on polarized self-attention
    Shengjie Liu
    Ning He
    Cheng Wang
    Haigang Yu
    Wenjing Han
    [J]. Multimedia Systems, 2023, 29 : 197 - 210
  • [5] IMPROVING HUMAN POSE ESTIMATION WITH SELF-ATTENTION GENERATIVE ADVERSARIAL NETWORKS
    Cao, Zhongzheng
    Wang, Rui
    Wang, Xiangyang
    Liu, Zhi
    Zhu, Xiaoqiang
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2019, : 567 - 572
  • [6] Improving Human Pose Estimation With Self-Attention Generative Adversarial Networks
    Wang, Xiangyang
    Cao, Zhongzheng
    Wang, Rui
    Liu, Zhi
    Zhu, Xiaoqiang
    [J]. IEEE ACCESS, 2019, 7 : 119668 - 119680
  • [7] Combining self-attention and depth-wise convolution for human pose estimation
    Zhang, Fan
    Shi, Qingxuan
    Ma, Yanli
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (8-9) : 5647 - 5661
  • [8] Stacked Hourglass Networks Based on Polarized Self-Attention for Human Pose Estimation
    Luo, Xiaoxia
    Li, Feibiao
    [J]. SECOND IYSF ACADEMIC SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND COMPUTER ENGINEERING, 2021, 12079
  • [9] Satellite pose estimation method based on space carving and self-attention
    Liu Jing-he
    Lin Bao-jun
    [J]. CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2023, 38 (12) : 1736 - 1744
  • [10] Self-attention fusion for audiovisual emotion recognition with incomplete data
    Chumachenko, Kateryna
    Iosifidis, Alexandros
    Gabbouj, Moncef
    [J]. 2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 2822 - 2828