MSRT: multi-scale representation transformer for regression-based human pose estimation

Times cited: 4
Authors
Shan, Beiguang [1]
Shi, Qingxuan [1, 2, 3]
Yang, Fang [1, 2, 3]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071000, Peoples R China
[2] Hebei Univ, Hebei Machine Vis Engn Res Ctr, Baoding 071000, Peoples R China
[3] Hebei Univ, Inst Intelligent Image & Document Informat Proc, Baoding 071000, Peoples R China
Keywords
Human pose estimation; Multi-scale representation; Transformer; Deep learning
DOI
10.1007/s10044-023-01130-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses human pose estimation, with a focus on leveraging discriminative pose features. Recent pose estimation works concentrate on extracting high-level features but ignore low-level details, which reduces prediction accuracy. To mitigate this issue, we propose an end-to-end method called the multi-scale representation transformer network (MSRT). Our network consists of two key components: a feature aggregation module (FAM) and Transformers. The FAM splits and stacks feature maps of different scales, then fuses them to achieve multi-scale representation learning. This module compensates for the lack of detailed information in the high-level features. Furthermore, we use Transformers to identify long-range interactions among feature maps and to capture implicit body structure information, which allows the proposed network to refine the locations of terminal and occluded joints. Compared with existing regression-based methods, MSRT achieves superior results on the COCO2017 and MPII datasets.
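The abstract's description of fusing feature maps of different scales can be illustrated with a minimal sketch. This is not the authors' FAM: the function names `upsample_nearest` and `fuse_multi_scale`, the nearest-neighbour upsampling, and the fixed averaging weights (a stand-in for learned fusion) are all assumptions made for illustration, and the maps are assumed to be square with resolutions that divide evenly.

```python
def upsample_nearest(fmap, factor):
    """Repeat each value `factor` times along both axes of a 2D list."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse_multi_scale(fmaps, weights=None):
    """Resize all maps to the finest resolution, then fuse by weighted sum.

    Hypothetical stand-in for a learned fusion layer; assumes square maps
    whose sizes divide the largest size evenly.
    """
    target_h = max(len(m) for m in fmaps)
    resized = [upsample_nearest(m, target_h // len(m)) for m in fmaps]
    if weights is None:
        # Equal weights: a plain average across scales.
        weights = [1.0 / len(fmaps)] * len(fmaps)
    h, w = len(resized[0]), len(resized[0][0])
    return [
        [sum(wt * m[i][j] for wt, m in zip(weights, resized)) for j in range(w)]
        for i in range(h)
    ]


# A fine 4x4 map (low-level detail) and a coarse 2x2 map (high-level context).
fine = [[1.0, 2.0, 3.0, 4.0],
        [5.0, 6.0, 7.0, 8.0],
        [9.0, 10.0, 11.0, 12.0],
        [13.0, 14.0, 15.0, 16.0]]
coarse = [[10.0, 20.0],
          [30.0, 40.0]]
fused = fuse_multi_scale([fine, coarse])
```

Each fused position combines the fine-scale value with the coarse-scale value covering the same region, which is the general idea behind restoring low-level detail to high-level features.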
Pages: 591 - 603
Page count: 13
Related papers
50 in total
  • [1] MSRT: multi-scale representation transformer for regression-based human pose estimation
    Shan, Beiguang
    Shi, Qingxuan
    Yang, Fang
    [J]. Pattern Analysis and Applications, 2023, 26 : 591 - 603
  • [2] Human Pose Estimation Based on Lightweight Multi-Scale Coordinate Attention
    Li, Xin
    Guo, Yuxin
    Pan, Weiguo
    Liu, Hongzhe
    Xu, Bingxin
    [J]. APPLIED SCIENCES-BASEL, 2023, 13 (06)
  • [3] Human pose estimation in complex background videos via Transformer-based multi-scale feature integration
    Cheng, Chen
    Xu, Huahu
    [J]. DISPLAYS, 2024, 84
  • [4] Multi-Scale Collaborative Network for Human Pose Estimation
    Guo, Chunsheng
    Zhou, Jialuo
    Du, Wenlong
    Zhang, Xuguang
    [J]. INTERNATIONAL JOURNAL OF HUMANOID ROBOTICS, 2019, 16 (04)
  • [5] Multi-Scale Contrastive Learning for Human Pose Estimation
    Bao, Wenxia
    Lin, An
    Huang, Hua
    Yang, Xianjun
    Chen, Hemu
    [J]. IEICE Transactions on Information and Systems, 2024, E107.D (10) : 1332 - 1341
  • [6] MULTI-SCALE SUPERVISED NETWORK FOR HUMAN POSE ESTIMATION
    Ke, Lipeng
    Chang, Ming-Ching
    Qi, Honggang
    Lyu, Siwei
    [J]. 2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018 : 564 - 568
  • [7] Multi-scale spatial-temporal transformer for 3D human pose estimation
    Wu, Yongpeng
    Gao, Junna
    [J]. 2021 5TH INTERNATIONAL CONFERENCE ON VISION, IMAGE AND SIGNAL PROCESSING (ICVISP 2021), 2021 : 242 - 247
  • [8] Human pose estimation based on feature enhancement and multi-scale feature fusion
    Cao, Dandan
    Liu, Weibin
    Xing, Weiwei
    Wei, Xiang
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (03) : 643 - 650
  • [9] Human Pose Estimation Method Based on Optimized Multi-scale Feature Fusion
    Liu, Hongzhe
    Tao, Xiangru
    Xu, Cheng
    Cao, Dongpu
    [J]. Jixie Gongcheng Xuebao/Journal of Mechanical Engineering, 2024, 60 (16) : 306 - 313
  • [10] Selective Learning of Human Pose Estimation Based on Multi-Scale Convergence Network
    Liu, Wenkai
    Qin, Cuizhu
    Wu, Menglong
    Bai, Wenle
    Dong, Hongxia
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (05) : 1081 - 1084