InferTrans: Hierarchical structural fusion transformer for crowded human pose estimation

Authors
Li, Muyu [1, 2]
Wang, Yingfeng [4]
Hu, Henan [3]
Zhao, Xudong [1, 2]
Affiliations
[1] Institute of Intelligent Science and Technology, School of Control Science and Engineering, Dalian University of Technology, Liaoning, Dalian, 116024, China
[2] Key Laboratory of Intelligent Control and Optimization for Industrial Equipment of Ministry of Education, Dalian University of Technology, Liaoning, Dalian, 116024, China
[3] School of Mechanical Engineering, Dalian Jiaotong University, Liaoning, Dalian, 116028, China
[4] Center for Intelligent Multidimensional Data Analysis, Hong Kong Science Park, Hong Kong
DOI
10.1016/j.inffus.2024.102878
Abstract
Human pose estimation in crowded scenes is challenging because of frequent occlusions and complex interactions between individuals. To address these issues, we introduce InferTrans, a hierarchical structural fusion Transformer designed to improve crowded human pose estimation. InferTrans integrates semantic features into structural information using a hierarchical joint-limb-semantic fusion module. By reorganizing joints and limbs into a tree structure, the fusion module facilitates effective information exchange across structural levels and leverages both global structural information and local contextual details. Furthermore, we explicitly model limb structural patterns separately from joints, treating limbs as vectors with defined lengths and orientations. This allows our model to infer complete human poses from minimal input, significantly enhancing pose refinement. Extensive experiments on multiple datasets demonstrate that InferTrans outperforms existing pose estimation techniques in crowded and occluded scenarios. InferTrans also serves as a robust post-processing technique capable of improving the accuracy and robustness of pose estimation in challenging environments. © 2024 Elsevier B.V.
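The idea of treating a limb as a vector with a length and an orientation, attached to a tree of joints, can be illustrated with a small sketch. This is not the InferTrans implementation: the joint names, the three-limb skeleton, and the helper functions limb_vector and place_child are hypothetical, chosen only to show how a child joint could in principle be recovered from its parent joint plus a limb vector.

```python
import math

# Hypothetical skeleton (an assumption, not from the paper): each limb is a
# (parent, child) edge of a tree rooted at "neck".
LIMBS = [("neck", "shoulder"), ("shoulder", "elbow"), ("elbow", "wrist")]

def limb_vector(joints, parent, child):
    """Return (length, orientation in radians) of the limb from parent to child."""
    (px, py), (cx, cy) = joints[parent], joints[child]
    dx, dy = cx - px, cy - py
    return math.hypot(dx, dy), math.atan2(dy, dx)

def place_child(parent_xy, length, angle):
    """Recover a child joint position from its parent position and a limb vector."""
    px, py = parent_xy
    return px + length * math.cos(angle), py + length * math.sin(angle)

# Toy 2D joint coordinates, made up for illustration.
joints = {
    "neck": (0.0, 0.0),
    "shoulder": (3.0, 4.0),
    "elbow": (3.0, 9.0),
    "wrist": (6.0, 13.0),
}

# Describe every limb by its length and orientation.
limbs = {(p, c): limb_vector(joints, p, c) for p, c in LIMBS}

# If the wrist were occluded, it could be re-placed from the elbow and the limb vector.
wrist_xy = place_child(joints["elbow"], *limbs[("elbow", "wrist")])
```

In this toy setting the limb vectors are computed from known joints; in a learned model they would presumably be predicted, which is what would make inference from partial input possible.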