Aggregated pyramid gating network for human pose estimation without pre-training

Cited by: 8
Authors
Jiang, Chenru [1, 2]
Huang, Kaizhu [3]
Zhang, Shufei [4]
Wang, Xinheng [2]
Xiao, Jimin [2]
Goulermas, Yannis [1]
Affiliations
[1] Univ Liverpool, Dept Comp Sci, Liverpool L69 7ZX, England
[2] Xian Jiaotong Liverpool Univ, Dept Elect & Elect Engn, Suzhou 215123, Peoples R China
[3] Duke Kunshan Univ, Data Sci Res Ctr, Kunshan, Duke Ave 8, Suzhou 215316, Peoples R China
[4] Shanghai Artificial Intelligence Lab, 37th floor, AI Tower, 701 Yunjin Rd, Shanghai, Peoples R China
Keywords
Pyramid gating system; Stabilization; Human pose estimation;
DOI
10.1016/j.patcog.2023.109429
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we propose a comprehensive aggregated residual gating structure, the Pyramid GAting Network (PGA-Net), for human pose estimation, which can select, distill, and fuse semantic-level and natural-level information from multiple scales. In comparison, although they utilize multi-scale features, most existing state-of-the-art pose estimation methods are still limited in three aspects. First, multi-scale features contain massively redundant information, which most existing approaches unfortunately do not distill. Second, because they prefer deeper network structures to extract strong semantic features, conventional methods often ignore the fusion of original texture information. Third, to attain a good parameter initialization, current methods rely heavily on pre-training, which is very time-consuming or even unavailable. While better coping with the above problems, our proposed PGA-Net distills high-level semantic features and replenishes low-level original information to reinforce module representation capability. Meanwhile, PGA-Net demonstrates notable training stability and superior performance even without pre-training. Extensive experiments demonstrate that our method consistently outperforms previous approaches even without pre-training, thus enabling end-to-end model training from scratch. On the COCO benchmark, PGA-Net consistently achieves improvements of over 3% over the baseline (without pre-training) under various model configurations. (c) 2023 Elsevier Ltd. All rights reserved.
Pages: 13
Related papers
50 in total
  • [11] A Text Image Super-Resolution Generation Network without Pre-training
    Zhang, Qingyong
    Ye, Ziliu
    Leng, Zhiwen
    Yue, Qi
    2020 35TH YOUTH ACADEMIC ANNUAL CONFERENCE OF CHINESE ASSOCIATION OF AUTOMATION (YAC), 2020, : 515 - 519
  • [12] Cascaded Pyramid Network for Multi-Person Pose Estimation
    Chen, Yilun
    Wang, Zhicheng
    Peng, Yuxiang
    Zhang, Zhiqiang
    Yu, Gang
    Sun, Jian
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 7103 - 7112
  • [13] Improving Monocular Depth Estimation by Semantic Pre-training
    Rottmann, Peter
    Posewsky, Thorbjorn
    Milioto, Andres
    Stachniss, Cyrill
    Behley, Jens
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 5916 - 5923
  • [14] Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
    Tompson, Jonathan
    Jain, Arjun
    LeCun, Yann
    Bregler, Christoph
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [15] Multi-person Pose Estimation for Pose Tracking with Enhanced Cascaded Pyramid Network
    Yu, Dongdong
    Su, Kai
    Sun, Jia
    Wang, Changhu
    COMPUTER VISION - ECCV 2018 WORKSHOPS, PT II, 2019, 11130 : 221 - 226
  • [16] PVSPE: A pyramid vision multitask transformer network for spacecraft pose estimation
    Yang, Hong
    Xiao, Xueming
    Yao, Meibao
    Xiong, Yonggang
    Cui, Hutao
    Fu, Yuegang
    ADVANCES IN SPACE RESEARCH, 2024, 74 (03) : 1327 - 1342
  • [17] JointGraph: joint pre-training framework for traffic forecasting with spatial-temporal gating diffusion graph attention network
    Kong, Xiangyuan
    Wei, Xiang
    Zhang, Jian
    Xing, Weiwei
    Lu, Wei
    APPLIED INTELLIGENCE, 2023, 53 (11) : 13723 - 13740
  • [18] JointGraph: joint pre-training framework for traffic forecasting with spatial-temporal gating diffusion graph attention network
    Xiangyuan Kong
    Xiang Wei
    Jian Zhang
    Weiwei Xing
    Wei Lu
    Applied Intelligence, 2023, 53 : 13723 - 13740
  • [19] Hardmining Training via Self-Adversarial Network for Human Pose Estimation
    Zhang, Sai
    Zhu, Aichun
    Cao, Qinfeng
    Tang, Shiyu
    Cui, Ran
    Wang, Tian
    Hua, Gang
    Xu, Zhenyu
    2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 3717 - 3721
  • [20] Pre-training Summarization Models of Structured Datasets for Cardinality Estimation
    Lu, Yao
    Kandula, Srikanth
    Konig, Arnd Christian
    Chaudhuri, Surajit
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 15 (03) : 414 - 426