Aggregated pyramid gating network for human pose estimation without pre-training

Cited by: 8
Authors
Jiang, Chenru [1, 2]
Huang, Kaizhu [3]
Zhang, Shufei [4]
Wang, Xinheng [2]
Xiao, Jimin [2]
Goulermas, Yannis [1]
Affiliations
[1] Univ Liverpool, Dept Comp Sci, Liverpool L69 7ZX, England
[2] Xian Jiaotong Liverpool Univ, Dept Elect & Elect Engn, Suzhou 215123, Peoples R China
[3] Duke Kunshan Univ, Data Sci Res Ctr, Kunshan, Duke Ave 8, Suzhou 215316, Peoples R China
[4] Shanghai Artificial Intelligence Lab, 37th floor, AI Tower, 701 Yunjin Rd, Shanghai, Peoples R China
Keywords
Pyramid gating system; Stabilization; Human pose estimation
DOI
10.1016/j.patcog.2023.109429
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this work, we propose a comprehensive aggregated residual gating structure, the Pyramid GAting Network (PGA-Net), for human pose estimation, which can select, distill, and fuse semantic-level and natural-level information from multiple scales. In comparison, although they utilize multi-scale features, most existing state-of-the-art pose estimation methods are still limited in three aspects. First, multi-scale features contain massively redundant information, which is unfortunately not distilled by most existing approaches. Second, preferring deeper network structures to extract strong semantic features, conventional methods often ignore the fusion of original texture information. Third, to attain a good parameter initialization, current methods rely heavily on pre-training, which is very time-consuming or even unavailable. To better cope with these problems, our proposed PGA-Net distills high-level semantic features and replenishes low-level original information to reinforce module representation capability. Meanwhile, PGA-Net demonstrates notable training stability and superior performance even without pre-training. Extensive experiments demonstrate that our method consistently outperforms previous approaches even without pre-training, thus enabling end-to-end model training from scratch. On the COCO benchmark, PGA-Net consistently achieves over 3% improvement over the baseline (without pre-training) under various model configurations. (c) 2023 Elsevier Ltd. All rights reserved.
Pages: 13