Pose-native Neural Architecture Search for Multi-person Pose Estimation

Cited by: 9
Authors
Bao, Qian [1]
Liu, Wu [1]
Hong, Jun [1]
Duan, Lingyu [2]
Mei, Tao [1]
Affiliations
[1] AI Res JD Com, Beijing, Peoples R China
[2] Peking Univ, Natl Engn Lab Video Technol, Beijing, Peoples R China
Keywords
Multi-person pose estimation; Neural architecture search
DOI
10.1145/3394171.3413842
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multi-person pose estimation has made great progress in recent years, but the precise prediction of occluded and invisible hard keypoints remains challenging. Most human pose estimation networks are equipped with an image classification-based pose encoder for feature extraction and a handcrafted pose decoder for high-resolution representations. However, the pose encoder might be sub-optimal because of the gap between image classification and pose estimation. The widely used multi-scale feature fusion in the pose decoder is still coarse and cannot provide sufficient high-resolution detail for hard keypoints. Neural Architecture Search (NAS) has shown great potential in many visual tasks for automatically searching efficient networks. In this work, we present the Pose-native Network Architecture Search (PoseNAS) to simultaneously design a better pose encoder and pose decoder for pose estimation. Specifically, we directly search a data-oriented pose encoder with stacked searchable cells, which can provide an optimal feature extractor for the pose-specific task. In the pose decoder, we exploit scale-adaptive fusion cells to promote rich information exchange across the multi-scale feature maps. Meanwhile, the pose decoder adopts a Fusion-and-Enhancement manner to progressively boost the high-resolution representations, which are essential for the precise prediction of hard keypoints. With the carefully designed search space and search strategy, PoseNAS can simultaneously search all modules in an end-to-end manner. PoseNAS achieves state-of-the-art performance on three public datasets, MPII, COCO, and PoseTrack, with fewer parameters than existing methods. Our best model obtains 76.7% mAP on the COCO validation set and 75.9% mAP on the test set with only 33.6M parameters. Code and implementation are available at https://github.com/for-code0216/PoseNAS.
Pages: 592-600
Page count: 9