Improved DenseNet network for human pose estimation

Authors
Shi Y.-X. [1, 2]
Xu X.-Q. [1]
Affiliations
[1] College of Information Engineering, Xiangtan University, Xiangtan
[2] LED Lighting Research and Technology Center of Guizhou, Tongren
Source
Kongzhi yu Juece/Control and Decision | 2021 / Vol. 36 / No. 05
Keywords
Deep learning; DenseNet; Human pose estimation; Keypoints detection; Multi-scale feature; Scale transformation
DOI
10.13195/j.kzyjc.2019.1218
Abstract
Common keypoint detection methods suffer from limited speed and poor detection performance because the number of people in an image is uncertain and the relative sizes of human bodies and body parts vary. To address this, an improved DenseNet network structure is proposed for human pose estimation. The network is single-stage and end-to-end, and uses a deep convolutional neural network for feature extraction. At the end of the convolutional network, a dedicated scale-transfer structure produces feature maps at 6 different scales. The network then integrates features from different levels for multi-scale keypoint detection, which effectively improves keypoint detection accuracy. A bottom-up approach is adopted to maintain processing speed in the multi-person pose estimation task. Experiments show that the method improves the mean average precision of multi-person keypoint detection by 1% compared with other general methods, providing a new way to balance speed and accuracy in pose estimation. Copyright ©2021 Control and Decision.
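The abstract does not say which operations the scale-transfer structure uses. As a rough illustration only, the sketch below assumes one common design: mean pooling to produce coarser maps and a depth-to-space (pixel-shuffle) rearrangement to produce finer ones, giving 6 scales from a single feature map. The function names, the pooling factors (8, 4, 2), and the upscaling factors (2, 4) are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the pooling/upscaling factors and function names
# are assumptions, not the authors' published configuration.
# Feature maps are nested lists indexed as fmap[channel][row][col].

def shape(fmap):
    """Return (channels, height, width) of a nested-list feature map."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))

def avg_pool(fmap, k):
    """Mean-pool with a k x k window and stride k (coarser scale, same channels)."""
    c_n, h, w = shape(fmap)
    return [[[sum(fmap[c][i * k + di][j * k + dj]
                  for di in range(k) for dj in range(k)) / (k * k)
              for j in range(w // k)]
             for i in range(h // k)]
            for c in range(c_n)]

def depth_to_space(fmap, r):
    """Rearrange (C*r*r, H, W) into (C, H*r, W*r): trades channels for resolution."""
    c_in, h, w = shape(fmap)
    assert c_in % (r * r) == 0, "channel count must be divisible by r*r"
    c_out = c_in // (r * r)
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(h):
            for j in range(w):
                for di in range(r):
                    for dj in range(r):
                        out[c][i * r + di][j * r + dj] = fmap[c * r * r + di * r + dj][i][j]
    return out

def six_scales(fmap):
    """Build 6 feature maps: 3 pooled, the identity, and 2 depth-to-space maps."""
    return [
        avg_pool(fmap, 8),
        avg_pool(fmap, 4),
        avg_pool(fmap, 2),
        fmap,
        depth_to_space(fmap, 2),
        depth_to_space(fmap, 4),
    ]

if __name__ == "__main__":
    # Toy input: 16 channels of 8 x 8, each channel filled with its own index.
    toy = [[[float(c) for _ in range(8)] for _ in range(8)] for c in range(16)]
    for scale in six_scales(toy):
        print(shape(scale))
```

In this sketch the coarser maps keep all 16 channels while the finer maps have fewer channels (4 and 1), because depth-to-space only moves existing values into spatial positions and adds no parameters.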
Pages: 1206-1212
Page count: 6