HandFormer: Hand pose reconstructing from a single RGB image

Cited by: 0
Authors
Jiao, Zixun [1, 2]
Wang, Xihan [1, 2]
Li, Jingcao [1, 2]
Gao, Rongxin [1, 2]
He, Miao [1, 2]
Liang, Jiao [1, 2]
Xia, Zhaoqiang [3]
Gao, Quanli [1, 2]
Affiliations
[1] State & Local Joint Engn Res Ctr Adv Networking &, Xian 710048, Peoples R China
[2] Xian Polytech Univ, Sch Comp Sci, Informat Serv, Xian 710048, Shaanxi, Peoples R China
[3] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Hand attitude estimation; Hand attitude estimation and segmentation; Multitasking learning; Multitask progressive transformer framework; Multi-scale features;
DOI
10.1016/j.patrec.2024.05.019
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We propose a multi-task progressive Transformer framework that reconstructs hand poses from a single RGB image, addressing challenges such as hand occlusion, hand distraction, and hand shape bias. The framework comprises three key components: a feature extraction branch, a palm segmentation branch, and a parameter prediction branch. The feature extraction branch first employs the progressive Transformer to extract multi-scale features from the input image. These multi-scale features are then fed into a multi-layer perceptron (MLP) to obtain palm alignment features. To further enhance the parameter prediction features, we employ an efficient fusion module that integrates the palm alignment features with the backbone features. A dense hand model is generated using a pre-computed articulated mesh-deformation hand model. We evaluate the proposed method separately on the STEREO, FreiHAND, and HO3D datasets. The experimental results show that our approach achieves mean 3D errors of 10.92 mm, 12.33 mm, and 9.6 mm on these datasets, respectively.
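The abstract reports mean 3D error in millimetres but does not define the metric precisely. The sketch below shows one common reading, the mean Euclidean distance between predicted and ground-truth 3D joints. The function name `mean_joint_error_mm`, the toy coordinates, and the absence of any root-relative or Procrustes alignment are all assumptions made for illustration, not details taken from the paper.

```python
import math


def mean_joint_error_mm(pred, gt):
    """Mean Euclidean distance over corresponding 3D joints.

    Assumes both inputs are sequences of (x, y, z) tuples in millimetres,
    listed in the same joint order. No alignment step is applied.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    total = 0.0
    for p, g in zip(pred, gt):
        total += math.dist(p, g)  # per-joint Euclidean distance
    return total / len(pred)


# Hypothetical toy data: three joints, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 12.0)]

error = mean_joint_error_mm(pred, gt)
print(f"{error:.2f} mm")  # per-joint distances are 5, 0 and 12, so this prints 5.67 mm
```

Benchmark protocols often align the prediction to the ground truth (for example at the root joint) before measuring, so figures computed this way are only comparable when the same alignment convention is used.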
Pages: 155-164
Page count: 10
Related papers
50 in total
  • [31] Hierarchical topology based hand pose estimation from a single depth image
    Ji, Yanli
    Li, Haoxin
    Yang, Yang
    Li, Shuying
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (09) : 10553 - 10568
  • [32] Template based Human Pose and Shape Estimation from a Single RGB-D Image
    Li, Zhongguo
    Heyden, Anders
    Oskarsson, Magnus
    ICPRAM: PROCEEDINGS OF THE 8TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION APPLICATIONS AND METHODS, 2019, : 574 - 581
  • [33] Real-time 6D pose estimation from a single RGB image
    Zhang, Xin
    Jiang, Zhiguo
    Zhang, Haopeng
    IMAGE AND VISION COMPUTING, 2019, 89 : 1 - 11
  • [34] Monocular RGB Hand Pose Inference from Unsupervised Refinable Nets
    Dibra, Endri
    Melchior, Silvan
    Balkis, Ali
    Wolf, Thomas
    Oeztireli, Cengiz
    Gross, Markus
    PROCEEDINGS 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2018, : 1188 - 1198
  • [35] Reconstructing 3D human pose and shape from a single image and sparse IMUs
    Liao, Xianhua
    Zhuang, Jiayan
    Liu, Ze
    Dong, Jiayan
    Song, Kangkang
    Xiao, Jiangjian
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [36] Citrus pose estimation from an RGB image for automated harvesting
    Sun, Qixin
    Zhong, Ming
    Chai, Xiujuan
    Zeng, Zhikang
    Yin, Hesheng
    Zhou, Guomin
    Sun, Tan
    COMPUTERS AND ELECTRONICS IN AGRICULTURE, 2023, 211
  • [37] Accurate 3D hand mesh recovery from a single RGB image
    Pemasiri, A.
    Nguyen, K.
    Sridharan, S.
    Fookes, C.
    Scientific Reports, 12 (1)
  • [38] Simple and effective deep hand shape and pose regression from a single depth image
    Malik, Jameel
    Elhayek, Ahmed
    Nunnari, Fabrizio
    Stricker, Didier
    COMPUTERS & GRAPHICS-UK, 2019, 85 : 85 - 91
  • [39] Faster and finer pose estimation for multiple instance objects in a single RGB image
    Aing, Lee
    Lie, Wen-Nung
    Lin, Guo-Shiang
    IMAGE AND VISION COMPUTING, 2023, 130
  • [40] Fast 6D Pose from a Single RGB Image using Cascaded Forests Templates
    Munoz, E.
    Konishi, Y.
    Beltran, C.
    Murino, V.
    Del Bue, A.
    2016 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS 2016), 2016, : 4062 - 4069