HandFormer: Hand pose reconstructing from a single RGB image

Cited by: 0

Authors:

Jiao, Zixun [1,2]

Wang, Xihan [1,2]

Li, Jingcao [1,2]

Gao, Rongxin [1,2]

He, Miao [1,2]

Liang, Jiao [1,2]

Xia, Zhaoqiang [3]

Gao, Quanli [1,2]

Affiliations:

[1] State & Local Joint Engn Res Ctr Adv Networking &, Xian 710048, Peoples R China

[2] Xian Polytech Univ, Sch Comp Sci, Informat Serv, Xian 710048, Shaanxi, Peoples R China

[3] Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China

Source:

PATTERN RECOGNITION LETTERS | 2024 / Vol. 183

Funding:

National Natural Science Foundation of China;

Keywords:

Hand attitude estimation; Hand attitude estimation and segmentation; Multitasking learning; Multitask progressive transformer framework; Multi-scale features;

DOI:

10.1016/j.patrec.2024.05.019

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose a multi-task progressive Transformer framework that reconstructs hand poses from a single RGB image and addresses challenges such as hand occlusion, hand distraction, and hand shape bias. The framework comprises three key components: a feature extraction branch, a palm segmentation branch, and a parameter prediction branch. The feature extraction branch first employs the progressive Transformer to extract multi-scale features from the input image. These multi-scale features are then fed into a multi-layer perceptron (MLP) layer to obtain palm alignment features. To further enhance the parameter prediction features, we employ an efficient fusion module that integrates the palm alignment features with the backbone features. A dense hand model is generated using a pre-computed articulated mesh deformed hand model. We evaluate the method separately on the STEREO, FreiHAND, and HO3D datasets. The experimental results show that our approach achieves 3D mean errors of 10.92 mm, 12.33 mm, and 9.6 mm on the respective datasets.
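As a reading aid for the reported figures, the sketch below shows one common way a 3D mean error is computed: the average Euclidean distance between predicted and ground-truth joint positions. This is an assumption about the metric, since the abstract does not define it or state any alignment step. The function name `mean_joint_error_mm` and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_joint_error_mm(pred, gt):
    """Average Euclidean distance between matching 3D joints.

    Assumed metric definition (not confirmed by the abstract).
    The result is in the same unit as the inputs (millimetres here).
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Hypothetical toy data: three joints, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 12.0)]

# Per-joint distances are 5, 0, and 12 mm, so the mean is 17/3 mm.
error = mean_joint_error_mm(pred, gt)
print(f"{error:.2f} mm")
```

Benchmarks often apply a root-joint or Procrustes alignment before this average; which protocol produced the numbers above is not stated in this record.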


Pages: 155-164

Page count: 10
