Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation

Cited by: 7
Authors
Lee, Seongyeong [1, 2]
Park, Hansoo [1 ]
Kim, Dong Uk [1 ]
Kim, Jihyeon [1 ]
Boboev, Muhammadjon [1 ]
Baek, Seungryul [1 ]
Affiliations
[1] UNIST, Ulsan, South Korea
[2] NC Soft, Seongnam, South Korea
DOI
10.1109/WACV56688.2023.00295
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-based 3D hand pose estimation has been successful for decades thanks to large-scale databases and deep learning. However, a hand pose estimation network does not perform well on hand pose images whose characteristics differ greatly from the training data. This gap is caused by factors such as illumination, camera angle, and diverse backgrounds in the input images. Many existing methods address it by supplying additional large-scale unconstrained or target-domain images to augment the data space; however, collecting such large-scale images is labor-intensive. In this paper, we present a simple image-free domain generalization approach for the hand pose estimation framework that uses only source-domain data. We manipulate the image features of the hand pose estimation network by adding features from text descriptions obtained with the CLIP (Contrastive Language-Image Pre-training) model. The manipulated image features are then used to train the hand pose estimation network within a contrastive learning framework. In experiments on the STB and RHD datasets, our algorithm shows improved performance over state-of-the-art domain generalization approaches.
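The two steps named in the abstract (shifting image features with text features, then training with a contrastive objective) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the additive mixing weight `alpha`, the temperature, and the InfoNCE form of the contrastive loss are all assumptions, and random vectors stand in for real CLIP image and text embeddings.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def manipulate_features(image_feats, text_feats, alpha=0.5):
    """Hypothetical manipulation: add a weighted text feature to each
    image feature, then renormalize. alpha is an assumed mixing weight."""
    shifted = l2_normalize(image_feats) + alpha * l2_normalize(text_feats)
    return l2_normalize(shifted)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss: row i of anchors should match row i of
    positives; all other rows in the batch act as negatives."""
    logits = l2_normalize(anchors) @ l2_normalize(positives).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Random stand-ins for CLIP embeddings (assumption: batch of 4, 16 dimensions).
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(4, 16))
text_feats = rng.normal(size=(4, 16))

manipulated = manipulate_features(image_feats, text_feats)
loss = info_nce(image_feats, manipulated)
```

In a real pipeline the text features would come from a pretrained CLIP text encoder and the loss would be backpropagated through the pose network; the sketch only shows the shape of the computation.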
Pages: 2933-2943
Page count: 11
Related papers
50 in total
  • [1] CLIP-Hand3D: Exploiting 3D Hand Pose Estimation via Context-Aware Prompting
    Guo, Shaoxiang
    Cai, Qing
    Qi, Lin
    Dong, Junyu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 4896-4907
  • [2] 3D Hand Pose Estimation with a Single Infrared Camera via Domain Transfer Learning
    Park, Gabyong
    Kim, Tae-Kyun
    Woo, Woontack
    2020 IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY (ISMAR 2020), 2020: 588-599
  • [3] Realistic Depth Image Synthesis for 3D Hand Pose Estimation
    Zhou, Jun
    Xu, Chi
    Ge, Yuting
    Cheng, Li
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26: 5246-5256
  • [4] Review on 3D Hand Pose Estimation Based on a RGB Image
    Xiao Y.
    Liu Y.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2024, 36(02): 161-172
  • [5] Cross-Domain 3D Hand Pose Estimation with Dual Modalities
    Lin, Qiuxia
    Yang, Linlin
    Yao, Angela
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 17184-17193
  • [6] A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation
    Peng, Qucheng
    Zheng, Ce
    Chen, Chen
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024: 2240-2249
  • [7] 3D Hand Shape and Pose Estimation from a Single RGB Image
    Ge, Liuhao
    Ren, Zhou
    Li, Yuncheng
    Xue, Zehao
    Wang, Yingying
    Cai, Jianfei
    Yuan, Junsong
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 10825-10834
  • [8] 3D Hand Pose Estimation via Graph-Based Reasoning
    Song, Jae-Hun
    Kang, Suk-Ju
    IEEE ACCESS, 2021, 9: 35824-35833
  • [9] Temporal Hints in 3D Hand Pose Estimation
    Yu, Taidong
    Cao, Zhiguo
    Xiao, Yang
    Zhang, Boshen
    Zhu, Zihao
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020: 2042-2047
  • [10] Dense 3D Regression for Hand Pose Estimation
    Wan, Chengde
    Probst, Thomas
    Van Gool, Luc
    Yao, Angela
    2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 5147-5156