Image-free Domain Generalization via CLIP for 3D Hand Pose Estimation

Cited by: 7
Authors
Lee, Seongyeong [1, 2]
Park, Hansoo [1]
Kim, Dong Uk [1]
Kim, Jihyeon [1]
Boboev, Muhammadjon [1]
Baek, Seungryul [1]
Affiliations
[1] UNIST, Ulsan, South Korea
[2] NC Soft, Seongnam, South Korea
DOI
10.1109/WACV56688.2023.00295
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-based 3D hand pose estimation has been successful for decades thanks to large-scale databases and deep learning. However, hand pose estimation networks perform poorly on hand pose images whose characteristics differ substantially from the training data. This is caused by factors such as illumination, camera angle, and diverse backgrounds in the input images. Many existing methods address this by supplying additional large-scale unconstrained or target-domain images to augment the data space; however, collecting such large-scale image sets is labor-intensive. In this paper, we present a simple image-free domain generalization approach for the hand pose estimation framework that uses only source-domain data. We manipulate the image features of the hand pose estimation network by adding features from text descriptions obtained with the CLIP (Contrastive Language-Image Pre-training) model. The manipulated image features are then used to train the hand pose estimation network in a contrastive learning framework. In experiments on the STB and RHD datasets, our algorithm shows improved performance over state-of-the-art domain generalization approaches.
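The two steps named in the abstract (adding text features to image features, then training with a contrastive objective) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: random arrays stand in for the image-encoder features and the CLIP text embedding, InfoNCE is swapped in as one common contrastive loss, and the names `manipulate_features`, `info_nce`, `scale`, and `temperature` are hypothetical.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def manipulate_features(image_feats, text_feat, scale=0.5):
    """Shift each image feature toward a text feature, then re-normalize.

    Assumption: a plain additive shift; the paper's exact rule may differ.
    """
    return l2_normalize(image_feats + scale * text_feat)

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `anchors` should match row i of `positives`."""
    logits = l2_normalize(anchors) @ l2_normalize(positives).T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
# Stand-ins (assumptions): 4 image features and 1 text embedding, dimension 8.
image_feats = l2_normalize(rng.normal(size=(4, 8)))
text_feat = l2_normalize(rng.normal(size=(8,)))

augmented = manipulate_features(image_feats, text_feat)
loss = info_nce(image_feats, augmented)
print(augmented.shape, loss)
```

In a real pipeline the text embedding would come from a CLIP text encoder applied to a description of the target appearance, and the loss would be backpropagated through the pose network; both are outside this sketch.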
Pages: 2933-2943
Number of pages: 11
Related papers
50 in total
  • [21] Residual Attention Regression for 3D Hand Pose Estimation
    Li, Jing
    Zhang, Long
    Ju, Zhaojie
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT IV, 2019, 11743 : 605 - 614
  • [22] Bayesian Image Based 3D Pose Estimation
    Sanzari, Marta
    Ntouskos, Valsamis
    Pirri, Fiora
    COMPUTER VISION - ECCV 2016, PT VIII, 2016, 9912 : 566 - 582
  • [23] 3D Hand Pose Estimation Based on External Attention
    Li, Shoukun
    Pan, Xiaoying
    Wang, Beibei
    Gao, Jialong
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 490 - 495
  • [24] Aligning Latent Spaces for 3D Hand Pose Estimation
    Yang, Linlin
    Li, Shile
    Lee, Dongheui
    Yao, Angela
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 2335 - 2343
  • [25] PEAN: 3D Hand Pose Estimation Adversarial Network
    Sun, Linhui
    Zhang, Yifan
    Cheng, Jian
    Lu, Hanqing
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 1251 - 1258
  • [26] CASCADED POINT NETWORK FOR 3D HAND POSE ESTIMATION
    Dou, Yikun
    Wang, Xuguang
    Zhu, Yuying
    Deng, Xiaoming
    Ma, Cuixia
    Chang, Liang
    Wang, Hongan
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1982 - 1986
  • [27] Database indexing methods for 3D hand pose estimation
    Athitsos, V
    Sclaroff, S
    GESTURE-BASED COMMUNICATION IN HUMAN-COMPUTER INTERACTION, 2003, 2915 : 288 - 299
  • [28] 3D Hand Pose Estimation on Conventional Capacitive Touchscreens
    Choi, Frederick
    Mayer, Sven
    Harrison, Chris
    PROCEEDINGS OF 23RD ACM INTERNATIONAL CONFERENCE ON MOBILE HUMAN-COMPUTER INTERACTION (MOBILEHCI 2021): MOBILE APART, MOBILE TOGETHER, 2021,
  • [29] 3D Hand Pose Estimation in Everyday Egocentric Images
    Prakash, Aditya
    Tu, Ruisen
    Chang, Matthew
    Gupta, Saurabh
    COMPUTER VISION - ECCV 2024, PT LXXVIII, 2025, 15136 : 183 - 202
  • [30] 3D hand pose estimation from a single RGB image by weighting the occlusion and classification
    Mahdikhanlou, Khadijeh
    Ebrahimnezhad, Hossein
    PATTERN RECOGNITION, 2023, 136