Orientation Cues-Aware Facial Relationship Representation for Head Pose Estimation via Transformer

Cited by: 74
Authors
Liu, Hai [1]
Zhang, Cheng [1]
Deng, Yongjian [2]
Liu, Tingting [3, 4]
Zhang, Zhaoli [1]
Li, You-Fu [4]
Affiliations
[1] Cent China Normal Univ, Natl Engn Res Ctr Elearning, Wuhan 430079, Peoples R China
[2] Beijing Univ Technol, Coll Comp Sci, Beijing 100124, Peoples R China
[3] Hubei Univ, Sch Educ, Wuhan 430062, Hubei, Peoples R China
[4] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
Keywords
Head; Transformers; Visualization; Computer architecture; Pose estimation; Task analysis; Semantics; Head pose estimation; attention mechanism; relationship perception; deep learning; transformer;
DOI
10.1109/TIP.2023.3331309
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Head pose estimation (HPE) is an indispensable upstream task in human-machine interaction, self-driving, and attention detection. However, practical head pose applications face several challenges, such as severe occlusion, low illumination, and extreme orientations. To address these challenges, we identify three cues from head images: critical minority relationships, neighborhood orientation relationships, and significant facial changes. On the basis of these three cues, two key insights on head poses are revealed: 1) the intra-orientation relationship and 2) the cross-orientation relationship. To leverage these two insights, a novel relationship-driven method based on the Transformer architecture is proposed, in which facial and orientation relationships can be learned. Specifically, we design several orientation tokens to explicitly encode basic orientation regions. In addition, a novel token guide multi-loss function is designed to guide the orientation tokens as they learn the desired regional similarities and relationships. Experimental results on three challenging benchmark HPE datasets show that the proposed TokenHPE achieves state-of-the-art performance. Moreover, qualitative visualizations are provided to verify the effectiveness of the token-learning methodology.
Pages: 6289 - 6302
Number of pages: 14
Related papers
50 records in total
  • [31] An Entertainment Robot Based on Head Pose Estimation and Facial Expression Recognition
    Takahashi, Koichi
    Mitsukura, Yasue
    2012 PROCEEDINGS OF SICE ANNUAL CONFERENCE (SICE), 2012, : 2057 - 2061
  • [32] NOSE, EYES AND EARS: HEAD POSE ESTIMATION BY LOCATING FACIAL KEYPOINTS
    Gupta, Aryaman
    Thakkar, Kalpit
    Gandhi, Vineet
    Narayanan, P. J.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 1977 - 1981
  • [33] Cascaded Iterative Transformer for Jointly Predicting Facial Landmark, Occlusion Probability and Head Pose
    Yaokun Li
    Guang Tan
    Chao Gou
    International Journal of Computer Vision, 2024, 132 : 1242 - 1257
  • [34] Cascaded Iterative Transformer for Jointly Predicting Facial Landmark, Occlusion Probability and Head Pose
    Li, Yaokun
    Tan, Guang
    Gou, Chao
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (04) : 1242 - 1257
  • [35] Fast Head Pose Estimation via Rotation-Adaptive Facial Landmark Detection for Video Edge Computation
    Wang, Weiwei
    Chen, Xiaoyan
    Zheng, Shuangwu
    Li, Haiqing
    IEEE ACCESS, 2020, 8 : 45023 - 45032
  • [36] 6D ROTATION REPRESENTATION FOR UNCONSTRAINED HEAD POSE ESTIMATION
    Hempel, Thorsten
    Abdelrahman, Ahmed A.
    Al-Hamadi, Ayoub
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2496 - 2500
  • [37] Torso Orientation: A New Clue for Occlusion-Aware Human Pose Estimation
    Yu, Yang
    Yang, Baoyao
    Yuen, Pong C.
    2016 24TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2016, : 908 - 912
  • [38] DecenterNet: Bottom-Up Human Pose Estimation Via Decentralized Pose Representation
    Wang, Tao
    Jin, Lei
    Wang, Zhang
    Fan, Xiaojin
    Cheng, Yu
    Teng, Yinglei
    Xing, Junliang
    Zhao, Jian
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 1798 - 1808
  • [39] TransPose: 6D object pose estimation with geometry-aware Transformer
    Lin, Xiao
    Wang, Deming
    Zhou, Guangliang
    Liu, Chengju
    Chen, Qijun
    NEUROCOMPUTING, 2024, 589
  • [40] Corporal and Facial Cues of Dominance and their Relationship with Sociosexual Orientation in Young Chilean Men
    Polo, Pablo
    Antonio Munoz-Reyes, Jose
    FOLIA PRIMATOLOGICA, 2018, 89 (3-4) : 183 - 183