Head pose estimation with uncertainty and an application to dyadic interaction detection

Cited by: 2
Authors
Tomenotti, Federico Figari [1]
Noceti, Nicoletta [1]
Odone, Francesca [1]
Affiliations
[1] Univ Genoa, MaLGa DIBRIS, Via Dodecaneso 35, I-16146 Genoa, Italy
Keywords
Head pose estimation; Multi-task regression; Neural networks; Heteroscedastic uncertainty; Dyadic interaction detection; PEOPLE LOOKING; GAZE; COMMUNICATION; MODEL
DOI
10.1016/j.cviu.2024.103999
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Determining the visual focus of attention of people in a scene is fundamental to understanding social interactions from videos. Gaze direction is ideal for determining eye contact, a basic cue of non-verbal communication, but it is not always easy to recognize. Head direction is a well-known proxy of gaze direction that is more robust to the variability of the scene, and thus offers a valuable alternative. In this work, we consider HHP-net, a heteroscedastic neural network that estimates people's head pose from single frames using a minimal set of head key points. We formulate the problem as a multi-task regression that predicts the pose as a triplet of Euler angles from the output of a 2D pose estimator. HHP-net also provides a measure of the aleatoric heteroscedastic uncertainties associated with the angles, through an ad hoc loss function we introduce. In a thorough experimental analysis, we show that our model is efficient and effective compared with the state of the art, with only approximately 2 degrees of degradation in the worst case, counterbalanced by a space occupation approximately 12 times smaller. We also show the beneficial effects of uncertainty on interpretability. Finally, we discuss the robustness of our method to input variability, showing that it can be seen as a plug-in to different pose estimators. As a proof of concept, we address social interaction analysis, with an algorithm to detect dyadic interactions in images.
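The abstract does not spell out the loss function, so the following is only a minimal sketch of a commonly used heteroscedastic regression loss (a Gaussian negative log-likelihood with a predicted log-variance per angle), not necessarily the loss introduced by the authors. The function name `heteroscedastic_loss` and its arguments are hypothetical, chosen for illustration.

```python
import math
from typing import Sequence


def heteroscedastic_loss(
    pred_angles: Sequence[float],
    log_variances: Sequence[float],
    true_angles: Sequence[float],
) -> float:
    """Gaussian negative log-likelihood (up to a constant), averaged over angles.

    ASSUMPTION: this is a generic heteroscedastic loss, not the paper's exact one.
    Each angle (e.g. yaw, pitch, roll) has its own predicted log-variance, so the
    network can down-weight errors on inputs it considers unreliable, at the cost
    of a penalty that grows with the predicted variance.
    """
    terms = []
    for pred, log_var, true in zip(pred_angles, log_variances, true_angles):
        squared_error = (true - pred) ** 2
        # exp(-log_var) is the inverse variance; 0.5 * log_var penalizes
        # claiming a large uncertainty everywhere.
        terms.append(0.5 * math.exp(-log_var) * squared_error + 0.5 * log_var)
    return sum(terms) / len(terms)


# A perfect prediction with unit variance gives zero loss.
perfect = heteroscedastic_loss([10.0, -5.0, 0.0], [0.0, 0.0, 0.0], [10.0, -5.0, 0.0])

# For a large error on one angle, predicting a larger variance lowers the loss.
confident = heteroscedastic_loss([0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [10.0, 0.0, 0.0])
uncertain = heteroscedastic_loss([0.0, 0.0, 0.0], [math.log(100.0), 0.0, 0.0], [10.0, 0.0, 0.0])
```

In this sketch the predicted variance acts as a per-angle confidence: a high value signals that the corresponding angle estimate should be trusted less, which is the kind of interpretability benefit the abstract refers to.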
Pages: 14
Related papers
50 in total
  • [21] Driver Distraction Detection Method Based on Continuous Head Pose Estimation
    Zhao, Zuopeng
    Xia, Sili
    Xu, Xinzheng
    Zhang, Lan
    Yan, Hualin
    Xu, Yi
    Zhang, Zhongxin
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2020, 2020
  • [22] Detection of Malpractice in E-exams by Head Pose and Gaze Estimation
    Indi, Chirag S.
    Pritham, K. C. S. Varun
    Acharya, Vasundhara
    Prakasha, Krishna
    INTERNATIONAL JOURNAL OF EMERGING TECHNOLOGIES IN LEARNING, 2021, 16 (08) : 47 - 60
  • [23] Fast face detection and head pose estimation based BMPCA
    Yan, Yunyang
    Guo, Zhibo
    Yang, Jingyu
    DYNAMICS OF CONTINUOUS DISCRETE AND IMPULSIVE SYSTEMS-SERIES B-APPLICATIONS & ALGORITHMS, 2007, 14 : 302 - 306
  • [24] HHP-Net: A light Heteroscedastic neural network for Head Pose estimation with uncertainty
    Cantarini, Giorgio
    Tomenotti, Federico Figari L.
    Noceti, Nicoletta
    Odone, Francesca
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3341 - 3350
  • [25] Head Pose Estimation on Eyeglasses Using Line Detection and Classification Approach
    Setthawong, Pisal
    Vannija, Vajirasak
    ADVANCES IN INFORMATION TECHNOLOGY, 2010, 114 : 126 - 136
  • [26] Detection of head pose and gaze direction for human-computer interaction
    Weidenbacher, Ulrich
    Layher, Georg
    Bayerl, Pierre
    Neumann, Heiko
    PERCEPTION AND INTERACTIVE TECHNOLOGIES, PROCEEDINGS, 2006, 4021 : 9 - 19
  • [27] Uncertainty in pose estimation: A Bayesian approach
    Callari, FG
    Soucy, G
    Ferrie, FP
    FOURTEENTH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1 AND 2, 1998, : 972 - 976
  • [28] Head Pose Estimation Using a Coplanar Face Model for Human Computer Interaction
    Kim, Jin-Bum
    Kim, Hong-In
    Park, Rae-Hong
    2014 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS (ICCE), 2014, : 562 - 563
  • [29] Head pose estimation using stereo vision for human-robot interaction
    Seemann, E
    Nickel, K
    Stiefelhagen, R
    SIXTH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION, PROCEEDINGS, 2004, : 626 - 631
  • [30] WNet: Joint Multiple Head Detection and Head Pose Estimation from a Spectator Crowd Image
    Jan, Yasir
    Sohel, Ferdous
    Shiratuddin, Mohd Fairuz
    Wong, Kok Wai
    COMPUTER VISION - ACCV 2018 WORKSHOPS, 2019, 11367 : 484 - 493