Wide Range Head Pose Estimation Using a Single RGB Camera for Intelligent Surveillance

Cited by: 9
Authors
Rahmaniar, Wahyu [1]
ul Haq, Qazi Mazhar [1]
Lin, Ting-Lan [1, 2]
Affiliations
[1] Natl Taipei Univ Technol, Dept Elect Engn, Taipei 10608, Taiwan
[2] Chung Yuan Christian Univ, Dept Elect Engn, Taoyuan 320314, Taiwan
Keywords
Head; Pose estimation; Feature extraction; Three-dimensional displays; Magnetic heads; Deep learning; Real-time systems; CNN; coarse-fine classification; deep learning; Euler angle; head pose estimation; intelligent surveillance;
DOI
10.1109/JSEN.2022.3168863
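Chinese Library Classification (CLC) codes
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Subject classification codes
0808; 0809;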
Abstract
Head pose estimation is one of the sensing systems needed for intelligent surveillance applications such as human behavior analysis, intelligent driver assistance, visual attention, and monitoring. These systems require accurate alignment and prediction of head movement direction. Previous methods depend heavily on facial landmarks and depth information. Usually, the head pose is measured by estimating several keypoints, which requires a correct head pose mapping to get accurate results. Moreover, facial landmarks degrade performance when the face is occluded or not adequately visible. This paper proposes a method for head pose estimation under various facial conditions, such as occlusion and challenging viewpoints. We present a combination of coarse and fine feature map classification to train a multi-loss deep convolutional neural network (CNN) that predicts precise Euler angles (yaw, pitch, roll) of the head position without keypoints or landmarks. Our proposed method uses more quantization units for angle classification to learn coarse and fine structure mapping, giving better spatial clustering features on an RGB image from a single camera. The experiments are performed on benchmark datasets and on some head poses in real cases. The mean average error of prediction is 5.06 degrees, 4.06 degrees, and 2.96 degrees for the AFLW2000, AFLW, and BIWI datasets, respectively, which significantly improves head pose estimation performance compared to previous methods. Additionally, the proposed method runs at 11 frames per second, a computation time that outperforms previous approaches and is beneficial for real-life applications.
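The abstract gives no implementation details, so the following is only an illustrative sketch of the general idea of coarse and fine angle classification combined in a multi-loss objective. It is not the authors' network or loss. The angle range of -99 to 99 degrees, the bin widths of 33 and 3 degrees, the weight `alpha`, and all function names are assumptions chosen for the example.

```python
import math

def angle_to_bin(angle_deg, bin_width, lo=-99.0, hi=99.0):
    """Map an angle in degrees to a class index (assumed range [lo, hi))."""
    clipped = min(max(angle_deg, lo), hi - 1e-9)
    return int((clipped - lo) // bin_width)

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_angle(logits, bin_width, lo=-99.0):
    """Probability-weighted mean of bin centers, in degrees."""
    probs = softmax(logits)
    return sum(p * (lo + (i + 0.5) * bin_width) for i, p in enumerate(probs))

def multi_loss(logits_by_width, true_angle, alpha=1.0, lo=-99.0):
    """Cross-entropy at every granularity plus a weighted squared error
    on the expected angle of the finest head (hypothetical weighting)."""
    total = 0.0
    for width, logits in logits_by_width.items():
        probs = softmax(logits)
        target = angle_to_bin(true_angle, width, lo)
        total += -math.log(probs[target])
    fine_width = min(logits_by_width)  # smallest bin width = finest head
    pred = expected_angle(logits_by_width[fine_width], fine_width, lo)
    return total + alpha * (pred - true_angle) ** 2

# Example: one Euler angle (say yaw) with a 6-bin coarse head and a 66-bin fine head.
heads = {33.0: [0.0] * 6, 3.0: [0.0] * 66}
loss = multi_loss(heads, true_angle=0.0)
```

In this sketch the coarse head only has to place the angle in a broad sector, while the fine head resolves it within a few degrees. The expectation over the fine bins turns the classification output into a continuous angle. In a full model, one such set of heads would be needed for each of yaw, pitch, and roll.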
Pages: 11112-11121
Page count: 10