Pixel-in-Pixel Net: Towards Efficient Facial Landmark Detection in the Wild

Cited by: 69
Authors
Jin, Haibo [1]
Liao, Shengcai [1]
Shao, Ling [1, 2]
Affiliations
[1] Incept Inst Artificial Intelligence IIAI, Abu Dhabi, U Arab Emirates
[2] Mohamed Bin Zayed Univ Artificial Intelligence MB, Abu Dhabi, U Arab Emirates
Keywords
Facial landmark detection; Pixel-in-pixel regression; Self-training with curriculum; Unsupervised domain adaptation; REPRESENTATION; NETWORK
DOI
10.1007/s11263-021-01521-4
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recently, heatmap regression models have become popular due to their superior performance in locating facial landmarks. However, three major problems still exist among these models: (1) they are computationally expensive; (2) they usually lack explicit constraints on global shapes; (3) domain gaps are commonly present. To address these problems, we propose Pixel-in-Pixel Net (PIPNet) for facial landmark detection. The proposed model is equipped with a novel detection head based on heatmap regression, which conducts score and offset predictions simultaneously on low-resolution feature maps. By doing so, repeated upsampling layers are no longer necessary, enabling the inference time to be largely reduced without sacrificing model accuracy. Besides, a simple but effective neighbor regression module is proposed to enforce local constraints by fusing predictions from neighboring landmarks, which enhances the robustness of the new detection head. To further improve the cross-domain generalization capability of PIPNet, we propose self-training with curriculum. This training strategy is able to mine more reliable pseudo-labels from unlabeled data across domains by starting with an easier task, then gradually increasing the difficulty to provide more precise labels. Extensive experiments demonstrate the superiority of PIPNet, which obtains new state-of-the-art results on three out of six popular benchmarks under the supervised setting. The results on two cross-domain test sets are also consistently improved compared to the baselines. Notably, our lightweight version of PIPNet runs at 35.7 FPS and 200 FPS on CPU and GPU, respectively, while still maintaining a competitive accuracy to state-of-the-art methods. The code of PIPNet is available at https://github.com/jhb86253817/PIPNet.
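The abstract describes a detection head that predicts scores and offsets together on a low-resolution feature map. A minimal sketch of how one landmark might be decoded from such outputs is given below. It is an illustration under stated assumptions, not the authors' implementation: the function name `decode_landmark`, the array layout, the offset range of [0, 1) within a grid cell, and the `stride` parameter are all hypothetical. The released code at the URL above is the authoritative reference.

```python
import numpy as np


def decode_landmark(score_map, offset_x, offset_y, stride):
    """Decode one landmark from a low-resolution score map plus offsets.

    Assumptions (not taken from the paper's code):
      - score_map, offset_x, offset_y have shape (H, W) for a single landmark;
      - offsets are fractions of one grid cell, in [0, 1);
      - stride is the ratio between input image size and feature-map size.
    """
    h, w = score_map.shape
    # Select the grid cell with the highest score.
    row, col = np.unravel_index(np.argmax(score_map), (h, w))
    # Refine the coarse cell position with the offsets predicted at that cell,
    # then scale back to input-image pixel coordinates.
    x = (col + offset_x[row, col]) * stride
    y = (row + offset_y[row, col]) * stride
    return float(x), float(y)


# Usage: an 8x8 feature map for a 256x256 input (stride 32).
scores = np.zeros((8, 8))
scores[3, 5] = 1.0
off_x = np.zeros((8, 8))
off_y = np.zeros((8, 8))
off_x[3, 5] = 0.25
off_y[3, 5] = 0.5
print(decode_landmark(scores, off_x, off_y, stride=32))  # (168.0, 112.0)
```

Because the offsets recover sub-cell precision, this kind of decoding needs no upsampling of the feature map, which is consistent with the abstract's point that repeated upsampling layers become unnecessary.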
Pages: 3174-3194
Page count: 21
Related papers
50 in total
  • [11] COLOR FACIAL IMAGE DENOISING BASED ON RPCA AND NOISY PIXEL DETECTION
    Yuan, Zhaojun
    Xie, Xudong
    Ma, Xiaolong
    Lam, Kin-Man
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 2449 - 2453
  • [12] Pixel comparison arithmetic for facial feature detection system and its application
    Li, JK
    Chen, SW
    Zhang, R
    ICEMI 2005: Conference Proceedings of the Seventh International Conference on Electronic Measurement & Instruments, Vol 6, 2005, : 352 - 354
  • [13] Pixel-level Crack Detection using U-Net
    Cheng, Jierong
    Xiong, Wei
    Chen, Wenyu
    Gu, Ying
    Li, Yusha
    PROCEEDINGS OF TENCON 2018 - 2018 IEEE REGION 10 CONFERENCE, 2018, : 0462 - 0466
  • [14] Enhancement of Criminal Facial Image Using Multistage Progressive V-Net for Facial Recognition by Pixel Restoration
    Benslet, S. S. Beulah
    Parameswari, P.
    EAI ENDORSED TRANSACTIONS ON SCALABLE INFORMATION SYSTEMS, 2024, 11 (03): : 1 - 12
  • [15] Towards Accurate Facial Landmark Detection via Cascaded Transformers
    Li, Hui
    Guo, Zidong
    Rhee, Seon-Min
    Han, Seungju
    Han, Jae-Joon
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 4166 - 4175
  • [16] Constrained Local Neural Fields for robust facial landmark detection in the wild
    Baltrusaitis, Tadas
    Robinson, Peter
    Morency, Louis-Philippe
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2013, : 354 - 361
  • [17] Real-time Efficient Facial Landmark Detection Algorithms
    Xiong, Hanying
    Lu, Tongwei
    Zhang, Hongzhi
    AIPR 2020: 2020 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND PATTERN RECOGNITION, 2020, : 191 - 195
  • [18] PEGG-Net: Pixel-Wise Efficient Grasp Generation in Complex Scenes
    Wang, Haozhe
    Liu, Zhiyang
    Zhou, Lei
    Yin, Huan
    Ang, Marcelo H., Jr.
    2024 IEEE INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, CIS AND IEEE INTERNATIONAL CONFERENCE ON ROBOTICS, AUTOMATION AND MECHATRONICS, RAM, CIS-RAM 2024, 2024, : 199 - 206
  • [19] An Efficient Fragile Watermarking Scheme for Pixel-wise Tamper Detection
    Lee, Yang-Kuo
    Chang, Jen-Chun
    Wu, Hsin-Lung
    Chen, Rong-Jaye
    2012 SIXTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTING (ICGEC), 2012, : 149 - 152
  • [20] Towards Unconstrained Facial Landmark Detection Robust to Diverse Cropping Manners
    Zou, Xu
    Xiao, Peng
    Wang, Jianhui
    Yan, Luxin
    Zhong, Sheng
    Wu, Ying
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (05) : 2070 - 2075