Exploiting enhanced and robust RGB-D face representation via progressive multi-modal learning

Cited by: 1
Authors
Zhu, Yizhe [1 ,2 ]
Gao, Jialin [1 ,2 ]
Wu, Tianshu [2 ]
Liu, Qiong [2 ]
Zhou, Xi [2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr, Shanghai 200240, Peoples R China
[2] CloudWalk Technol, Shanghai 201203, Peoples R China
Keywords
RGB-D face recognition; Multi-modal fusion; Depth enhancement; Multi-head-attention mechanism; Incomplete modal data; ATTENTION;
DOI
10.1016/j.patrec.2022.12.027
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing RGB-based 2D face recognition approaches are sensitive to facial variations, posture, occlusion, and illumination. Depth-based methods have been shown to alleviate this sensitivity by introducing geometric information, but they rely heavily on high-quality depth maps from high-cost RGB-D cameras. To address this, we propose a Progressive Multi-modal Fusion framework that exploits an enhanced and robust face representation for RGB-D face recognition with low-cost RGB-D cameras and also handles incomplete RGB-D modal data. Because low-cost cameras introduce defects such as holes, we first design a depth enhancement module to refine the low-quality depth maps and correct depth inaccuracies. We then extract and aggregate augmented feature maps of the RGB and depth modalities step by step. Subsequently, a masked modeling scheme and an iterative inter-modal feature interaction module fully exploit the implicit relations between the two modalities. Comprehensive experiments on four challenging benchmark databases verify the superior performance and robustness of the proposed solution over other face recognition approaches. (c) 2022 Elsevier B.V. All rights reserved.
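As an illustration of the inter-modal feature interaction described in the abstract, the following is a minimal sketch, not the authors' implementation, of one bidirectional RGB-depth multi-head cross-attention round in PyTorch. The module name CrossModalInteraction, the feature dimension of 256, the 8 attention heads, and the 7x7 token layout are all assumptions made for the example.

```python
# A minimal, hypothetical sketch of an inter-modal feature-interaction step
# using multi-head cross-attention, assuming flattened RGB and depth feature
# maps of shape (batch, tokens, dim). Not the paper's actual architecture.
import torch
import torch.nn as nn


class CrossModalInteraction(nn.Module):
    """One round of bidirectional RGB<->depth cross-attention (illustrative)."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # RGB queries attend to depth keys/values, and vice versa.
        self.rgb_from_depth = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.depth_from_rgb = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_rgb = nn.LayerNorm(dim)
        self.norm_depth = nn.LayerNorm(dim)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        # Each modality queries the other; residual connections keep the
        # original features and the updates are normalized.
        rgb_upd, _ = self.rgb_from_depth(query=rgb, key=depth, value=depth)
        depth_upd, _ = self.depth_from_rgb(query=depth, key=rgb, value=rgb)
        return self.norm_rgb(rgb + rgb_upd), self.norm_depth(depth + depth_upd)


if __name__ == "__main__":
    rgb_feat = torch.randn(2, 49, 256)    # e.g. a 7x7 RGB feature map, flattened
    depth_feat = torch.randn(2, 49, 256)  # matching depth feature map
    block = CrossModalInteraction()
    fused_rgb, fused_depth = block(rgb_feat, depth_feat)
    print(fused_rgb.shape, fused_depth.shape)  # torch.Size([2, 49, 256]) twice
    # Zeroing the depth tokens is only a crude stand-in for the paper's
    # masked modeling scheme for incomplete modal data:
    # rgb_only, _ = block(rgb_feat, torch.zeros_like(depth_feat))
```

Iterating such a block several times would correspond to the "iterative" interaction mentioned in the abstract; the residual-plus-normalization pattern is a common design choice for stacking attention rounds, not something stated in the source.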
Pages: 38-45
Page count: 8
Related papers
50 records in total
  • [1] RGB-D BASED MULTI-MODAL DEEP LEARNING FOR FACE IDENTIFICATION
    Lin, Tzu-Ying
    Chiu, Ching-Te
    Tang, Ching-Tung
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 1668 - 1672
  • [2] RGB-D Scene Classification via Multi-modal Feature Learning
    Cai, Ziyun
    Shao, Ling
    COGNITIVE COMPUTATION, 2019, 11 (06): 825 - 840
  • [3] Exploiting Multi-modal Fusion for Robust Face Representation Learning with Missing Modality
    Zhu, Yizhe
    Sun, Xin
    Zhou, Xi
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255: 283 - 294
  • [4] A Multi-Modal RGB-D Object Recognizer
    Faeulhammer, Thomas
    Zillich, Michael
    Prankl, Johann
    Vincze, Markus
    2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016: 733 - 738
  • [5] Multi-modal deep feature learning for RGB-D object detection
    Xu, Xiangyang
    Li, Yuncheng
    Wu, Gangshan
    Luo, Jiebo
    PATTERN RECOGNITION, 2017, 72: 300 - 313
  • [6] Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling
    Wang, Anran
    Lu, Jiwen
    Wang, Gang
    Cai, Jianfei
    Cham, Tat-Jen
    COMPUTER VISION - ECCV 2014, PT V, 2014, 8693: 453 - 467
  • [7] RGB-D Scene Recognition via Spatial-Related Multi-Modal Feature Learning
    Xiong, Zhitong
    Yuan, Yuan
    Wang, Qi
    IEEE ACCESS, 2019, 7: 106739 - 106747
  • [8] RGB-D based multi-modal deep learning for spacecraft and debris recognition
    AlDahoul, Nouar
    Karim, Hezerul Abdul
    Momo, Mhd Adel
    SCIENTIFIC REPORTS, 2022, 12 (01)