Exploiting enhanced and robust RGB-D face representation via progressive multi-modal learning

Times Cited: 1
Authors
Zhu, Yizhe [1 ,2 ]
Gao, Jialin [1 ,2 ]
Wu, Tianshu [2 ]
Liu, Qiong [2 ]
Zhou, Xi [2 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Cooperat Medianet Innovat Ctr, Shanghai 200240, Peoples R China
[2] CloudWalk Technol, Shanghai 201203, Peoples R China
Keywords
RGB-D face recognition; Multi-modal fusion; Depth enhancement; Multi-head-attention mechanism; Incomplete modal data; ATTENTION
DOI
10.1016/j.patrec.2022.12.027
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Existing RGB-based 2D face recognition approaches are sensitive to facial variations, posture, occlusion, and illumination. Depth-based methods have been shown to alleviate this sensitivity by introducing geometric information, but they rely heavily on high-quality depth maps from expensive RGB-D cameras. To this end, we propose a Progressive Multi-modal Fusion framework that exploits an enhanced and robust face representation for RGB-D face recognition with low-cost RGB-D cameras, and that also handles incomplete RGB-D modal data. Because low-cost cameras introduce defects such as holes, we first design a depth enhancement module to refine the low-quality depth and correct depth inaccuracies. We then extract and aggregate augmented feature maps of the RGB and depth modalities step by step. Subsequently, a masked modeling scheme and an iterative inter-modal feature interaction module fully exploit the implicit relations between the two modalities. Comprehensive experiments on four challenging benchmark databases verify the superior performance and robustness of the proposed solution over other face recognition approaches. (c) 2022 Elsevier B.V. All rights reserved.
Pages: 38-45
Page count: 8
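
The abstract describes an iterative inter-modal feature interaction module built on multi-head attention, plus a masked modeling scheme for incomplete modal data. The following is a minimal, hypothetical PyTorch sketch of that kind of cross-modal fusion, not the authors' implementation: the class names (CrossModalInteraction, ProgressiveFusionHead), feature dimensions, number of interaction rounds, and the zeroing of missing depth tokens are all illustrative assumptions.

# Hypothetical sketch of cross-modal multi-head-attention fusion for RGB and
# depth features; details are assumptions, not the paper's released code.
import torch
import torch.nn as nn


class CrossModalInteraction(nn.Module):
    """One round of RGB<->depth feature interaction via multi-head attention."""

    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        # Each modality queries the other modality's tokens.
        self.rgb_from_depth = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.depth_from_rgb = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm_rgb = nn.LayerNorm(dim)
        self.norm_depth = nn.LayerNorm(dim)

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor):
        # rgb, depth: (batch, tokens, dim) flattened backbone feature maps.
        rgb_upd, _ = self.rgb_from_depth(query=rgb, key=depth, value=depth)
        depth_upd, _ = self.depth_from_rgb(query=depth, key=rgb, value=rgb)
        return self.norm_rgb(rgb + rgb_upd), self.norm_depth(depth + depth_upd)


class ProgressiveFusionHead(nn.Module):
    """Stack a few interaction rounds, then pool and concatenate both modalities."""

    def __init__(self, dim: int = 256, num_heads: int = 8, rounds: int = 3,
                 embed_dim: int = 512):
        super().__init__()
        self.rounds = nn.ModuleList(
            CrossModalInteraction(dim, num_heads) for _ in range(rounds)
        )
        self.proj = nn.Linear(2 * dim, embed_dim)  # fused face embedding

    def forward(self, rgb: torch.Tensor, depth: torch.Tensor,
                depth_valid: bool = True):
        # Crude stand-in for the masked-modeling idea: if depth is missing or
        # unreliable, zero its tokens so fusion degrades gracefully to RGB only.
        if not depth_valid:
            depth = torch.zeros_like(depth)
        for block in self.rounds:
            rgb, depth = block(rgb, depth)
        fused = torch.cat([rgb.mean(dim=1), depth.mean(dim=1)], dim=-1)
        return self.proj(fused)


if __name__ == "__main__":
    # Example: 7x7 feature maps from RGB and depth backbones, flattened to 49 tokens.
    rgb_feat = torch.randn(2, 49, 256)
    depth_feat = torch.randn(2, 49, 256)
    head = ProgressiveFusionHead()
    embedding = head(rgb_feat, depth_feat)
    print(embedding.shape)  # torch.Size([2, 512])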