Adaptive RGB Image Recognition by Visual-Depth Embedding

Cited by: 14

Authors:

Cai, Ziyun [1]

Long, Yang [2]

Shao, Ling [3,4]

Affiliations:

[1] Nanjing Univ Posts & Telecommun, Coll Automat, Nanjing, Jiangsu, Peoples R China

[2] Newcastle Univ, Sch Comp, Open Lab, Newcastle Upon Tyne NE4 5TG, Tyne & Wear, England

[3] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates

[4] Univ East Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2018 / Vol. 27 / Issue 05

Keywords:

RGB-D data; domain adaptation; visual categorization; nonnegative matrix factorization; kernel

DOI:

10.1109/TIP.2018.2806839

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recognizing RGB images from RGB-D data is a promising application that significantly reduces cost while still retaining high recognition rates. However, existing methods still suffer from the domain shift problem because conventional surveillance cameras and depth sensors use different mechanisms. In this paper, we aim to solve two challenges simultaneously: 1) how to take advantage of the additional depth information in the source domain, and 2) how to reduce the data distribution mismatch between the source and target domains. We propose a novel method called adaptive visual-depth embedding (aVDE), which first learns a compact shared latent space between the two representations of the labeled RGB and depth modalities in the source domain. The shared latent space then helps transfer the depth information to the unlabeled target data set. Finally, aVDE models two separate learning strategies for domain adaptation (feature matching and instance reweighting) in a unified optimization problem, which matches features and reweights instances jointly across the shared latent space and the projected target domain to obtain an adaptive classifier. We test our method on five pairs of data sets for object recognition and scene classification, and the results demonstrate the effectiveness of the proposed method.
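
The abstract names feature matching and instance reweighting but does not give the objective. As a rough illustration only, the sketch below uses a linear-kernel maximum mean discrepancy (the squared distance between domain means), a measure commonly used in domain adaptation; this choice, the function names, and the toy data are all assumptions, not the authors' formulation. It shows how reweighting source instances can shrink the gap between source and target feature means.

```python
from typing import List, Optional, Sequence

Vector = Sequence[float]


def weighted_mean(points: Sequence[Vector], weights: Sequence[float]) -> List[float]:
    """Weighted mean of feature vectors; weights are normalized by their sum."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    dim = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    ]


def linear_mmd_sq(
    source: Sequence[Vector],
    target: Sequence[Vector],
    source_weights: Optional[Sequence[float]] = None,
) -> float:
    """Squared distance between the (weighted) source mean and the target mean.

    Assumption: a linear kernel stands in for whatever kernel a real
    method would use; this is not the aVDE objective.
    """
    if source_weights is None:
        source_weights = [1.0] * len(source)
    mu_s = weighted_mean(source, source_weights)
    mu_t = weighted_mean(target, [1.0] * len(target))
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Hypothetical toy features: the source is centred lower than the target.
source = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
target = [[3.0, 3.0], [4.0, 4.0], [5.0, 5.0]]

# Uniform weights: every source instance counts equally.
uniform = linear_mmd_sq(source, target)

# Hand-picked weights that favour source instances resembling the target.
reweighted = linear_mmd_sq(source, target, source_weights=[0.0, 1.0, 3.0])

print(uniform, reweighted)
```

In this toy setting the weights are chosen by hand; a method of the kind the abstract describes would learn them jointly with the feature projection.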

Pages: 2471-2483

Number of pages: 13

