Depth as attention to learn image representations for visual localization, using monocular images

Cited by: 0
Authors
Hettiarachchi, Dulmini [1]
Tian, Ye [1]
Yu, Han [2]
Kamijo, Shunsuke [3]
Affiliations
[1] Univ Tokyo, Grad Sch Interdisciplinary Informat Studies, Tokyo 1130033, Japan
[2] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1130033, Japan
[3] Univ Tokyo, Inst Ind Sci IIS, Tokyo 1538505, Japan
Keywords
Image retrieval; Visual localization; Image representation; Depth attention; Global descriptors
DOI
10.1016/j.jvcir.2023.104012
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Image retrieval algorithms are widely used in visual localization tasks. In visual localization, it is beneficial to retrieve images that depict the same landmark and were taken from a pose similar to that of the query. However, state-of-the-art image retrieval algorithms are optimized mainly for landmark retrieval and do not take camera pose into account. To address this limitation, we propose a novel Depth Attention Network (DeAttNet). DeAttNet leverages both visual and depth information in learning a global image representation. Depth varies for similar features captured from different camera poses. Based on this insight, we employ depth within an attention mechanism to discern and emphasize the salient regions. In our method, we utilize monocular depth estimation algorithms to generate depth maps. Compared to RGB-only image descriptors, the proposed method obtains significant improvements on the Mapillary Street Level Sequences, Pittsburgh and Cambridge Landmark datasets.
Pages: 11
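The abstract's idea of using depth as attention when pooling local features into a global descriptor can be illustrated with a minimal sketch. This is not the DeAttNet architecture: the abstract does not specify the network, and in the paper the attention is presumably learned. The sketch below assumes a fixed, hand-written rule (a softmax over negative depth, so that nearer pixels receive more weight). The function names, the `temperature` parameter, and the input layout are all hypothetical.

```python
import math


def depth_attention_weights(depth, temperature=1.0):
    """Turn an H x W depth map into spatial attention weights that sum to 1.

    Assumption for illustration only: nearer pixels (smaller depth) get
    larger weights via a softmax over -depth / temperature.
    """
    scores = [-d / temperature for row in depth for d in row]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def global_descriptor(features, depth, temperature=1.0):
    """Pool an H x W x C feature map into one L2-normalized C-dim vector.

    `features` is a nested list (rows, columns, channels) and `depth` is a
    nested list (rows, columns) with the same spatial size.
    """
    weights = depth_attention_weights(depth, temperature)
    flat = [vec for row in features for vec in row]
    channels = len(flat[0])
    pooled = [
        sum(w * vec[c] for w, vec in zip(weights, flat))
        for c in range(channels)
    ]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


def cosine_similarity(a, b):
    """Dot product of two L2-normalized descriptors."""
    return sum(x * y for x, y in zip(a, b))
```

In a retrieval setting, a query descriptor would be compared against database descriptors with `cosine_similarity`, and the highest-scoring images returned. Because the weights depend on depth, two views of the same landmark at different distances yield different descriptors, which is the pose sensitivity the abstract motivates.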