Local Selective Vision Transformer for Depth Estimation Using a Compound Eye Camera

Cited by: 5
Authors
Oh, Wooseok [1]
Yoo, Hwiyeon [1]
Ha, Taeoh [1]
Oh, Songhwai [1]
Affiliation
[1] Seoul Natl Univ, ASRI, Dept Elect & Comp Engn, Seoul 08826, South Korea
Keywords
Compound Eye; Depth Estimation; Vision Transformer;
DOI
10.1016/j.patrec.2023.02.010
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A compound eye camera is a hemispherical camera that mimics the structure of an insect's eye and is generally composed of a set of single eye cameras. Its unique structure offers several advantages and makes it applicable to a variety of vision tasks. Applying it to vision tasks that rely on 3D information requires depth estimation. However, because compound eye images differ from conventional 2D RGB images, existing depth estimation methods are hard to apply directly. In this paper, we propose a transformer-based neural network for eye-wise depth estimation that is suited to compound eye images. We modify the self-attention module to use local selective self-attention, which takes advantage of the compound eye's hemispherical structure. In addition, an eye selection module reduces the amount of computation and improves performance. With the proposed local selective self-attention and eye selection modules, we improve performance without large-scale pre-training. Compared to a ResNet-based depth estimation network, our method achieves 2.8% and 1.4% higher performance on the GAZEBO and Matterport3D datasets, respectively, with 15.3% fewer network parameters. (c) 2023 Elsevier B.V. All rights reserved.
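The abstract does not spell out how local selective self-attention is computed, so the following is only a rough illustration of the general idea of restricting attention to nearby eyes on a hemisphere. Everything in it is an assumption and not the authors' method: the Fibonacci-style eye layout, the k-nearest-neighbor mask, the single attention head, and the function names `fibonacci_hemisphere`, `knn_mask`, and `local_self_attention` are all hypothetical. The eye selection module is not modeled.

```python
import numpy as np


def fibonacci_hemisphere(n):
    """Hypothetical eye layout: n unit view directions spread over the upper hemisphere."""
    i = np.arange(n)
    z = 1.0 - (i + 0.5) / n                      # heights in (0, 1)
    r = np.sqrt(1.0 - z**2)                      # radius of the circle at height z
    phi = i * np.pi * (3.0 - np.sqrt(5.0))       # golden-angle increments
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)


def knn_mask(dirs, k):
    """Boolean (n, n) mask: True where eye j is one of the k angularly nearest eyes of eye i."""
    cos = dirs @ dirs.T                          # cosine of the angle between view directions
    idx = np.argsort(-cos, axis=1)[:, :k]        # largest cosine = smallest angle (includes i itself)
    mask = np.zeros(cos.shape, dtype=bool)
    np.put_along_axis(mask, idx, True, axis=1)
    return mask


def local_self_attention(x, mask, wq, wk, wv):
    """Single-head self-attention over eye tokens, restricted to masked-in neighbors."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)     # eyes outside the neighborhood get zero weight
    scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights


# Assumed toy sizes: 64 eyes, 16-dimensional token per eye, 8 neighbors per eye.
rng = np.random.default_rng(0)
n_eyes, dim, k = 64, 16, 8
dirs = fibonacci_hemisphere(n_eyes)
mask = knn_mask(dirs, k)
x = rng.normal(size=(n_eyes, dim))
wq, wk, wv = (rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(3))
out, weights = local_self_attention(x, mask, wq, wk, wv)
print(out.shape)  # (64, 16)
```

With a fixed neighborhood size k, each eye token attends to k tokens instead of all n, so the number of nonzero attention weights grows as n·k instead of n². This dense-mask version still computes the full score matrix and is meant only to show the masking pattern.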
Pages: 82-89
Page count: 8