Local Selective Vision Transformer for Depth Estimation Using a Compound Eye Camera

Cited by: 5
Authors
Oh, Wooseok [1 ]
Yoo, Hwiyeon [1 ]
Ha, Taeoh [1 ]
Oh, Songhwai [1 ]
Affiliations
[1] Seoul Natl Univ, ASRI, Dept Elect & Comp Engn, Seoul 08826, South Korea
Keywords
Compound Eye; Depth Estimation; Vision Transformer;
DOI
10.1016/j.patrec.2023.02.010
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A compound eye camera is a hemispherical camera made by mimicking the structure of an insect's eye. In general, a compound eye camera is composed of a set of single eye cameras. The compound eye camera has various advantages due to its unique structure and can be used in various vision tasks. In order to apply the compound eye camera to vision tasks that require 3D information, depth estimation is needed. However, due to the differences between compound eye images and 2D RGB images, existing depth estimation methods are difficult to apply directly. In this paper, we propose a transformer-based neural network for eye-wise depth estimation that is suitable for the compound eye image. We modify the self-attention module with local selective self-attention to take advantage of the compound eye's hemispherical structure. In addition, we reduce the computational cost and increase the performance through an eye selection module. Using the proposed local selective self-attention and eye selection modules, we are able to improve performance without large-scale pre-training. Compared to a ResNet-based depth estimation network, our method showed 2.8% and 1.4% higher performance on the GAZEBO and Matterport3D datasets, respectively, with 15.3% fewer network parameters. (c) 2023 Elsevier B.V. All rights reserved.
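The abstract's core idea — restricting each eye's self-attention to eyes that sit nearby on the hemisphere — can be sketched as a masked attention step. The code below is an illustrative NumPy mock-up, not the authors' implementation: the angular-neighborhood rule, the 30° threshold, and the random single-head Q/K/V projections are all assumptions made for demonstration only.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def local_selective_attention(tokens, directions, angle_thresh_deg=30.0):
    """Single-head self-attention where each eye token attends only to
    eyes whose viewing directions lie within `angle_thresh_deg` of its own.

    tokens:     (N, d) per-eye feature vectors
    directions: (N, 3) unit viewing-direction vectors on the hemisphere
    """
    n, d = tokens.shape
    # Illustrative Q/K/V projections (random, seeded, for the sketch).
    rng = np.random.default_rng(0)
    wq, wk, wv = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv

    # Neighborhood mask from the angular distance between eye directions.
    cos = np.clip(directions @ directions.T, -1.0, 1.0)
    mask = cos >= np.cos(np.deg2rad(angle_thresh_deg))

    scores = (q @ k.T) / np.sqrt(d)
    scores = np.where(mask, scores, -np.inf)  # block non-neighboring eyes
    return softmax(scores, axis=-1) @ v

# Tiny usage example: 4 eyes forming two tight clusters on the hemisphere.
dirs = np.array([[0, 0, 1.0], [0.05, 0, 1.0], [1.0, 0, 0.05], [1.0, 0, 0]])
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
feats = np.eye(4)
out = local_selective_attention(feats, dirs)
print(out.shape)  # (4, 4)
```

Because eyes outside the angular neighborhood are masked to -inf before the softmax, perturbing a far-away eye's features leaves a token's output unchanged — the locality property the paper exploits to cut computation on the hemispherical layout.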
Pages: 82-89 (8 pages)
Related Papers
(50 records in total)
  • [41] Lightweight monocular depth estimation using a fusion-improved transformer
    Sui, Xin
    Gao, Song
    Xu, Aigong
    Zhang, Cong
    Wang, Changqiang
    Shi, Zhengxu
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [42] Compound-eye camera using a CMOS image sensor and its applications
    Toyoda, Takashi
    Kyokai Joho Imeji Zasshi/Journal of the Institute of Image Information and Television Engineers, 2009, 63 (03): : 284 - 287
  • [43] DNN Based Camera Attitude Estimation Using Aggregated Information from Camera and Depth Images
    Kawai, Hibiki
    Kuroda, Yoji
    2023 IEEE/SICE INTERNATIONAL SYMPOSIUM ON SYSTEM INTEGRATION, SII, 2023,
  • [44] Vision-Based Interface: Using Face and Eye Blinking Tracking with Camera
    Hao, Zhu
    Lei, Qianwei
    2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL I, PROCEEDINGS, 2008, : 306 - +
  • [45] Technology for visualizing the local change in shape of edema using a depth camera
    Masui, Kenta
    Kiyomitsu, Kaoru
    Ogawa-Ochiai, Keiko
    Komuro, Takashi
    Tsumura, Norimichi
    ARTIFICIAL LIFE AND ROBOTICS, 2019, 24 (04) : 480 - 486
  • [47] Depth Estimation in Still Images and Videos Using a Motionless Monocular Camera
    Diamantas, Sotirios
    Astaras, Stefanos
    Pnevmatikakis, Aristodemos
    2016 IEEE INTERNATIONAL CONFERENCE ON IMAGING SYSTEMS AND TECHNIQUES (IST), 2016, : 129 - 134
  • [48] Deflection Estimation Methods of Structure Using Active Stereo Depth Camera
    Shin, Soojung
    Lee, Donghwan
    Cha, Gichun
    Yu, Byoung Joon
    Park, Seunghee
    JOURNAL OF THE KOREAN SOCIETY FOR NONDESTRUCTIVE TESTING, 2020, 40 (02) : 103 - 111
  • [49] Depth Estimation from a Single Camera Image using Power Fit
    Akhlaq, Muhammad Umair
    Izhar, Umer
    Shahbaz, Umar
    2014 INTERNATIONAL CONFERENCE ON ROBOTICS AND EMERGING ALLIED TECHNOLOGIES IN ENGINEERING (ICREATE), 2014, : 221 - 227
  • [50] FAST RESPONSE AGGREGATION FOR DEPTH ESTIMATION USING LIGHT FIELD CAMERA
    Yang, Cao
    Kang, Kai
    Zhang, Jing
    Wang, Zengfu
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 1636 - 1640