CGFNet: 3D Convolution Guided and Multi-scale Volume Fusion Network for fast and robust stereo matching

Cited by: 5
Authors
Wang, Qingyu [1]
Xing, Hao [2]
Ying, Yibin [1]
Zhou, Mingchuan [1]
Affiliations
[1] Zhejiang Univ, Coll Biosyst Engn & Food Sci, Yuhangtang Rd 866, Hangzhou 310058, Zhejiang, Peoples R China
[2] Tech Univ Munich, Dept Comp Sci, Machine Vis & Percept Grp, Arcisstr 21, D-80333 Munich, Bayern, Germany
Keywords
Robotic vision; Stereo matching; Disparity estimation; Deep learning; Textureless regions; NET;
DOI
10.1016/j.patrec.2023.07.012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Although convolutional neural networks have brought significant progress, accurate and robust stereo matching in real time remains difficult. In this article, we study how to achieve more accurate and robust disparity estimation under a real-time constraint. To improve matching accuracy, we propose a Multi-scale Volume Fusion (MVF) module and embed it in the network. To achieve real-time performance, we propose a new way of using 3D convolution: it is applied during training for guidance and supervision, which keeps inference lightweight. Based on these two structures, we design an end-to-end stereo matching method called the 3D Convolution Guided and Multi-scale Cost Volume Fusion Network (CGFNet). Experimental results show that CGFNet generalizes better on cross-domain datasets and estimates disparity more accurately in challenging regions without an additional fine-tuning process. On the KITTI benchmark, CGFNet reaches D1-all = 1.98%, a substantial improvement among State-Of-The-Art (SOTA) real-time models, and processes a pair of images within 38 ms (26 fps). The results are notable when both matching accuracy and real-time performance are considered.
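The D1-all figure quoted in the abstract is the KITTI outlier rate. As commonly defined for the KITTI 2015 stereo benchmark, a pixel counts as an outlier when its disparity error exceeds both 3 px and 5% of the ground-truth disparity. The following is a minimal sketch of that metric, not the authors' evaluation code; the function name `d1_all` and the flat-list input format are assumptions made for illustration.

```python
def d1_all(pred, gt, abs_thresh=3.0, rel_thresh=0.05):
    """Percentage of valid pixels that are disparity outliers.

    pred, gt: flat sequences of predicted and ground-truth disparities.
    A ground-truth value <= 0 is treated as "no measurement" and skipped
    (an assumption that mirrors how KITTI encodes invalid pixels).
    """
    valid = 0
    outliers = 0
    for p, g in zip(pred, gt):
        if g <= 0:
            continue  # no ground truth available for this pixel
        valid += 1
        err = abs(p - g)
        # Outlier only if the error exceeds BOTH thresholds.
        if err > abs_thresh and err > rel_thresh * g:
            outliers += 1
    if valid == 0:
        raise ValueError("no valid ground-truth pixels")
    return 100.0 * outliers / valid


# Toy data (hypothetical values, not from the paper).
pred = [10.0, 20.0, 96.0, 40.0, 7.0]
gt = [10.5, 25.0, 100.0, 40.0, 0.0]
score = d1_all(pred, gt)
```

In the toy data, only the second pixel is an outlier: its error of 5 px exceeds both 3 px and 5% of 25. The third pixel has an error of 4 px, which is above 3 px but below 5% of 100, so it is not counted. The last pixel has no ground truth and is ignored.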
Pages: 38-44
Number of pages: 7
Related papers
50 in total
  • [1] Multi-Scale Cost Attention and Adaptive Fusion Stereo Matching Network
    Liu, Zhenguo
    Li, Zhao
    Ao, Wengang
    Zhang, Shaoshuang
    Liu, Wenlong
    He, Yizhi
    ELECTRONICS, 2023, 12 (07)
  • [2] Multi-Scale Aggregation Stereo Matching Network Based on Dense Grouping Atrous Convolution
    Zou, Qijie
    Zhang, Jie
    Chen, Shuang
    Gao, Bing
    Qin, Jing
    Dong, Aotian
    APPLIED SCIENCES-BASEL, 2023, 13 (12)
  • [3] 3D LiDAR and Stereo Fusion using Stereo Matching Network with Conditional Cost Volume Normalization
    Wang, Tsun-Hsuan
    Hu, Hou-Ning
    Lin, Chieh Hubert
    Tsai, Yi-Hsuan
    Chiu, Wei-Chen
    Sun, Min
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019 : 5895 - 5902
  • [4] A light-weight stereo matching network based on multi-scale features fusion and robust disparity refinement
    Yang, Xiaowei
    Zhao, Yong
    Feng, Zhiguo
    Sang, Haiwei
    Zhang, Zhenbo
    Zhang, Guiying
    He, Lin
    IET IMAGE PROCESSING, 2023, 17 (06) : 1797 - 1811
  • [5] Multi-scale Adaptive Region Matching Network for 3D Reconstruction
    Sun, Jifeng
    Sun, Minghao
    PROCEEDINGS OF 2024 INTERNATIONAL CONFERENCE ON MACHINE INTELLIGENCE AND DIGITAL APPLICATIONS, MIDA2024, 2024 : 127 - 134
  • [6] Multi-Scale Dense Attention Network for Stereo Matching
    Chang, Yuhui
    Xu, Jiangtao
    Gao, Zhiyuan
    ELECTRONICS, 2020, 9 (11) : 1 - 12
  • [7] Multi-Scale Context Attention Network for Stereo Matching
    Sang, Haiwei
    Wang, Quanhong
    Zhao, Yong
    IEEE ACCESS, 2019, 7 : 15152 - 15161
  • [8] Improved stereo matching algorithm based on multi-scale fusion
    Chen, Xing
    Zhang, Wenhai
    Hou, Yu
    Yang, Lin
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2021, 39 (04) : 876 - 882
  • [9] Edge supervision and multi-scale cost volume for stereo matching
    Yang, Xiaowei
    Feng, Zhiguo
    Zhao, Yong
    Zhang, Guiying
    He, Lin
    IMAGE AND VISION COMPUTING, 2022, 117
  • [10] LMNet: A learnable multi-scale cost volume for stereo matching
    Liu, Jiatao
    Zhang, Yaping
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2024, 128