Multi-scale inputs and context-aware aggregation network for stereo matching

Cited by: 1
Authors
Shi, Liqing [1, 2, 3]
Xiong, Taiping [1, 2]
Cui, Gengshen [2]
Pan, Minghua [2]
Cheng, Nuo [1, 2]
Wu, Xiangjie [1, 2]
Affiliations
[1] Guilin Univ Elect Technol, Guangxi Key Lab Image & Graph Intelligent Proc, Guilin 541004, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Nanning Res Inst, Nanning 530000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale feature fusion; Context-aware capability; 3D squeeze-and-excitation; Stereo matching; Binocular vision;
DOI
10.1007/s11042-024-18492-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Despite significant progress in deep learning-based stereo matching, the accuracy of these methods drops markedly under occlusions, reflections, textureless areas, and scale variations. In this paper, we propose MSCANet, a novel stereo matching network that combines multi-scale inputs with context-aware aggregation. MSCANet integrates rich multi-scale feature information and exhibits context-aware capability, which enables it to achieve superior performance. First, a multi-scale aware fusion module is designed to incorporate more comprehensive global context features at different scales, which improves the model's ability to generalize across images of varying scales. Second, a novel V-shaped encoder/decoder module is developed to exploit the rich feature information. In the encoding stage, a 3D squeeze-and-excitation block is introduced to adaptively recalibrate the learned feature maps. This block suppresses irrelevant features while enhancing useful ones, which improves efficiency and accuracy in disparity prediction. Additionally, a 3D context-aware decode block is designed to use global context features to restore the original image structure during the decoding stage. Moreover, the high-level feature maps can be used to augment the low-level feature maps with more detailed information, avoiding the side effects of information loss during encoding. Extensive ablation and comparative experiments were conducted on the Scene Flow, KITTI 2012, and KITTI 2015 datasets to validate the effectiveness of each proposed module. The experimental results demonstrate that MSCANet achieves competitive performance while offering a more straightforward and efficient model design and faster inference speed.
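The abstract does not specify the layers of the 3D squeeze-and-excitation block. As a rough illustration only, the sketch below applies the generic squeeze-and-excitation pattern to a 4D feature volume (channels, disparity, height, width). The function name `se_block_3d`, the weight matrices `w1` and `w2`, and the bottleneck size are assumptions made for this example and are not taken from MSCANet.

```python
import numpy as np

def se_block_3d(x, w1, w2):
    """Generic squeeze-and-excitation over a 3D feature volume (illustrative only).

    x  : array of shape (C, D, H, W), e.g. cost-volume features
    w1 : array of shape (C_reduced, C), hypothetical bottleneck weights
    w2 : array of shape (C, C_reduced), hypothetical expansion weights
    """
    # Squeeze: global average pooling over disparity, height, and width.
    z = x.mean(axis=(1, 2, 3))                 # shape (C,)
    # Excitation: bottleneck with ReLU, then a sigmoid gate per channel.
    h = np.maximum(w1 @ z, 0.0)                # shape (C_reduced,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))        # shape (C,), values in (0, 1)
    # Recalibration: rescale every channel by its gate value.
    return x * s[:, None, None, None]

# Usage: 8 channels, a hypothetical reduction to 2 bottleneck units.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 5, 6))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))
y = se_block_3d(x, w1, w2)
```

Channels whose gate is close to 0 are suppressed and those close to 1 pass through almost unchanged, which is the sense in which such a block "recalibrates" feature maps; the output keeps the input shape.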
Pages: 75171-75194
Number of pages: 24
Related papers
50 in total
  • [31] Anisotropic stereo matching with multi-scale information
    Li Y.
    Wu M.
    Liu K.
    Yu W.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2023, 29 (09): : 2920 - 2928
  • [32] Multi-scale Context-aware User Interest Learning for Behavior Pattern Modeling
    Deng, Zhiying
    Li, Jianjun
    Zou, Li
    Liu, Wei
    Shi, Si
    Chen, Qian
    Zhao, Juan
    Li, Guohui
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2024, PT 3, 2025, 14852 : 333 - 342
  • [33] Multi-Scale Recursive Context Aggregation Network for Semantic Segmentation
    Yalcin, Abdullah
    Keskinoz, Mehmet
    32ND IEEE SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU 2024, 2024,
  • [34] HIERARCHICAL CONTEXT GUIDED AGGREGATION NETWORK FOR STEREO MATCHING
    Peng, Jun
    Xie, Wangduo
    Huang, Zijing
    Chen, Wei
    Zhao, Yong
    2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 2115 - 2119
  • [35] Cascaded Multi-scale and Multi-dimension Convolutional Neural Network for Stereo Matching
    Lu, Haihua
    Xu, Hai
    Zhang, Li
    Ma, Yanbo
    Zhao, Yong
    2018 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (IEEE VCIP), 2018,
  • [36] Occluded prohibited object detection in X-ray images with global Context-aware Multi-Scale feature Aggregation
    Ma, Chunjie
    Zhuo, Li
    Li, Jiafeng
    Zhang, Yutong
    Zhang, Jing
    NEUROCOMPUTING, 2023, 519 : 1 - 16
  • [37] A BIDIRECTIONAL CONTEXT-AWARE AND MULTI-SCALE FUSION HYBRID NETWORK FOR SHORT-TERM TRAFFIC FLOW PREDICTION
    Chen, Zhixing
    Zhen, Guizhou
    PROMET-TRAFFIC & TRANSPORTATION, 2022, 34 (03): : 407 - 420
  • [38] Abdominal multi-organ segmentation using multi-scale and context-aware neural networks
    Song, Yuhan
    Elibol, Armagan
    Chong, Nak Young
    IFAC JOURNAL OF SYSTEMS AND CONTROL, 2024, 27
  • [39] Multi-Exposure Image Fusion via Multi-Scale and Context-Aware Feature Learning
    Liu, Yu
    Yang, Zhigang
    Cheng, Juan
    Chen, Xun
    IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 100 - 104
  • [40] Lightweight multi-scale convolutional neural network for real time stereo matching
    Xue, Yanbing
    Zhang, Doudou
    Li, Leida
    Li, Shiyin
    Wang, Yuxin
    IMAGE AND VISION COMPUTING, 2022, 124