Multi-scale inputs and context-aware aggregation network for stereo matching

Cited by: 1
Authors
Shi, Liqing [1, 2, 3]
Xiong, Taiping [1, 2]
Cui, Gengshen [2]
Pan, Minghua [2]
Cheng, Nuo [1, 2]
Wu, Xiangjie [1, 2]
Affiliations
[1] Guilin Univ Elect Technol, Guangxi Key Lab Image & Graph Intelligent Proc, Guilin 541004, Peoples R China
[2] Guilin Univ Elect Technol, Sch Comp Sci & Informat Secur, Guilin 541004, Peoples R China
[3] Guilin Univ Elect Technol, Nanning Res Inst, Nanning 530000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-scale feature fusion; Context-aware capability; 3D squeeze-and-excitation; Stereo matching; Binocular vision;
DOI
10.1007/s11042-024-18492-6
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Despite the significant progress made in deep learning-based stereo matching, the accuracy of these methods decreases significantly when faced with challenges such as occlusions, reflections, textureless areas, and scale variations. In this paper, we propose MSCANet, a novel stereo matching network that integrates multi-scale inputs and context-aware aggregation. MSCANet combines rich multi-scale feature information with context-aware capability, which enables it to achieve superior performance. First, a multi-scale aware fusion module is designed to incorporate more comprehensive global context features at different scales, which improves the model's ability to generalize across images of varying scales. Second, a novel V-shaped encoder/decoder module is developed to exploit the rich feature information. In the encoding stage, a 3D squeeze-and-excitation block is introduced to adaptively recalibrate the learned feature maps. This block suppresses irrelevant features while enhancing useful ones, which improves efficiency and accuracy in disparity prediction. Additionally, a 3D context-aware decode block is designed to use global context features to restore the original image structure during the decoding stage. Moreover, the high-level feature maps can be employed to augment low-level feature maps by incorporating more detailed information, avoiding the side effects caused by the loss of information during the encoding process. Extensive ablation and comparative experiments were conducted on the Scene Flow, KITTI 2012, and KITTI 2015 datasets to validate the effectiveness of each proposed module. The experimental results demonstrate that MSCANet achieves competitive performance while offering a more straightforward and efficient model design and faster inference speed.
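The channel recalibration performed by a squeeze-and-excitation block on a 3D cost volume can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `se_block_3d`, the bottleneck weights `w1`/`w2`, the reduction ratio of 4, and the `(C, D, H, W)` volume layout are all assumptions made for illustration, following the generic squeeze-and-excitation pattern (global average pooling, a small two-layer gate, channel-wise scaling).

```python
import numpy as np

def se_block_3d(volume, w1, b1, w2, b2):
    """Recalibrate the channels of a cost volume shaped (C, D, H, W).

    Hypothetical sketch of a generic squeeze-and-excitation step;
    the layout and weight shapes are assumptions, not the paper's.
    """
    # Squeeze: global average over disparity, height, and width -> (C,)
    squeezed = volume.mean(axis=(1, 2, 3))
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate -> (C,)
    hidden = np.maximum(w1 @ squeezed + b1, 0.0)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))
    # Scale: multiply every channel by its gate, which lies in (0, 1)
    return volume * gates[:, None, None, None], gates

# Toy inputs: 8 channels, reduction ratio 4 (both chosen arbitrarily)
rng = np.random.default_rng(0)
channels, reduction = 8, 4
volume = rng.standard_normal((channels, 6, 5, 7))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
b1 = np.zeros(channels // reduction)
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1
b2 = np.zeros(channels)

recalibrated, gates = se_block_3d(volume, w1, b1, w2, b2)
```

Channels whose gate is close to 0 are suppressed and channels whose gate is close to 1 pass through nearly unchanged, which is the sense in which such a block can damp irrelevant features while keeping useful ones.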
Pages: 75171-75194
Page count: 24
Related papers
50 in total
  • [1] Context-Aware Multi-Scale Aggregation Network for Congested Crowd Counting
    Huang, Liangjun
    Shen, Shihui
    Zhu, Luning
    Shi, Qingxuan
    Zhang, Jianwei
    SENSORS, 2022, 22 (09)
  • [2] Multi-Scale Context Attention Network for Stereo Matching
    Sang, Haiwei
    Wang, Quanhong
    Zhao, Yong
    IEEE ACCESS, 2019, 7 : 15152 - 15161
  • [3] MPANET: MULTI-SCALE PYRAMID AGGREGATION NETWORK FOR STEREO MATCHING
    Zhu, Ziyu
    Guo, Wei
    Chen, Wei
    Li, Qiuping
    Zhao, Yong
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2773 - 2777
  • [4] Multi-scale Fusion with Context-aware Network for Object Detection
    Wang, Hanyuan
    Xu, Jie
    Li, Linke
    Tian, Ye
    Xu, Du
    Xu, Shizhong
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 2486 - 2491
  • [5] Self-adaptive Multi-scale Aggregation Network for Stereo Matching
    Li, Pengfei
    Ye, Shuiqiang
    Zhang, Jiaquan
    Wang, Xinan
    Dai, Qifei
    Yu, Zhengzhong
    Li, Fuchi
    Zhao, Yong
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 3794 - 3800
  • [6] Multi-scale context-aware network for continuous sign language recognition
    Senhua XUE
    Liqing GAO
    Liang WAN
    Wei FENG
    虚拟现实与智能硬件(中英文), 2024, 6 (04) : 323 - 337
  • [7] Multi-scale context-aware network for continuous sign language recognition
    XUE, Senhua
    GAO, Liqing
    WAN, Liang
    FENG, Wei
    Virtual Reality and Intelligent Hardware, 2024, 6 (04) : 323 - 337
  • [8] MSCANet: A multi-scale context-aware network for remote sensing object detection
    Zhou, Huaping
    Liu, Weidong
    Sun, Kelei
    Wu, Jin
    Wu, Tao
    EARTH SCIENCE INFORMATICS, 2024, 17 (06) : 5521 - 5538
  • [9] CONTEXT-AWARE HIERARCHICAL FEATURE ATTENTION NETWORK FOR MULTI-SCALE OBJECT DETECTION
    Xu, Xuelong
    Luo, Xiangfeng
    Ma, Liyan
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 2011 - 2015
  • [10] Multi-Scale Aggregation Stereo Matching Network Based on Dense Grouping Atrous Convolution
    Zou, Qijie
    Zhang, Jie
    Chen, Shuang
    Gao, Bing
    Qin, Jing
    Dong, Aotian
    APPLIED SCIENCES-BASEL, 2023, 13 (12):