A light-weight, efficient, and general cross-modal image fusion network

Cited by: 22
Authors
Fang, Aiqing [1 ]
Zhao, Xinbo [1 ]
Yang, Jiaqi [1 ]
Qin, Beibei [1 ]
Zhang, Yanning [1 ]
Affiliation
[1] Northwestern Polytech Univ, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Deep learning; Collaborative optimization; Image quality; Optimization; INFORMATION; EXTRACTION; FRAMEWORK;
DOI
10.1016/j.neucom.2021.08.044
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing cross-modal image fusion methods pay limited attention to fusion efficiency and network architecture. However, the efficiency and accuracy of image fusion have an important impact on practical applications. To solve this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, inceptionNet, squeezeNet, shuffleNet, and multi-scale module) on image fusion quality and efficiency, which provides a reference for the design of image fusion architectures. Secondly, we explore the commonness and characteristics of different image fusion tasks, which provides a basis for further research on the continuous learning characteristics of the human brain. Finally, a positive sample loss is added to the similarity loss to reduce the difference in data distribution between different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method compared to state-of-the-art methods in different fusion tasks at a real-time speed of 100+ FPS on GTX 2070. Compared with the fastest image fusion method based on deep learning, the efficiency of AE-Netv2 is improved by 2.14 times. Compared with the image fusion model with the smallest model size, the size of our model is reduced by 11.59 times. © 2021 Elsevier B.V. All rights reserved.
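The abstract names group convolution and depth-wise convolution among the architectures compared for efficiency. As a rough illustration of why such layers suit light-weight networks, the sketch below counts the weights of a standard, a grouped, and a depth-wise separable convolution using the usual textbook formulas. The function names and the channel and kernel sizes are hypothetical and chosen only for illustration; they are not taken from AE-Netv2.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count (bias excluded) of a k x k convolution split into `groups`."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return k * k * (c_in // groups) * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    c_in, c_out, k = 64, 64, 3  # illustrative sizes, not from the paper
    print("standard :", conv_params(c_in, c_out, k))
    print("grouped  :", conv_params(c_in, c_out, k, groups=4))
    print("separable:", depthwise_separable_params(c_in, c_out, k))
```

With these illustrative sizes, grouping by 4 divides the weight count by 4, and the depth-wise separable variant is smaller still. This says nothing about fusion quality, which the abstract reports the authors evaluated separately.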
Pages: 198-211
Number of pages: 14