A light-weight, efficient, and general cross-modal image fusion network

Cited by: 22
Authors
Fang, Aiqing [1 ]
Zhao, Xinbo [1 ]
Yang, Jiaqi [1 ]
Qin, Beibei [1 ]
Zhang, Yanning [1 ]
Affiliations
[1] Northwestern Polytech Univ, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Deep learning; Collaborative optimization; Image quality; Optimization; INFORMATION; EXTRACTION; FRAMEWORK;
DOI
10.1016/j.neucom.2021.08.044
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Existing cross-modal image fusion methods pay limited research attention to image fusion efficiency and network architecture. However, the efficiency and accuracy of image fusion have an important impact on practical applications. To solve this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed as AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, inceptionNet, squeezeNet, shuffleNet, and multi-scale module) on image fusion quality and efficiency, which provides a reference for the design of image fusion architecture. Secondly, we explore the commonness and characteristics of different image fusion tasks, which provides a research basis for further research on the continuous learning characteristics of the human brain. Finally, positive sample loss is added to the similarity loss to reduce the difference of data distribution of different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method compared to state-of-the-art methods in different fusion tasks at a real-time speed of 100+ FPS on GTX 2070. Compared with the fastest image fusion method based on deep learning, the efficiency of AE-Netv2 is improved by 2.14 times. Compared with the image fusion model with the smallest model size, the size of our model is reduced by 11.59 times. (c) 2021 Elsevier B.V. All rights reserved.
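The abstract names the light-weight building blocks it compares (group and depth-wise convolution, inceptionNet/squeezeNet/shuffleNet-style modules) and a similarity loss augmented with a positive-sample term, but this record contains no implementation details. The PyTorch sketch below is only a minimal illustration of those two ideas under assumed shapes, channel widths, and loss weights; the class and function names (TinyFusionNet, fusion_loss, w_pos) are hypothetical and are not the paper's AE-Netv2 implementation.

```python
# Illustrative sketch only; AE-Netv2's actual architecture and loss are not
# reproduced here. All names, channel widths, and weights are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DepthwiseSeparableConv(nn.Module):
    """Depth-wise 3x3 conv followed by a point-wise 1x1 conv, one of the
    light-weight modules the abstract compares."""

    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.relu(self.pointwise(self.depthwise(x)))


class TinyFusionNet(nn.Module):
    """Minimal two-branch fusion network: each modality is encoded with a
    light-weight block, features are concatenated, and a small decoder
    reconstructs the fused image."""

    def __init__(self, ch: int = 16):
        super().__init__()
        self.enc_a = DepthwiseSeparableConv(1, ch)
        self.enc_b = DepthwiseSeparableConv(1, ch)
        self.decode = nn.Sequential(
            DepthwiseSeparableConv(2 * ch, ch),
            nn.Conv2d(ch, 1, kernel_size=1),
        )

    def forward(self, img_a: torch.Tensor, img_b: torch.Tensor) -> torch.Tensor:
        fused_feat = torch.cat([self.enc_a(img_a), self.enc_b(img_b)], dim=1)
        return torch.sigmoid(self.decode(fused_feat))


def fusion_loss(fused, src_a, src_b, positive, w_pos: float = 0.1):
    """Similarity loss between the fused image and both source images, plus a
    positive-sample term (here an L1 distance to a reference `positive` image).
    The exact form and weighting in the paper may differ."""
    similarity = F.l1_loss(fused, src_a) + F.l1_loss(fused, src_b)
    positive_term = F.l1_loss(fused, positive)
    return similarity + w_pos * positive_term


if __name__ == "__main__":
    net = TinyFusionNet()
    a = torch.rand(1, 1, 128, 128)  # e.g. infrared input
    b = torch.rand(1, 1, 128, 128)  # e.g. visible input
    fused = net(a, b)
    loss = fusion_loss(fused, a, b, positive=b)
    print(fused.shape, float(loss))
```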
Pages: 198-211
Number of pages: 14
Related Papers
50 records in total
  • [41] Cross-modal Graph Matching Network for Image-text Retrieval
    Cheng, Yuhao
    Zhu, Xiaoguang
    Qian, Jiuchao
    Wen, Fei
    Liu, Peilin
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2022, 18 (04)
  • [42] Cross-Modal Collaborative Evolution Reinforced by Semantic Coupling for Image Registration and Fusion
    Xiong, Yan
    Kong, Jun
    Zhang, Yunde
    Lu, Ming
    Jiang, Min
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2025, 74
  • [43] Semantic-Enhanced Cross-Modal Fusion for Improved Unsupervised Image Captioning
    Xiang, Nan
    Chen, Ling
    Liang, Leiyan
    Rao, Xingdi
    Gong, Zehao
    ELECTRONICS, 2023, 12 (17)
  • [44] Cross-modal fusion for multi-label image classification with attention mechanism
    Wang, Yangtao
    Xie, Yanzhao
    Zeng, Jiangfeng
    Wang, Hanpin
    Fan, Lisheng
    Song, Yufan
COMPUTERS & ELECTRICAL ENGINEERING, 2022, 101
  • [46] CMAF: Cross-modal Augmentation via Fusion for Underwater Acoustic Image Recognition
    Yang, Shih-Wei
    Shen, Li-Hsiang
    Shuai, Hong-Han
    Feng, Kai-Ten
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (05)
  • [47] Fast Graph Convolution Network Based Multi-label Image Recognition via Cross-modal Fusion
    Wang, Yangtao
    Xie, Yanzhao
    Liu, Yu
    Zhou, Ke
    Li, Xiaocui
    CIKM '20: PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON INFORMATION & KNOWLEDGE MANAGEMENT, 2020, : 1575 - 1584
  • [48] Cross-modal discriminant adversarial network
    Hu, Peng
    Peng, Xi
    Zhu, Hongyuan
    Lin, Jie
    Zhen, Liangli
    Wang, Wei
    Peng, Dezhong
    PATTERN RECOGNITION, 2021, 112
  • [49] Cross-modal attention fusion network for RGB-D semantic segmentation
    Zhao, Qiankun
    Wan, Yingcai
    Xu, Jiqian
    Fang, Lijin
    NEUROCOMPUTING, 2023, 548
  • [50] Emotional computing based on cross-modal fusion and edge network data incentive
    Ma, Lei
    Ju, Feng
    Wan, Jing
    Shen, Xiaoyan
    PERSONAL AND UBIQUITOUS COMPUTING, 2019, 23 (3-4) : 363 - 372