A light-weight, efficient, and general cross-modal image fusion network

Cited by: 22
Authors
Fang, Aiqing [1]
Zhao, Xinbo [1]
Yang, Jiaqi [1]
Qin, Beibei [1]
Zhang, Yanning [1]
Affiliations
[1] Northwestern Polytech Univ, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Sch Comp Sci, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Deep learning; Collaborative optimization; Image quality; Optimization; INFORMATION; EXTRACTION; FRAMEWORK;
DOI
10.1016/j.neucom.2021.08.044
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing cross-modal image fusion methods pay limited research attention to image fusion efficiency and network architecture. However, the efficiency and accuracy of image fusion have an important impact on practical applications. To solve this problem, we propose a light-weight, efficient, and general cross-modal image fusion network, termed AE-Netv2. Firstly, we analyze the influence of different network architectures (e.g., group convolution, depth-wise convolution, inceptionNet, squeezeNet, shuffleNet, and multi-scale module) on image fusion quality and efficiency, which provides a reference for the design of image fusion architecture. Secondly, we explore the commonness and characteristics of different image fusion tasks, which provides a research basis for further research on the continuous learning characteristics of the human brain. Finally, positive sample loss is added to the similarity loss to reduce the difference of data distribution of different cross-modal image fusion tasks. Comprehensive experiments demonstrate the superiority of our method compared to state-of-the-art methods in different fusion tasks at a real-time speed of 100+ FPS on GTX 2070. Compared with the fastest image fusion method based on deep learning, the efficiency of AE-Netv2 is improved by 2.14 times. Compared with the image fusion model with the smallest model size, the size of our model is reduced by 11.59 times. © 2021 Elsevier B.V. All rights reserved.
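As a rough illustration of why the architectures named in the abstract (group convolution, depth-wise convolution) matter for model size, the sketch below counts convolution weights for three layer types. The channel count (64) and kernel size (3) are assumptions chosen for illustration only; they are not taken from the paper, and the code does not reproduce AE-Netv2 itself.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution layer (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise k x k convolution followed by a 1 x 1 point-wise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one k x k filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # mixes channels
    return depthwise + pointwise


# Illustrative sizes (assumed, not from the paper).
standard = conv_params(64, 64, 3)                  # 64 * 64 * 9
grouped = conv_params(64, 64, 3, groups=4)         # 16 * 64 * 9
separable = depthwise_separable_params(64, 64, 3)  # 64 * 9 + 64 * 64

print(standard, grouped, separable)
```

Under these assumed sizes, grouping by 4 cuts the weight count by a factor of 4, and the depth-wise separable layer is smaller still. How such savings trade off against fusion quality is the question the abstract says the paper studies.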
Pages: 198-211
Number of pages: 14