TUFusion: A Transformer-Based Universal Fusion Algorithm for Multimodal Images

Cited by: 3
Authors
Zhao, Yangyang [1]
Zheng, Qingchun [2, 3]
Zhu, Peihao [2, 3]
Zhang, Xu [1]
Ma, Wenpeng [2, 3]
Affiliations
[1] Tianjin Univ Technol, Sch Comp Sci & Engn, Tianjin 300384, Peoples R China
[2] Tianjin Univ Technol, Sch Mech Engn, Tianjin Key Lab Adv Mechatron Syst Design & Intell, Tianjin 300384, Peoples R China
[3] Tianjin Univ Technol, Natl Demonstrat Ctr Expt Mech & Elect Engn Educ, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multimodal fusion; image fusion; transformer; hybrid structure; fusion strategy; GENERATIVE ADVERSARIAL NETWORK; NEST;
DOI
10.1109/TCSVT.2023.3296745
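Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract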
Multimodal image fusion is an important research direction within multimodal fusion. By combining complementary multimodal images, it enables image and data enhancement and is widely used in medicine, industry, security and fire protection, autonomous driving, and consumer electronics. In this work, we propose a transformer-based universal fusion (TUFusion) algorithm with multidomain fusion capability. Its advantages lie in a hybrid transformer and convolutional neural network (CNN) encoder structure and a new composite attention fusion strategy, which together integrate global and local information. Compared with classical state-of-the-art multimodal image fusion methods, experimental results on multidomain data sets show that TUFusion has a degree of universality in image fusion. TUFusion also achieves good values on peak signal-to-noise ratio (PSNR), root mean square error (RMSE), and structural similarity index measure (SSIM). The code of the TUFusion algorithm is available at https://github.com/windrunners/TUFusion.
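
For readers unfamiliar with two of the metrics named in the abstract, the following is a minimal sketch of RMSE and PSNR using their standard textbook definitions. It is not taken from the TUFusion repository; the function names `rmse` and `psnr`, the nested-list image representation, and the default peak value of 255 (8-bit grayscale) are assumptions made for illustration only.

```python
import math


def rmse(reference, fused):
    """Root mean square error between two equally sized grayscale images.

    Images are given as nested lists of pixel intensities (an assumed,
    illustrative representation).
    """
    ref = [p for row in reference for p in row]
    fus = [p for row in fused for p in row]
    if not ref or len(ref) != len(fus):
        raise ValueError("images must be non-empty and the same size")
    squared_error = sum((a - b) ** 2 for a, b in zip(ref, fus))
    return math.sqrt(squared_error / len(ref))


def psnr(reference, fused, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value=255 assumes 8-bit images."""
    err = rmse(reference, fused)
    if err == 0:
        return math.inf  # identical images: no noise to measure
    return 20.0 * math.log10(max_value / err)
```

Lower RMSE and higher PSNR both indicate that the fused image is closer to the reference. SSIM is omitted here because its usual form relies on local windowed statistics that do not fit in a short sketch.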
Pages: 1712-1725
Page count: 14