CDDFuse: Correlation-Driven Dual-Branch Feature Decomposition for Multi-Modality Image Fusion

Cited by: 252

Authors:

Zhao, Zixiang [1,2]

Bai, Haowen [1]

Zhang, Jiangshe [1]

Zhang, Yulun [2]

Xu, Shuang [3,4]

Lin, Zudi [5]

Timofte, Radu [2,6]

Van Gool, Luc [2]

Affiliations:

[1] Xi An Jiao Tong Univ, Xian, Peoples R China

[2] Swiss Fed Inst Technol, Comp Vis Lab, Zurich, Switzerland

[3] Northwestern Polytech Univ Shenzhen, Inst Res & Dev, Shenzhen, Peoples R China

[4] Northwestern Polytech Univ, Xian, Peoples R China

[5] Harvard Univ, Cambridge, MA, USA

[6] Univ Wurzburg, Wurzburg, Germany

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

NETWORK;

DOI:

10.1109/CVPR52729.2023.00572

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Multi-modality (MM) image fusion aims to render fused images that maintain the merits of different modalities, e.g., functional highlight and detailed textures. To tackle the challenge in modeling cross-modality features and decomposing desirable modality-specific and modality-shared features, we propose a novel Correlation-Driven feature Decomposition Fusion (CDDFuse) network. Firstly, CDDFuse uses Restormer blocks to extract cross-modality shallow features. We then introduce a dual-branch Transformer-CNN feature extractor with Lite Transformer (LT) blocks leveraging long-range attention to handle low-frequency global features and Invertible Neural Networks (INN) blocks focusing on extracting high-frequency local information. A correlation-driven loss is further proposed to make the low-frequency features correlated while the high-frequency features un-correlated based on the embedded information. Then, the LT-based global fusion and INN-based local fusion layers output the fused image. Extensive experiments demonstrate that our CDDFuse achieves promising results in multiple fusion tasks, including infrared-visible image fusion and medical image fusion. We also show that CDDFuse can boost the performance in downstream infrared-visible semantic segmentation and object detection in a unified benchmark. The code is available at https://github.com/Zhaozixiang1228/MMIF-CDDFuse.
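
The abstract describes a correlation-driven loss that pushes the low-frequency features of the two modalities to be correlated and the high-frequency features to be uncorrelated, but this record does not give the formula. The following is a minimal sketch of one way such an objective could be written. It assumes the features are flattened to plain lists of numbers, uses the Pearson correlation coefficient, and combines the two terms as a ratio with a constant eps that keeps the denominator positive. The function names, the ratio form, and the default value of eps are illustrative assumptions, not the authors' implementation.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # A constant feature vector has no defined correlation; treat it as 0.
    return cov / denom if denom > 0 else 0.0


def decomposition_loss(low_a, low_b, high_a, high_b, eps=1.01):
    """Hypothetical correlation-driven loss (assumed form, for illustration).

    low_a, low_b:   flattened low-frequency features of the two modalities
    high_a, high_b: flattened high-frequency features of the two modalities
    eps:            assumed constant > 1 so the denominator stays positive
                    even when the low-frequency correlation is -1
    """
    cc_low = pearson(low_a, low_b)     # should be pushed toward +1
    cc_high = pearson(high_a, high_b)  # should be pushed toward 0
    return cc_high ** 2 / (cc_low + eps)


# Desired case: shared low-frequency content, unrelated high-frequency detail.
good = decomposition_loss([1, 2, 3, 4], [1, 2, 3, 4],
                          [1, -1, 1, -1], [1, 1, -1, -1])

# Undesired case: opposed low-frequency content, identical high-frequency detail.
bad = decomposition_loss([1, 2, 3, 4], [4, 3, 2, 1],
                         [1, -1, 1, -1], [1, -1, 1, -1])
```

Under these assumptions the loss is small when the low-frequency features agree and the high-frequency features are uncorrelated, and it grows when the situation is reversed, so minimizing it moves the features in the direction the abstract describes.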


Pages: 5906-5916

Page count: 11

50 items in total

[31] Concept-Driven Multi-Modality Fusion for Video Search
Wei, Xiao-Yong
Jiang, Yu-Gang
Ngo, Chong-Wah
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2011, 21 (01) : 62 - 73
[32] DCTNet: A Heterogeneous Dual-Branch Multi-Cascade Network for Infrared and Visible Image Fusion
Li, Jinfu
Liu, Lei
Song, Hong
Huang, Yuqi
Jiang, Junjun
Yang, Jian
IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
[33] A Bilevel Integrated Model With Data-Driven Layer Ensemble for Multi-Modality Image Fusion
Liu, Risheng
Liu, Jinyuan
Jiang, Zhiying
Fan, Xin
Luo, Zhongxuan
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 1261 - 1274
[35] CoCoNet: Coupled Contrastive Learning Network with Multi-level Feature Ensemble for Multi-modality Image Fusion
Liu, Jinyuan
Lin, Runjia
Wu, Guanyao
Liu, Risheng
Luo, Zhongxuan
Fan, Xin
INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024, 132 (05) : 1748 - 1775
[36] Multi-interactive Feature Learning and a Full-time Multi-modality Benchmark for Image Fusion and Segmentation
Liu, Jinyuan
Liu, Zhu
Wu, Guanyao
Ma, Long
Liu, Risheng
Zhong, Wei
Luo, Zhongxuan
Fan, Xin
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 8081 - 8090
[37] DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion
Zhao, Zixiang
Bai, Haowen
Zhu, Yuanzhi
Zhang, Jiangshe
Xu, Shuang
Zhang, Yulun
Zhang, Kai
Meng, Deyu
Timofte, Radu
Van Gool, Luc
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 8048 - 8059
[38] Classification of hyperspectral image based on dual-branch feature interaction network
Li, Chenming
Wang, Xiangyi
Chen, Zhonghao
Gao, Hongmin
Xu, Shufang
INTERNATIONAL JOURNAL OF REMOTE SENSING, 2022, 43 (09) : 3258 - 3279
[39] Fast saliency-aware multi-modality image fusion
Han, Jungong
Pauwels, Eric J.
de Zeeuw, Paul
NEUROCOMPUTING, 2013, 111 : 70 - 80
[40] Lymphatic flow mapping utilizing multi-modality image fusion
Vicic, M
Thorstad, W
Low, D
Deasy, J
MEDICAL PHYSICS, 2004, 31 (06) : 1900 - 1900
