Multi-modal degradation feature learning for unified image restoration based on contrastive learning

Cited: 0
Authors
Chen, Lei [1 ]
Xiong, Qingbo [1 ]
Zhang, Wei [1 ,2 ]
Liang, Xiaoli [1 ]
Gan, Zhihua [1 ]
Li, Liqiang [3 ]
He, Xin [1 ]
Affiliations
[1] Henan Univ, Sch Software, Jinming Rd, Kaifeng 475004, Peoples R China
[2] China Univ Labor Relat, Sch Appl Technol, Zengguang Rd, Beijing 100048, Peoples R China
[3] Shangqiu Normal Univ, Sch Phys, Shangqiu 476000, Peoples R China
Funding
National Science Foundation (US);
Keywords
Unified image restoration; Multi-modal features; Contrastive learning; Deep learning;
DOI
10.1016/j.neucom.2024.128955
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the unified image restoration challenge by reframing it as a contrastive-learning-based classification problem. Although deep learning methods have made significant strides in improving image restoration quality, their limited capacity to generalize across diverse degradation types and intensities necessitates training a separate model for each specific degradation scenario. We propose an all-encompassing approach that can restore images from various unknown corruption types and levels. We devise a method that learns representations of the latent sharp image's degradation and accompanying textual features (such as dataset categories and image content descriptions), converting these into prompts that are then embedded within a reconstruction network to enhance cross-database restoration performance. This culminates in a unified image reconstruction framework. The study involves two stages. In the first stage, we design MultiContentNet, which learns multi-modal features (MMFs) of the latent sharp image. This network encodes the visual degradation expressions and contextual text features into latent variables, thereby exerting a guided classification effect. Specifically, MultiContentNet is trained as an auxiliary controller that takes the degraded input image and, through contrastive learning, extracts the MMFs of the latent target image, effectively generating natural classifiers tailored to different degradation types. The second stage integrates the learned MMFs into an image restoration network via cross-attention mechanisms, guiding the restoration model to learn high-fidelity image recovery. Experiments conducted on six blind image restoration tasks demonstrate that the proposed method achieves state-of-the-art performance, highlighting the potential significance of large-scale pretrained vision-language models' MMFs in advancing high-quality unified image reconstruction.
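The first stage described in the abstract relies on a contrastive objective that pulls together features of images sharing a degradation type and pushes apart features of other degradations, so that the learned MMFs act as natural classifiers. The paper's MultiContentNet is not reproduced here; the snippet below is only a minimal NumPy sketch of a generic InfoNCE-style loss of the kind commonly used for such training. The function name `info_nce_loss` and all feature vectors are illustrative assumptions, not the authors' code.

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE loss for one anchor feature: pull the positive
    (same degradation type) close, push the negatives away."""
    def cos(a, b):
        return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

    # Similarity logits: positive pair first, then all negative pairs.
    logits = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits /= temperature
    # Softmax cross-entropy with the positive at index 0.
    exp = np.exp(logits - logits.max())
    return -np.log(exp[0] / exp.sum())

rng = np.random.default_rng(0)
# Two correlated "blur" features and four unrelated "noise" features.
blur_a = rng.normal(size=128)
blur_b = blur_a + 0.1 * rng.normal(size=128)
noise = [rng.normal(size=128) for _ in range(4)]

loss_good = info_nce_loss(blur_a, blur_b, noise)            # correct positive
loss_bad = info_nce_loss(blur_a, noise[0], [blur_b] + noise[1:])  # wrong positive
```

Minimizing this loss over many (anchor, positive, negatives) triples clusters features by degradation type, which is the classification effect the abstract attributes to the learned MMFs; `loss_good` is much smaller than `loss_bad` because the correct positive is far more similar to its anchor.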
Pages: 11
Related papers
50 in total
  • [21] Collaborative denoised graph contrastive learning for multi-modal recommendation
    Xu, Fuyong
    Zhu, Zhenfang
    Fu, Yixin
    Wang, Ru
    Liu, Peiyu
    INFORMATION SCIENCES, 2024, 679
  • [22] MULTI-MODAL IMAGE PROCESSING BASED ON COUPLED DICTIONARY LEARNING
    Song, Pingfan
    Rodrigues, Miguel R. D.
    2018 IEEE 19TH INTERNATIONAL WORKSHOP ON SIGNAL PROCESSING ADVANCES IN WIRELESS COMMUNICATIONS (SPAWC), 2018, : 356 - 360
  • [23] Deep Feature Correlation Learning for Multi-Modal Remote Sensing Image Registration
    Quan, Dou
    Wang, Shuang
    Gu, Yu
    Lei, Ruiqi
    Yang, Bowu
    Wei, Shaowei
    Hou, Biao
    Jiao, Licheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [24] Multi-modal Learning for Social Image Classification
    Liu, Chunyang
    Zhang, Xu
    Li, Xiong
    Li, Rui
    Zhang, Xiaoming
    Chao, Wenhan
    2016 12TH INTERNATIONAL CONFERENCE ON NATURAL COMPUTATION, FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (ICNC-FSKD), 2016, : 1174 - 1179
  • [25] CrossCLR: Cross-modal Contrastive Learning For Multi-modal Video Representations
    Zolfaghari, Mohammadreza
    Zhu, Yi
    Gehler, Peter
    Brox, Thomas
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 1430 - 1439
  • [26] Citrus Huanglongbing Detection Based on Multi-Modal Feature Fusion Learning
    Yang, Dongzi
    Wang, Fengcheng
    Hu, Yuqi
    Lan, Yubin
    Deng, Xiaoling
    FRONTIERS IN PLANT SCIENCE, 2021, 12
  • [27] Unified feature extraction framework based on contrastive learning
    Zhang, Hongjie
    Qiang, Wenwen
    Zhang, Jinxin
    Chen, Yingyi
    Jing, Ling
    KNOWLEDGE-BASED SYSTEMS, 2022, 258
  • [28] Towards Accurate and Robust Multi-Modal Medical Image Registration Using Contrastive Metric Learning
    Hu, Jinrong
    Sun, Shanhui
    Yang, Xiaodong
    Zhou, Shuang
    Wang, Xin
    Fu, Ying
    Zhou, Jiliu
    Yin, Youbing
    Cao, Kunlin
    Song, Qi
    Wu, Xi
    IEEE ACCESS, 2019, 7 : 132816 - 132827
  • [29] Nodule-CLIP: Lung nodule classification based on multi-modal contrastive learning
    Sun, L.
    Zhang, M.
    Lu, Y.
    Zhu, W.
    Yi, Y.
    Yan, F.
    COMPUTERS IN BIOLOGY AND MEDICINE, 175
  • [30] Multi-Modal Transportation Recommendation with Unified Route Representation Learning
    Liu, Hao
    Han, Jindong
    Fu, Yanjie
    Zhou, Jingbo
    Lu, Xinjiang
    Xiong, Hui
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2020, 14 (03): : 342 - 350