Multi-Modal Convolutional Dictionary Learning

Cited by: 30
Authors
Gao, Fangyuan [1 ]
Deng, Xin [1 ]
Xu, Mai [2 ]
Xu, Jingyi [2 ]
Dragotti, Pier Luigi [3 ]
Affiliations
[1] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[2] Beihang Univ, Dept Elect Informat Engn, Beijing 100191, Peoples R China
[3] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2AZ, England
Funding
Beijing Natural Science Foundation;
Keywords
Dictionaries; Training; Memory management; Noise level; Toy manufacturing industry; Standards; Paints; Multi-modal dictionary learning; convolutional sparse coding; image denoising; IMAGE SUPERRESOLUTION; LOW-RANK; SPARSE; TRANSFORM;
DOI
10.1109/TIP.2022.3141251
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional dictionary learning has become increasingly popular in signal and image processing for its ability to overcome the limitations of traditional patch-based dictionary learning. Although most studies on convolutional dictionary learning focus on the unimodal case, real-world image processing tasks usually involve images from multiple modalities, e.g., visible and near-infrared (NIR) images. Thus, it is necessary to explore convolutional dictionary learning across different modalities. In this paper, we propose a novel multi-modal convolutional dictionary learning algorithm, which efficiently correlates different image modalities and fully considers neighborhood information at the image level. In this model, each modality is represented by two convolutional dictionaries, one for common feature representation and the other for unique feature representation. The model is constrained by the requirement that the convolutional sparse representations (CSRs) for the common features be the same across different modalities, since these images are captured from the same scene. We propose a new training method based on the alternating direction method of multipliers (ADMM) to alternately learn the common and unique dictionaries in the discrete Fourier transform (DFT) domain. We show that our model converges in fewer than 20 iterations of alternating between the convolutional dictionary update and the CSR computation. The effectiveness of the proposed dictionary learning algorithm is demonstrated on various multi-modal image processing tasks, where it achieves better performance than both dictionary learning methods and deep learning based methods with limited training data.
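As a rough illustration of the model structure described in the abstract (a sketch, not the authors' implementation), the synthesis side of the multi-modal model can be written in NumPy: each modality is reconstructed from a common dictionary convolved with codes that are shared across modalities, plus a unique dictionary convolved with modality-specific codes, with all circular convolutions evaluated as pointwise products in the DFT domain. Signal length, filter counts, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 64, 4  # illustrative: 1D signal length and filters per dictionary

# Common dictionaries D_c[k] and unique dictionaries D_u[k] for two
# modalities k = 0 (e.g., visible) and k = 1 (e.g., NIR).
D_c = [rng.standard_normal((M, N)) for _ in range(2)]
D_u = [rng.standard_normal((M, N)) for _ in range(2)]

# Common codes Z_c are SHARED across modalities (the model's key
# constraint); unique codes Z_u[k] are per-modality.
Z_c = rng.standard_normal((M, N))
Z_u = [rng.standard_normal((M, N)) for _ in range(2)]

def synthesize(k):
    """Reconstruct modality k: sum_m d_c[m] * z_c[m] + sum_m d_u[m] * z_u[m],
    where * is circular convolution, computed as pointwise products of DFTs."""
    Xc = np.sum(np.fft.fft(D_c[k], axis=1) * np.fft.fft(Z_c, axis=1), axis=0)
    Xu = np.sum(np.fft.fft(D_u[k], axis=1) * np.fft.fft(Z_u[k], axis=1), axis=0)
    return np.real(np.fft.ifft(Xc + Xu))

x_visible, x_nir = synthesize(0), synthesize(1)
```

In the paper's ADMM training, these dictionaries and codes would be alternately updated (rather than fixed random arrays as here); the sketch only shows why the DFT domain is convenient: convolution becomes elementwise multiplication, so each frequency bin decouples.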
Pages: 1325-1339
Page count: 15