Multi-Modal Convolutional Dictionary Learning

Cited by: 30
Authors
Gao, Fangyuan [1 ]
Deng, Xin [1 ]
Xu, Mai [2 ]
Xu, Jingyi [2 ]
Dragotti, Pier Luigi [3 ]
Affiliations
[1] Beihang Univ, Sch Cyber Sci & Technol, Beijing 100191, Peoples R China
[2] Beihang Univ, Dept Elect Informat Engn, Beijing 100191, Peoples R China
[3] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2AZ, England
Funding
Beijing Natural Science Foundation;
Keywords
Dictionaries; Training; Memory management; Noise level; Toy manufacturing industry; Standards; Paints; Multi-modal dictionary learning; convolutional sparse coding; image denoising; IMAGE SUPERRESOLUTION; LOW-RANK; SPARSE; TRANSFORM;
DOI
10.1109/TIP.2022.3141251
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional dictionary learning has become increasingly popular in signal and image processing for its ability to overcome the limitations of traditional patch-based dictionary learning. Although most studies on convolutional dictionary learning focus on the unimodal case, real-world image processing tasks usually involve images from multiple modalities, e.g., visible and near-infrared (NIR) images. Thus, it is necessary to explore convolutional dictionary learning across different modalities. In this paper, we propose a novel multi-modal convolutional dictionary learning algorithm, which efficiently correlates different image modalities and fully exploits neighborhood information at the image level. In this model, each modality is represented by two convolutional dictionaries: one for common feature representation and the other for unique feature representation. The model is constrained by the requirement that the convolutional sparse representations (CSRs) of the common features be the same across modalities, since these images are captured from the same scene. We propose a new training method based on the alternating direction method of multipliers (ADMM) to alternately learn the common and unique dictionaries in the discrete Fourier transform (DFT) domain. We show that our model converges in fewer than 20 iterations of alternating between the convolutional dictionary update and the CSR calculation. The effectiveness of the proposed dictionary learning algorithm is demonstrated on various multi-modal image processing tasks, where it achieves better performance than both dictionary learning methods and deep learning based methods with limited training data.
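The common/unique decomposition described in the abstract can be sketched on toy 1-D signals. The sketch below is a simplified stand-in, not the paper's method: it uses proximal-gradient (ISTA) iterations for the sparse coding step rather than the paper's ADMM solver, fixed random filter banks rather than learned dictionaries, and 1-D signals rather than images. All names (`multimodal_csc`, `Dc`, `Du1`, `Zc`, ...) are illustrative. It shows the two structural ideas: convolutions evaluated in the DFT domain, and a single set of common coefficient maps `Zc` shared by both modalities while `Z1`, `Z2` stay modality-specific.

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding: the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def conv_fft(D, Z, n):
    """Sum over filters of circular convolutions d_k * z_k, via the DFT."""
    return np.fft.irfft((np.fft.rfft(D, n=n, axis=1) *
                         np.fft.rfft(Z, axis=1)).sum(axis=0), n=n)

def corr_fft(D, r, n):
    """Adjoint of conv_fft: per-filter correlation of residual r with D."""
    return np.fft.irfft(np.conj(np.fft.rfft(D, n=n, axis=1)) *
                        np.fft.rfft(r), n=n)

def multimodal_csc(x1, x2, Dc, Du1, Du2, lam=0.01, step=0.01, iters=300):
    """Jointly sparse-code two modalities x1, x2 (1-D, equal length).

    Each modality is modeled as Dc * Zc + Du_i * Z_i: the common maps Zc
    are shared across both modalities, the unique maps Z1, Z2 are not.
    ISTA iterations; all convolutions are computed in the DFT domain.
    """
    n = len(x1)
    Zc = np.zeros((Dc.shape[0], n))
    Z1 = np.zeros((Du1.shape[0], n))
    Z2 = np.zeros((Du2.shape[0], n))
    for _ in range(iters):
        r1 = conv_fft(Dc, Zc, n) + conv_fft(Du1, Z1, n) - x1
        r2 = conv_fft(Dc, Zc, n) + conv_fft(Du2, Z2, n) - x2
        # The shared common maps receive the gradient from both modalities.
        Zc = soft(Zc - step * (corr_fft(Dc, r1, n) + corr_fft(Dc, r2, n)),
                  step * lam)
        Z1 = soft(Z1 - step * corr_fft(Du1, r1, n), step * lam)
        Z2 = soft(Z2 - step * corr_fft(Du2, r2, n), step * lam)
    return Zc, Z1, Z2

# Toy demo: two noisy observations of the same underlying scene signal.
rng = np.random.default_rng(0)
n = 64

def unit_bank(k, m):
    D = rng.standard_normal((k, m))
    return D / np.linalg.norm(D, axis=1, keepdims=True)

Dc, Du1, Du2 = unit_bank(2, 5), unit_bank(2, 5), unit_bank(2, 5)
common = np.sin(np.linspace(0.0, 4.0 * np.pi, n))
x1 = common + 0.3 * rng.standard_normal(n)
x2 = common + 0.3 * rng.standard_normal(n)
Zc, Z1, Z2 = multimodal_csc(x1, x2, Dc, Du1, Du2)
```

In the full algorithm the dictionaries themselves would also be updated (the alternation the abstract describes); here they are held fixed so the sketch isolates the shared-CSR constraint.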
Pages: 1325-1339 (15 pages)