Contrastive Language-Image Pre-Training with Knowledge Graphs

Cited by: 0

Authors:

Pan, Xuran [1]

Ye, Tianzhu [1]

Han, Dongchen [1]

Song, Shiji [1]

Huang, Gao [1]

Affiliations:

[1] Tsinghua Univ, Dept Automat, BNRist, Beijing, Peoples R China

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022) | 2022

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Recent years have witnessed the rapid development of large-scale pre-training frameworks that can extract multi-modal representations in a unified form and achieve promising performance when transferred to downstream tasks. Nevertheless, existing approaches mainly focus on pre-training with simple image-text pairs, while neglecting the semantic connections between concepts from different modalities. In this paper, we propose a knowledge-based pre-training framework, dubbed Knowledge-CLIP, which injects semantic information into the widely used CLIP model [38]. By introducing knowledge-based objectives in the pre-training process and utilizing different types of knowledge graphs as training data, our model can semantically align the representations in vision and language with higher quality, and enhance the reasoning ability across scenarios and modalities. Extensive experiments on various vision-language downstream tasks demonstrate the effectiveness of Knowledge-CLIP compared with the original CLIP and competitive baselines.

Pages: 16

50 entries in total

[1] UniCLIP: Unified Framework for Contrastive Language-Image Pre-training
Lee, Janghyeon
Kim, Jongsuk
Shon, Hyounguk
Kim, Bumsoo
Kim, Seung Hwan
Lee, Honglak
Kim, Junmo
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
[2] ARCHICLIP Enhanced Contrastive Language-Image Pre-training Model With Architectural Prior Knowledge
Xia, Shengtao
Cheng, Yiming
Tian, Runjia
[J]. PROCEEDINGS OF THE 29TH INTERNATIONAL CONFERENCE OF THE ASSOCIATION FOR COMPUTER-AIDED ARCHITECTURAL DESIGN RESEARCH IN ASIA, CAADRIA 2024, VOL 1, 2024, : 69 - 78
[3] Non-Contrastive Learning Meets Language-Image Pre-Training
Zhou, Jinghao
Dong, Li
Gan, Zhe
Wang, Lijuan
Wei, Furu
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 11028 - 11038
[4] Grounded Language-Image Pre-training
Li, Liunian Harold
Zhang, Pengchuan
Zhang, Haotian
Yang, Jianwei
Li, Chunyuan
Zhong, Yiwu
Wang, Lijuan
Yuan, Lu
Zhang, Lei
Hwang, Jenq-Neng
Chang, Kai-Wei
Gao, Jianfeng
[J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 10955 - 10965
[5] iCLIP: Bridging Image Classification and Contrastive Language-Image Pre-training for Visual Recognition
Wei, Yixuan
Cao, Yue
Zhang, Zheng
Peng, Houwen
Yao, Zhuliang
Xie, Zhenda
Hu, Han
Guo, Baining
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2776 - 2786
[6] Data Determines Distributional Robustness in Contrastive Language-Image Pre-training (CLIP)
Fang, Alex
Ilharco, Gabriel
Wortsman, Mitchell
Wan, Yuhao
Shankar, Vaishaal
Dave, Achal
Schmidt, Ludwig
[J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
[7] RA-CLIP: Retrieval Augmented Contrastive Language-Image Pre-training
Xie, Chen-Wei
Sun, Siyang
Xiong, Xiong
Zheng, Yun
Zhao, Deli
Zhou, Jingren
[J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 19265 - 19274
[8] Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks
Yang, Wenhan
Gao, Jingdong
Mirzasoleiman, Baharan
[J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
[9] PMC-CLIP: Contrastive Language-Image Pre-training Using Biomedical Documents
Lin, Weixiong
Zhao, Ziheng
Zhang, Xiaoman
Wu, Chaoyi
Zhang, Ya
Wang, Yanfeng
Xie, Weidi
[J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT VIII, 2023, 14227 : 525 - 536
[10] Multimodal Hate Speech Detection in Memes Using Contrastive Language-Image Pre-Training
Arya, Greeshma
Hasan, Mohammad Kamrul
Bagwari, Ashish
Safie, Nurhizam
Islam, Shayla
Ahmed, Fatima Rayan Awad
De, Aaishani
Khan, Muhammad Attique
Ghazal, Taher M.
[J]. IEEE ACCESS, 2024, 12 : 22359 - 22375
