Survey on Vision-language Pre-training

Authors
Yin, Jiong [1]
Zhang, Zhe-Dong [3]
Gao, Yu-Han [2, 3]
Yang, Zhi-Wen [1]
Li, Liang [4]
Xiao, Mang [5]
Sun, Yao-Qi [3]
Yan, Cheng-Gang [3]
Affiliations
[1] College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, 310018, China
[2] Lishui Institute of Hangzhou Dianzi University, Lishui, 323000, China
[3] School of Automation, Hangzhou Dianzi University, Hangzhou, 210016, China
[4] Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
[5] Sir Run Run Shaw Hospital, College of Medicine, Zhejiang University, Hangzhou, 310016, China
Source
Ruan Jian Xue Bao/Journal of Software | 2023, Vol. 34, No. 05
Keywords
Learning algorithms; Learning systems; Natural language processing systems
DOI
10.13328/j.cnki.jos.006774
Pages: 2000-2023