Prompt learning in computer vision: a survey

Cited by: 2
Authors
Lei, Yiming [1]
Li, Jingqi [1]
Li, Zilong [1]
Cao, Yuan [1]
Shan, Hongming [2, 3, 4]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200438, Peoples R China
[2] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
[3] Fudan Univ, MOE Frontiers Ctr Brain Sci, Shanghai 200433, Peoples R China
[4] Shanghai Ctr Brain Sci & Brain Inspired Technol, Shanghai 201210, Peoples R China
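Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China; Natural Science Foundation of Shanghai;
Keywords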
Prompt learning; Visual prompt tuning (VPT); Image generation; Image classification; Artificial intelligence generated content (AIGC);
DOI
10.1631/FITEE.2300389
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Prompt learning has attracted broad attention in computer vision since the rapid rise of large pre-trained vision-language models (VLMs). Building on the close relationship between visual and language information that VLMs establish, prompt learning has become a crucial technique in many important applications, such as artificial intelligence generated content (AIGC). In this survey, we provide a progressive and comprehensive review of visual prompt learning as it relates to AIGC. We begin by introducing VLMs, the foundation of visual prompt learning. Then, we review visual prompt learning methods and prompt-guided generative models, and discuss how to improve the efficiency of adapting AIGC models to specific downstream tasks. Finally, we outline some promising research directions concerning prompt learning.
Pages: 42-63
Page count: 22