Prompt learning in computer vision: a survey

Cited by: 2
Authors
Lei, Yiming [1]
Li, Jingqi [1]
Li, Zilong [1]
Cao, Yuan [1]
Shan, Hongming [2, 3, 4]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200438, Peoples R China
[2] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Shanghai 200433, Peoples R China
[3] Fudan Univ, MOE Frontiers Ctr Brain Sci, Shanghai 200433, Peoples R China
[4] Shanghai Ctr Brain Sci & Brain Inspired Technol, Shanghai 201210, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation; Natural Science Foundation of Shanghai
Keywords
Prompt learning; Visual prompt tuning (VPT); Image generation; Image classification; Artificial intelligence generated content (AIGC)
DOI
10.1631/FITEE.2300389
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Prompt learning has attracted broad attention in computer vision since the large pre-trained vision-language models (VLMs) exploded. Based on the close relationship between vision and language information built by VLM, prompt learning becomes a crucial technique in many important applications such as artificial intelligence generated content (AIGC). In this survey, we provide a progressive and comprehensive review of visual prompt learning as related to AIGC. We begin by introducing VLM, the foundation of visual prompt learning. Then, we review the vision prompt learning methods and prompt-guided generative models, and discuss how to improve the efficiency of adapting AIGC models to specific downstream tasks. Finally, we provide some promising research directions concerning prompt learning.
Pages: 42-63
Page count: 22