Interpreting Art by Leveraging Pre-Trained Models

Cited by: 0
Authors
Penzel, Niklas [1 ]
Denzler, Joachim [1 ]
Affiliations
[1] Friedrich Schiller Univ Jena, Ernst Abbe Pl 2, D-07743 Jena, Germany
Keywords
DOI
10.23919/MVA57639.2023.10216010
CLC Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In many domains, so-called foundation models were recently proposed. These models are trained on immense amounts of data, resulting in impressive performance on various downstream tasks and benchmarks. Later works focus on leveraging this pre-trained knowledge by combining these models. To reduce data and compute requirements, we utilize and combine foundation models in two ways. First, we use language and vision models to extract and generate a challenging language-vision task in the form of artwork interpretation pairs. Second, we combine and fine-tune CLIP as well as GPT-2 to reduce compute requirements for training interpretation models. We perform a qualitative and quantitative analysis of our data and conclude that generating artwork leads to improvements in visual-text alignment and, therefore, to more proficient interpretation models. Our approach addresses how to leverage and combine pre-trained models to tackle tasks where existing data is scarce or difficult to obtain.
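The abstract describes combining a pre-trained CLIP encoder with GPT-2 to train interpretation models at low compute cost, but it does not spell out the wiring. Below is a minimal, hypothetical sketch of one common way to couple the two models (a prefix-projection setup in the spirit of ClipCap, using Hugging Face transformers). The class name, prefix length, and the choice to freeze CLIP are illustrative assumptions, not the authors' published architecture.

```python
# Hypothetical sketch (not the paper's exact architecture): condition GPT-2 on a
# frozen CLIP image embedding via a learned linear projection ("prefix" tokens).
import torch
import torch.nn as nn
from transformers import CLIPVisionModel, GPT2LMHeadModel


class ClipGpt2Interpreter(nn.Module):
    def __init__(self, prefix_len: int = 10):
        super().__init__()
        self.clip = CLIPVisionModel.from_pretrained("openai/clip-vit-base-patch32")
        self.gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")
        self.prefix_len = prefix_len
        # Map the pooled CLIP embedding to `prefix_len` pseudo-token embeddings.
        self.project = nn.Linear(
            self.clip.config.hidden_size, prefix_len * self.gpt2.config.n_embd
        )
        # Freeze CLIP to keep compute low; only GPT-2 and the projection are trained.
        for p in self.clip.parameters():
            p.requires_grad = False

    def forward(self, pixel_values, input_ids, labels=None):
        # Pooled CLIP image embedding -> sequence of prefix embeddings for GPT-2.
        pooled = self.clip(pixel_values=pixel_values).pooler_output
        prefix = self.project(pooled).view(-1, self.prefix_len, self.gpt2.config.n_embd)
        token_embeds = self.gpt2.transformer.wte(input_ids)
        inputs_embeds = torch.cat([prefix, token_embeds], dim=1)
        if labels is not None:
            # Mask the prefix positions so they do not contribute to the LM loss.
            ignore = torch.full(
                (labels.size(0), self.prefix_len), -100,
                dtype=labels.dtype, device=labels.device,
            )
            labels = torch.cat([ignore, labels], dim=1)
        return self.gpt2(inputs_embeds=inputs_embeds, labels=labels)
```

Training such a model would pair artwork images (preprocessed with a CLIP image processor) with tokenized interpretation texts and minimize the returned language-modeling loss; generation then conditions GPT-2 on the projected image prefix.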
Pages: 6
Related Papers
50 items in total
  • [1] Leveraging Pre-trained Language Models for Gender Debiasing
    Jain, Nishtha; Popovic, Maja; Groves, Declan; Specia, Lucia
    LREC 2022: Thirteenth International Conference on Language Resources and Evaluation, 2022: 2188-2195
  • [2] Leveraging pre-trained language models for code generation
    Soliman, Ahmed; Shaheen, Samir; Hadhoud, Mayada
    Complex & Intelligent Systems, 2024, 10(3): 3955-3980
  • [3] Leveraging Pre-trained BERT for Audio Captioning
    Liu, Xubo; Mei, Xinhao; Huang, Qiushi; Sun, Jianyuan; Zhao, Jinzheng; Liu, Haohe; Plumbley, Mark D.; Kilic, Volkan; Wang, Wenwu
    2022 30th European Signal Processing Conference (EUSIPCO 2022), 2022: 1145-1149
  • [4] Leveraging Pre-Trained Embeddings for Welsh Taggers
    Ezeani, Ignatius M.; Piao, Scott; Neale, Steven; Rayson, Paul; Knight, Dawn
    4th Workshop on Representation Learning for NLP (RepL4NLP-2019), 2019: 270-280
  • [5] Leveraging pre-trained language models for mining microbiome-disease relationships
    Karkera, Nikitha; Acharya, Sathwik; Palaniappan, Sucheendra K.
    BMC Bioinformatics, 2023, 24(1)
  • [6] CreativeBot: a Creative Storyteller Agent Developed by Leveraging Pre-trained Language Models
    Elgarf, Maha; Peters, Christopher
    2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022: 13438-13444
  • [7] Leveraging Pre-trained CNN Models for Skeleton-Based Action Recognition
    Laraba, Sohaib; Tilmanne, Joelle; Dutoit, Thierry
    Computer Vision Systems (ICVS 2019), 2019, 11754: 612-626
  • [8] Leveraging pre-trained Segmentation Networks for Anomaly Segmentation
    Rippel, Oliver; Merhof, Dorit
    2021 26th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA), 2021
  • [9] Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
    Rothe, Sascha; Narayan, Shashi; Severyn, Aliaksei
    Transactions of the Association for Computational Linguistics, 2020, 8: 264-280