MyVLM: Personalizing VLMs for User-Specific Queries

Cited by: 0
Authors
Alaluf, Yuval [1, 2]
Richardson, Elad [2]
Tulyakov, Sergey [1]
Aberman, Kfir [1]
Cohen-Or, Daniel [1, 2]
Affiliations
[1] Snap Inc, Santa Monica, CA 90405 USA
[2] Tel Aviv Univ, Tel Aviv, Israel
Source
Funding
Israel Science Foundation;
Keywords
Vision-Language Models; Personalization;
DOI
10.1007/978-3-031-72624-8_5
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Recent large-scale vision-language models (VLMs) have demonstrated remarkable capabilities in understanding and generating textual descriptions for visual content. However, these models lack an understanding of user-specific concepts. In this work, we take a first step toward the personalization of VLMs, enabling them to learn and reason over user-provided concepts. For example, we explore whether these models can learn to recognize you in an image and communicate what you are doing, tailoring the model to reflect your personal experiences and relationships. To effectively recognize a variety of user-specific concepts, we augment the VLM with external concept heads that function as toggles for the model, enabling the VLM to identify the presence of specific target concepts in a given image. Having recognized the concept, we learn a new concept embedding in the intermediate feature space of the VLM. This embedding is tasked with guiding the language model to naturally integrate the target concept in its generated response. We apply our technique to BLIP-2 and LLaVA for personalized image captioning and further show its applicability for personalized visual question-answering. Our experiments demonstrate our ability to generalize to unseen images of learned concepts while preserving the model behavior on unrelated inputs. Code and data will be made available upon acceptance.
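The two-stage idea in the abstract (a concept head that toggles on when a target concept is detected, followed by a learned concept embedding placed in the VLM's intermediate features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify the form of the concept heads or how the embedding is injected, so the linear head, the threshold, the `ConceptHead` and `personalize` names, and the choice to append the embedding to the visual tokens are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


@dataclass
class ConceptHead:
    """Hypothetical concept head: a linear detector plus a learned embedding."""
    name: str
    weights: Vector          # assumed linear classifier over an image feature
    bias: float
    embedding: Vector        # stand-in for the learned concept embedding
    threshold: float = 0.0   # assumed decision threshold

    def is_present(self, image_feature: Vector) -> bool:
        # Acts as the "toggle": fires only when the score clears the threshold.
        return dot(self.weights, image_feature) + self.bias > self.threshold


def personalize(visual_tokens: List[Vector],
                image_feature: Vector,
                heads: List[ConceptHead]) -> List[Vector]:
    """Return the visual tokens, extended with the embedding of each detected concept.

    Appending is an illustrative choice, not a mechanism stated in the abstract.
    """
    out = [list(t) for t in visual_tokens]  # copy so the input is left unchanged
    for head in heads:
        if head.is_present(image_feature):
            out.append(list(head.embedding))
    return out


head = ConceptHead(name="my-dog", weights=[1.0, -1.0], bias=0.0, embedding=[0.5, 0.5])
tokens = [[0.1, 0.2]]
with_concept = personalize(tokens, [2.0, 0.5], [head])     # head fires
without_concept = personalize(tokens, [0.0, 1.0], [head])  # head stays off
```

In this sketch, an image without the concept passes through with its tokens unchanged, which loosely mirrors the abstract's stated goal of preserving model behavior on unrelated inputs.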
Pages: 73-91
Page count: 19