MyVLM: Personalizing VLMs for User-Specific Queries

Authors
Alaluf, Yuval [1, 2]
Richardson, Elad [2]
Tulyakov, Sergey [1]
Aberman, Kfir [1]
Cohen-Or, Daniel [1, 2]
Affiliations
[1] Snap Inc, Santa Monica, CA 90405 USA
[2] Tel Aviv Univ, Tel Aviv, Israel
Source
Funding
Israel Science Foundation
Keywords
Vision-Language Models; Personalization
DOI
10.1007/978-3-031-72624-8_5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Recent large-scale vision-language models (VLMs) have demonstrated remarkable capabilities in understanding and generating textual descriptions for visual content. However, these models lack an understanding of user-specific concepts. In this work, we take a first step toward the personalization of VLMs, enabling them to learn and reason over user-provided concepts. For example, we explore whether these models can learn to recognize you in an image and communicate what you are doing, tailoring the model to reflect your personal experiences and relationships. To effectively recognize a variety of user-specific concepts, we augment the VLM with external concept heads that function as toggles for the model, enabling the VLM to identify the presence of specific target concepts in a given image. Having recognized the concept, we learn a new concept embedding in the intermediate feature space of the VLM. This embedding is tasked with guiding the language model to naturally integrate the target concept in its generated response. We apply our technique to BLIP-2 and LLaVA for personalized image captioning and further show its applicability for personalized visual question-answering. Our experiments demonstrate our ability to generalize to unseen images of learned concepts while preserving the model behavior on unrelated inputs. Code and data will be made available upon acceptance.
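The gating idea in the abstract (a concept head acts as a toggle, and a concept embedding is added only when the head fires) can be illustrated with a toy sketch. This is not the authors' implementation: the linear head, the sigmoid threshold, the function names `concept_present` and `personalize_tokens`, and the choice to append the embedding to a list of visual tokens are all assumptions made for illustration, and the training of the embedding is omitted entirely.

```python
import math

def concept_present(image_feat, head_w, head_b, threshold=0.5):
    """Hypothetical linear concept head: True when the sigmoid score exceeds the threshold."""
    logit = sum(f * w for f, w in zip(image_feat, head_w)) + head_b
    score = 1.0 / (1.0 + math.exp(-logit))
    return score > threshold

def personalize_tokens(visual_tokens, image_feat, head_w, head_b, concept_embedding):
    """Append the concept embedding to the visual tokens only when the head fires.

    When the head does not fire, the tokens are returned unchanged, which mirrors
    the stated goal of preserving behavior on unrelated inputs.
    """
    if concept_present(image_feat, head_w, head_b):
        return list(visual_tokens) + [concept_embedding]
    return list(visual_tokens)

# Toy data (assumed): four 3-dimensional visual tokens and one concept embedding.
visual_tokens = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0, 1.1, 1.2]]
concept_embedding = [9.0, 9.0, 9.0]
head_w, head_b = [1.0, 1.0, 1.0], 0.0

feat_with_concept = [1.0, 1.0, 1.0]       # logit = 3, head fires
feat_without_concept = [-1.0, -1.0, -1.0]  # logit = -3, head stays off

tokens_on = personalize_tokens(visual_tokens, feat_with_concept, head_w, head_b, concept_embedding)
tokens_off = personalize_tokens(visual_tokens, feat_without_concept, head_w, head_b, concept_embedding)
print(len(tokens_on), len(tokens_off))
```

In this sketch the language model would receive `tokens_on` for an image containing the concept and the untouched `tokens_off` otherwise; how the real system injects the embedding into BLIP-2 or LLaVA is not specified in this record.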
Pages: 73-91
Page count: 19