MyVLM: Personalizing VLMs for User-Specific Queries

Cited by: 0
Authors
Alaluf, Yuval [1, 2]
Richardson, Elad [2]
Tulyakov, Sergey [1]
Aberman, Kfir [1]
Cohen-Or, Daniel [1, 2]
Affiliations
[1] Snap Inc, Santa Monica, CA 90405 USA
[2] Tel Aviv Univ, Tel Aviv, Israel
Source
Funding
Israel Science Foundation
Keywords
Vision-Language Models; Personalization
DOI
10.1007/978-3-031-72624-8_5
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent large-scale vision-language models (VLMs) have demonstrated remarkable capabilities in understanding and generating textual descriptions for visual content. However, these models lack an understanding of user-specific concepts. In this work, we take a first step toward the personalization of VLMs, enabling them to learn and reason over user-provided concepts. For example, we explore whether these models can learn to recognize you in an image and communicate what you are doing, tailoring the model to reflect your personal experiences and relationships. To effectively recognize a variety of user-specific concepts, we augment the VLM with external concept heads that function as toggles for the model, enabling the VLM to identify the presence of specific target concepts in a given image. Having recognized the concept, we learn a new concept embedding in the intermediate feature space of the VLM. This embedding is tasked with guiding the language model to naturally integrate the target concept in its generated response. We apply our technique to BLIP-2 and LLaVA for personalized image captioning and further show its applicability for personalized visual question-answering. Our experiments demonstrate our ability to generalize to unseen images of learned concepts while preserving the model behavior on unrelated inputs. Code and data will be made available upon acceptance.
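The two-stage idea in the abstract (a concept head that toggles personalization on, followed by a learned concept embedding placed among the VLM's intermediate features) can be illustrated with a minimal sketch. This is not the authors' implementation: the linear head, the sigmoid threshold, the choice to append the embedding to the visual tokens, and every name below (`concept_head`, `personalize_tokens`, `weights`, `bias`, `concept_embedding`) are hypothetical and chosen only for illustration.

```python
import math
from typing import List

Vector = List[float]


def concept_head(image_feature: Vector, weights: Vector, bias: float,
                 threshold: float = 0.5) -> bool:
    """Hypothetical linear concept head.

    Returns True when the (assumed) sigmoid score for the target concept
    reaches the threshold, i.e. the concept is judged present in the image.
    """
    logit = sum(w * x for w, x in zip(weights, image_feature)) + bias
    prob = 1.0 / (1.0 + math.exp(-logit))
    return prob >= threshold


def personalize_tokens(visual_tokens: List[Vector], image_feature: Vector,
                       weights: Vector, bias: float,
                       concept_embedding: Vector) -> List[Vector]:
    """Append the concept embedding only when the head fires.

    Appending is an assumption of this sketch; the abstract only says the
    embedding lives in the VLM's intermediate feature space.
    """
    if concept_head(image_feature, weights, bias):
        return visual_tokens + [concept_embedding]
    # Concept absent: leave the features untouched, so behavior on
    # unrelated inputs is unchanged.
    return list(visual_tokens)


# Toy usage with made-up 2-D features.
tokens = [[0.0, 1.0]]
with_concept = personalize_tokens(tokens, [1.0, 0.0], [4.0, 0.0], -2.0, [0.5, 0.5])
without_concept = personalize_tokens(tokens, [0.0, 0.0], [4.0, 0.0], -2.0, [0.5, 0.5])
```

Gating on the head's decision is what keeps the model's output unchanged for images that do not contain the concept, which matches the abstract's claim about preserving behavior on unrelated inputs.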
Pages: 73-91
Page count: 19