InteraRec: Interactive Recommendations Using Multimodal Large Language Models

Cited by: 2
Authors
Karra, Saketh Reddy [1]
Tulabandhula, Theja [1]
Affiliation
[1] University of Illinois Chicago, Chicago, IL 60607, USA
Keywords
Large language models; Screenshots; User preferences; Recommendations
DOI
10.1007/978-981-97-2650-9_3
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Numerous recommendation algorithms leverage weblogs, employing strategies such as collaborative filtering, content-based filtering, and hybrid methods to provide personalized recommendations to users. Weblogs, which record user activities on a website, offer valuable insights into user preferences, behavior, and interests. Despite the wealth of information they contain, extracting relevant features from weblogs requires extensive feature engineering, and the intricate nature of the data makes it difficult to interpret, especially for non-experts. Weblogs also often fail to capture the visual details and contextual nuances that influence user choices. In this study, we introduce InteraRec, an interactive recommendation framework that diverges from conventional approaches relying exclusively on weblogs. InteraRec captures high-frequency screenshots of web pages as users navigate a website and, leveraging multimodal large language models (MLLMs), distills user preferences from these screenshots into a user profile summary. The framework then extracts the relevant information from this summary to generate optimal recommendations. Through extensive experiments, we demonstrate the effectiveness of our recommendation system in providing users with valuable and personalized offerings.
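The abstract describes a two-stage pipeline: an MLLM first condenses browsing screenshots into a user profile summary, and recommendations are then generated from that summary. The Python sketch below is a minimal illustration, not the authors' implementation: it assumes screenshots have already been captured to disk by a separate browser-automation step, uses the OpenAI chat-completions API with image inputs as a stand-in MLLM, and the model name, prompts, and all function names are illustrative assumptions.

```python
import base64
from pathlib import Path

from openai import OpenAI  # assumes the `openai` Python client is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def encode_screenshot(path: Path) -> str:
    """Base64-encode one captured screenshot for the multimodal API."""
    return base64.b64encode(path.read_bytes()).decode("utf-8")


def summarize_user_profile(screenshot_dir: str) -> str:
    """Stage 1 (sketch): distill browsing screenshots into a profile summary."""
    images = [
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{encode_screenshot(p)}"},
        }
        for p in sorted(Path(screenshot_dir).glob("*.png"))
    ]
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice of multimodal model
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "These are screenshots of a user browsing a shopping "
                            "site. Summarize the user's apparent preferences "
                            "(categories, price range, brands, styles)."
                        ),
                    },
                    *images,
                ],
            }
        ],
    )
    return response.choices[0].message.content


def recommend_from_summary(summary: str, catalog: list[str]) -> str:
    """Stage 2 (sketch): turn the profile summary into recommendations."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": (
                    f"User profile summary:\n{summary}\n\n"
                    "Candidate items:\n" + "\n".join(catalog) +
                    "\n\nRecommend the three best-matching items and justify each."
                ),
            }
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    profile = summarize_user_profile("screenshots/")
    print(recommend_from_summary(
        profile,
        ["Trail running shoes", "Leather office bag", "Wireless earbuds"],
    ))
```

The abstract's phrase "optimal recommendations" suggests a constrained selection step after summarization; the plain ranking prompt in the second stage above stands in for whatever optimization the authors actually use.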
Pages: 32-43
Page count: 12
Related papers
50 records in total
  • [31] Woodpecker: hallucination correction for multimodal large language models
    Yin, Shukang
    Fu, Chaoyou
    Zhao, Sirui
    Xu, Tong
    Wang, Hao
    Sui, Dianbo
    Shen, Yunhang
    Li, Ke
    Sun, Xing
    Chen, Enhong
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (12) : 52 - 64
  • [32] Do multimodal large language models understand welding?
    Khvatskii, Grigorii
    Lee, Yong Suk
    Angst, Corey
    Gibbs, Maria
    Landers, Robert
    Chawla, Nitesh V.
    INFORMATION FUSION, 2025, 120
  • [34] Do Multimodal Large Language Models and Humans Ground Language Similarly?
    Jones, Cameron R.
    Bergen, Benjamin
    Trott, Sean
    COMPUTATIONAL LINGUISTICS, 2024, 50 (04) : 1415 - 1440
  • [35] Computing Architecture for Large-Language Models (LLMs) and Large Multimodal Models (LMMs)
    Liang, Bor-Sung
    PROCEEDINGS OF THE 2024 INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN, ISPD 2024, 2024, : 233 - 234
  • [36] LMEye: An Interactive Perception Network for Large Language Models
    Li, Yunxin
    Hu, Baotian
    Chen, Xinyu
    Ma, Lin
    Xu, Yong
    Zhang, Min
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 10952 - 10964
  • [37] Harnessing multimodal approaches for depression detection using large language models and facial expressions
    Sadeghi, Misha
    Richer, Robert
    Egger, Bernhard
    Schindler-Gmelch, Lena
    Rupp, Lydia Helene
    Rahimi, Farnaz
    Berking, Matthias
    Eskofier, Bjoern M.
    NPJ MENTAL HEALTH RESEARCH, 3 (1)
  • [38] Interactive computer-aided diagnosis on medical image using large language models
    Wang, Sheng
    Zhao, Zihao
    Ouyang, Xi
    Liu, Tianming
    Wang, Qian
    Shen, Dinggang
    COMMUNICATIONS ENGINEERING, 3 (1)
  • [39] LLMR: Real-time Prompting of Interactive Worlds using Large Language Models
    De la Torre, Fernanda
    Fang, Cathy Mengying
    Huang, Han
    Banburski-Fahey, Andrzej
    Fernandez, Judith Amores
    Lanier, Jaron
    PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024
  • [40] SEED-Bench: Benchmarking Multimodal Large Language Models
    Li, Bohao
    Ge, Yuying
    Ge, Yixiao
    Wang, Guangzhi
    Wang, Rui
    Zhang, Ruimao
    Shan, Ying
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13299 - 13308