InteraRec: Interactive Recommendations Using Multimodal Large Language Models

Cited by: 2
Authors
Karra, Saketh Reddy [1]
Tulabandhula, Theja [1]
Affiliation
[1] University of Illinois Chicago, Chicago, IL 60607, USA
Keywords
Large language models; Screenshots; User preferences; Recommendations
DOI
10.1007/978-981-97-2650-9_3
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Numerous recommendation algorithms leverage weblogs, employing strategies such as collaborative filtering, content-based filtering, and hybrid methods to provide personalized recommendations to users. Weblogs, which record user activities on a website, offer valuable insights into user preferences, behavior, and interests. Despite the wealth of information they contain, extracting relevant features from weblogs requires extensive feature engineering, and the intricate nature of the data makes it difficult to interpret, especially for non-experts. Weblogs also often fail to capture the visual details and contextual nuances that influence user choices. In this study, we introduce InteraRec, an interactive recommendation framework that diverges from conventional approaches relying exclusively on weblogs. InteraRec captures high-frequency screenshots of web pages as users navigate a website and, leveraging multimodal large language models (MLLMs), distills user preferences from these screenshots into a user profile summary. The framework then extracts the relevant information from this summary to generate optimal recommendations. Through extensive experiments, we demonstrate the effectiveness of our recommendation system in providing users with valuable and personalized offerings.
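The abstract describes a two-stage pipeline: an MLLM first condenses browsing screenshots into a user profile summary, and recommendations are then generated from that summary. The Python sketch below is a minimal illustration, not the authors' implementation: it assumes screenshots have already been captured to disk by a separate browser-automation step, uses the OpenAI chat-completions API with image inputs as a stand-in MLLM, and the model name, prompts, and all function names are illustrative assumptions.

```python
import base64
from pathlib import Path

from openai import OpenAI  # assumes the `openai` Python client is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def encode_screenshot(path: Path) -> str:
    """Base64-encode one captured screenshot for the multimodal API."""
    return base64.b64encode(path.read_bytes()).decode("utf-8")


def summarize_user_profile(screenshot_dir: str) -> str:
    """Stage 1 (sketch): distill browsing screenshots into a profile summary."""
    images = [
        {
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{encode_screenshot(p)}"},
        }
        for p in sorted(Path(screenshot_dir).glob("*.png"))
    ]
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice of multimodal model
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": (
                            "These are screenshots of a user browsing a shopping "
                            "site. Summarize the user's apparent preferences "
                            "(categories, price range, brands, styles)."
                        ),
                    },
                    *images,
                ],
            }
        ],
    )
    return response.choices[0].message.content


def recommend_from_summary(summary: str, catalog: list[str]) -> str:
    """Stage 2 (sketch): turn the profile summary into recommendations."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "user",
                "content": (
                    f"User profile summary:\n{summary}\n\n"
                    "Candidate items:\n" + "\n".join(catalog) +
                    "\n\nRecommend the three best-matching items and justify each."
                ),
            }
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    profile = summarize_user_profile("screenshots/")
    print(recommend_from_summary(
        profile,
        ["Trail running shoes", "Leather office bag", "Wireless earbuds"],
    ))
```

The abstract's phrase "optimal recommendations" suggests a constrained selection step after summarization; the plain ranking prompt in the second stage above stands in for whatever optimization the authors actually use.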
Pages: 32-43
Page count: 12
Related papers
50 records in total
  • [31] Woodpecker: hallucination correction for multimodal large language models
    Yin, Shukang
    Fu, Chaoyou
    Zhao, Sirui
    Xu, Tong
    Wang, Hao
    Sui, Dianbo
    Shen, Yunhang
    Li, Ke
    Sun, Xing
    Chen, Enhong
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (12) : 52 - 64
  • [32] Do multimodal large language models understand welding?
    Khvatskii, Grigorii
    Lee, Yong Suk
    Angst, Corey
    Gibbs, Maria
    Landers, Robert
    Chawla, Nitesh V.
    INFORMATION FUSION, 2025, 120
  • [34] Do Multimodal Large Language Models and Humans Ground Language Similarly?
    Jones, Cameron R.
    Bergen, Benjamin
    Trott, Sean
    COMPUTATIONAL LINGUISTICS, 2024, 50 (04) : 1415 - 1440
  • [35] Computing Architecture for Large-Language Models (LLMs) and Large Multimodal Models (LMMs)
    Liang, Bor-Sung
    PROCEEDINGS OF THE 2024 INTERNATIONAL SYMPOSIUM ON PHYSICAL DESIGN, ISPD 2024, 2024, : 233 - 234
  • [36] LMEye: An Interactive Perception Network for Large Language Models
    Li, Yunxin
    Hu, Baotian
    Chen, Xinyu
    Ma, Lin
    Xu, Yong
    Zhang, Min
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 10952 - 10964
  • [37] Harnessing multimodal approaches for depression detection using large language models and facial expressions
    Sadeghi, Misha
    Richer, Robert
    Egger, Bernhard
    Schindler-Gmelch, Lena
    Rupp, Lydia Helene
    Rahimi, Farnaz
    Berking, Matthias
    Eskofier, Bjoern M.
    NPJ MENTAL HEALTH RESEARCH, 3 (1)
  • [38] Interactive computer-aided diagnosis on medical image using large language models
    Wang, Sheng
    Zhao, Zihao
    Ouyang, Xi
    Liu, Tianming
    Wang, Qian
    Shen, Dinggang
    COMMUNICATIONS ENGINEERING, 3 (1)
  • [39] LLMR: Real-time Prompting of Interactive Worlds using Large Language Models
    De la Torre, Fernanda
    Fang, Cathy Mengying
    Huang, Han
    Banburski-Fahey, Andrzej
    Fernandez, Judith Amores
    Lanier, Jaron
    PROCEEDINGS OF THE 2024 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS (CHI 2024), 2024
  • [40] SEED-Bench: Benchmarking Multimodal Large Language Models
    Li, Bohao
    Ge, Yuying
    Ge, Yixiao
    Wang, Guangzhi
    Wang, Rui
    Zhang, Ruimao
    Shan, Ying
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 13299 - 13308