InteraRec: Interactive Recommendations Using Multimodal Large Language Models

Cited by: 2
Authors:
Karra, Saketh Reddy [1]
Tulabandhula, Theja [1]
Affiliation:
[1] Univ Illinois, Chicago, IL 60607 USA
Keywords:
Large language models; Screenshots; User preferences; Recommendations
DOI:
10.1007/978-981-97-2650-9_3
CLC classification:
TP18 [Artificial Intelligence Theory]
Discipline codes:
081104; 0812; 0835; 1405
Abstract:
Numerous recommendation algorithms leverage weblogs, employing strategies such as collaborative filtering, content-based filtering, and hybrid methods to provide personalized recommendations to users. Weblogs, which comprise records of user activity on a website, offer valuable insights into user preferences, behavior, and interests. Despite the wealth of information weblogs provide, extracting relevant features from them requires extensive feature engineering, and the intricate nature of the data makes it difficult for non-experts to interpret. Weblogs also often fail to capture the visual details and contextual nuances that influence user choices. In the present study, we introduce InteraRec, an interactive recommendation framework that diverges from conventional approaches relying exclusively on weblogs. InteraRec generates recommendations by capturing high-frequency screenshots of web pages as users navigate a website. Leveraging multimodal large language models (MLLMs), we extract insights into user preferences from these screenshots in the form of a user profile summary, and the framework then extracts relevant information from that summary to generate optimal recommendations. Through extensive experiments, we demonstrate the effectiveness of our recommendation system in providing users with valuable and personalized offerings.
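The abstract describes a two-stage pipeline: an MLLM turns browsing screenshots into a user profile summary, and a downstream step turns that summary into ranked recommendations. The sketch below illustrates that flow in Python under stated assumptions: the MLLM call is stubbed with a hard-coded JSON profile (a real system would send the screenshots to a multimodal model), and all function, item, and field names (`summarize_profile`, `preferred_categories`, `price_ceiling`, the sample catalog) are hypothetical, not from the paper.

```python
import json

def summarize_profile(screenshots):
    """Stub for the MLLM stage: in the framework described above,
    high-frequency screenshots of browsed pages are summarized by a
    multimodal LLM into a user profile. Here we return a fixed JSON
    summary so the pipeline runs self-contained."""
    return json.dumps({
        "preferred_categories": ["sneakers", "running gear"],
        "price_ceiling": 120.0,
    })

def recommend(catalog, profile_json, k=3):
    """Downstream stage: extract constraints from the profile summary
    and rank catalog items — in-category, affordable items first,
    cheaper items breaking ties."""
    profile = json.loads(profile_json)
    cats = set(profile["preferred_categories"])
    ceiling = profile["price_ceiling"]

    def score(item):
        in_cat = item["category"] in cats
        affordable = item["price"] <= ceiling
        return (in_cat, affordable, -item["price"])

    return sorted(catalog, key=score, reverse=True)[:k]

catalog = [
    {"name": "Trail runner", "category": "sneakers",     "price": 95.0},
    {"name": "Dress shoe",   "category": "formal",       "price": 150.0},
    {"name": "Running sock", "category": "running gear", "price": 12.0},
    {"name": "Racing flat",  "category": "sneakers",     "price": 140.0},
]

profile = summarize_profile(screenshots=["page1.png", "page2.png"])
picks = recommend(catalog, profile)
print([p["name"] for p in picks])
```

The key design point mirrored from the abstract is the intermediate profile summary: because the MLLM output is a small structured object rather than raw weblog features, the ranking stage needs no feature engineering and the profile stays human-readable.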
Pages: 32-43 (12 pages)