Multimodal Recommendation Method Integrating Latent Structures and Semantic Information

Cited by: 0
Authors
Zhang X. [1]
Liang Z. [1]
Yao C. [1]
Li Z. [1]
Affiliation
[1] Institutes of Physical Science and Information Technology, Anhui University, Hefei
Keywords
Contrastive Learning; Graph Neural Network; Multimodal Recommender System; Recommender System
DOI
10.16451/j.cnki.issn1003-6059.202403004
Abstract
Multimodal recommender systems aim to improve recommendation performance by exploiting multimodal information such as textual and visual information. However, existing systems usually either integrate multimodal semantic information into item representations or use multimodal features to search for latent structure, without fully exploiting the correlation between the two. To address this, a multimodal recommendation method integrating latent structures and semantic information is proposed. Based on users' historical behavior and multimodal features, user-user and item-item graphs are constructed to search for latent structure, and user-item bipartite graphs are built to learn users' historical behavior. A graph convolutional network is used to learn the topological structure of the different graphs. To better integrate latent structures and semantic information, contrastive learning is employed to align the learned latent structure representations of items with their original multimodal features. Finally, experiments on three datasets demonstrate the effectiveness of the proposed method. © 2024 Science Press. All rights reserved.
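The abstract names three building blocks for the item side: a latent item-item graph derived from multimodal features, graph convolution over that graph, and a contrastive objective aligning structure-based representations with the original features. The following is a minimal NumPy sketch of how these pieces could fit together, not the authors' implementation. The kNN construction with cosine similarity, the parameter-free LightGCN-style propagation, the InfoNCE form of the loss, and all function names, `k`, `num_layers`, and `temperature` are assumptions made for illustration; the abstract does not specify any of them. The sketch also assumes the modality features have already been projected to the embedding dimension.

```python
import numpy as np

def knn_item_graph(features, k):
    """Symmetrically normalised kNN item-item adjacency from cosine similarity (assumed construction)."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)               # exclude self-matches from the neighbour search
    top_k = np.argsort(-sim, axis=1)[:, :k]      # indices of the k most similar items per row
    adj = np.zeros_like(sim)
    rows = np.repeat(np.arange(sim.shape[0]), k)
    adj[rows, top_k.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)                 # make the graph undirected
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))  # every node has degree >= k, so no division by zero
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def propagate(adj, emb, num_layers):
    """Parameter-free graph convolution: mean of the embeddings across all layers."""
    layers = [emb]
    for _ in range(num_layers):
        layers.append(adj @ layers[-1])
    return np.mean(layers, axis=0)

def info_nce(z_struct, z_feat, temperature=0.2):
    """InfoNCE loss: item i's structure view is the positive for item i's feature view."""
    a = z_struct / np.linalg.norm(z_struct, axis=1, keepdims=True)
    b = z_feat / np.linalg.norm(z_feat, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy data: 6 items with 8-dimensional features (hypothetical values).
rng = np.random.default_rng(0)
feats = rng.normal(size=(6, 8))      # stands in for projected multimodal features
item_emb = rng.normal(size=(6, 8))   # stands in for learnable item ID embeddings
adj = knn_item_graph(feats, k=2)
z = propagate(adj, item_emb, num_layers=2)
loss = info_nce(z, feats)
```

In a trainable model, a loss of this kind would typically be added to a ranking loss learned from the user-item bipartite graph; how the paper weights or combines the terms is not stated in the abstract.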
Pages: 231-241
Page count: 10