An Efficient Distributed-Computing Framework for Association-Rule-Based Recommendation

Cited by: 0
Authors
Li C.-S. [1]
Wu Z.-A. [1, 2]
Zhang L. [1, 2]
Cao J. [1, 2]
Affiliations
[1] School of Information Engineering, Nanjing University of Finance and Economics, Nanjing
[2] Jiangsu Provincial Key Laboratory of E-Business, Nanjing University of Finance and Economics, Nanjing
Funding
National Natural Science Foundation of China
Keywords
Association rules; FP-growth; Frequent patterns; Load balancing; Recommender systems; Spark;
DOI
10.11897/SP.J.1016.2019.01218
Abstract
Association-rule-based recommendation is one of the most widely used commercial recommendation models on e-commerce websites. A variety of techniques, mainly eligible-rule selection and multiple-rule combination, have been developed to produce effective recommendations. Unfortunately, the efficiency of association-rule-based recommendation has received little attention. On real-life online shopping sites, concurrent traffic is usually very high: a vast number of users visit the site simultaneously and keep adding commodities to their baskets. Meanwhile, the number of frequent patterns is usually very large, and the number of association rules derived from them is larger still, because a single pattern can generate several rules. Given the large number of association rules and the concurrent access of users, how to efficiently match users' browsing histories against the large rule set, so as to offer nearly real-time recommendations to massive numbers of online users, has become a vital concern that determines whether the model can be deployed on real-life e-commerce websites. To address this problem, this paper focuses on the efficiency of association-rule-based recommendation and develops a distributed computing framework that improves its computational performance by seamlessly fusing association rule mining with recommendation computing. Firstly, based on a summary of existing rule-based approaches, a tree-type structure called Ordered Patterns Forest (OPF) is designed for the compact representation and storage of frequent patterns, without losing any basic information useful for subsequent recommendation, such as the support of a pattern and of its nested patterns.
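As a rough illustration of storing frequent patterns in a tree so that the support of a pattern and of its nested patterns stays retrievable, the sketch below uses a plain prefix tree over items kept in a fixed order. The abstract does not give the internal layout of the OPF, so `PatternForest`, `Node`, and the confidence-based scoring are hypothetical stand-ins rather than the authors' structure or algorithm, and the item names and support counts are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Set, Tuple


@dataclass
class Node:
    """One item on a path; `support` belongs to the pattern that ends here."""
    support: Optional[int] = None
    children: Dict[str, "Node"] = field(default_factory=dict)


class PatternForest:
    """Hypothetical prefix-tree store for frequent patterns (not the paper's OPF)."""

    def __init__(self) -> None:
        self.root = Node()
        self.items: Set[str] = set()

    def insert(self, pattern: Iterable[str], support: int) -> None:
        node = self.root
        for item in sorted(pattern):  # fixed item order: one path per pattern
            node = node.children.setdefault(item, Node())
            self.items.add(item)
        node.support = support

    def support(self, pattern: Iterable[str]) -> Optional[int]:
        """Follow the pattern's path; return None if it was never stored."""
        node: Optional[Node] = self.root
        for item in sorted(pattern):
            node = node.children.get(item)
            if node is None:
                return None
        return node.support

    def recommend(self, basket: Iterable[str]) -> List[Tuple[str, float]]:
        """Score item x by confidence(basket -> x) = supp(basket + x) / supp(basket)."""
        basket_set = set(basket)
        base = self.support(basket_set)
        if not base:
            return []
        scores = []
        for item in self.items - basket_set:
            joint = self.support(basket_set | {item})
            if joint is not None:
                scores.append((item, joint / base))
        # Highest confidence first; ties broken by item name.
        return sorted(scores, key=lambda s: (-s[1], s[0]))


# Illustrative data only: pattern -> support count.
forest = PatternForest()
for pattern, count in [
    (["bread"], 6), (["milk"], 5), (["butter"], 4),
    (["bread", "milk"], 4), (["bread", "butter"], 3),
    (["bread", "butter", "milk"], 2),
]:
    forest.insert(pattern, count)

print(forest.recommend(["bread"]))
```

Each lookup is a single walk down one path, which is the property that makes tree storage attractive compared with scanning a flat rule list for every basket.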
Secondly, we transform the two-step rule-based recommendation into a series of operations on this data structure, and develop efficient algorithms for these operations, which are responsible for mining eligible patterns and computing recommendation scores. Specifically, we transform candidate rule mining into a path searching problem on the OPF and present a path searching algorithm that runs on a single machine. Thirdly, because real-time recommendation cannot be completed on a single machine, and in order to handle the ever-increasing number of online customers and patterns, we devise a distributed computing framework that includes a novel load-balanced data partitioning strategy, which reduces the running time of the last task to finish and thus further improves overall performance. Finally, we implement the proposed framework and algorithms on Spark, a widely used in-memory distributed computing engine, and evaluate them on three real-world datasets: Accidents, Webdocs, and Amazon. The experimental results demonstrate that the proposed OPF with the path searching algorithm is more than six times as efficient as the traditional brute-force method. Moreover, the proposed load-balanced strategy further improves the performance of the distributed association-rule-based recommendation framework, which achieves nearly linear scalability as the number of computational nodes increases. © 2019, Science Press. All rights reserved.
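The goal of shortening the task that finishes last can be illustrated with a standard greedy heuristic. The abstract does not describe the paper's partitioning strategy, so the sketch below swaps in the well-known longest-processing-time (LPT) rule: sort work units by estimated cost and always give the next one to the least-loaded worker. The function names, the unit names `t1` to `t5`, and their costs are assumptions made for the example.

```python
import heapq
from typing import Dict, List, Tuple


def balance_partitions(costs: Dict[str, int], num_workers: int) -> List[List[str]]:
    """Greedy LPT: give the costliest remaining unit to the least-loaded worker."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Min-heap of (current load, worker id); the lightest worker is popped first.
    heap: List[Tuple[int, int]] = [(0, w) for w in range(num_workers)]
    heapq.heapify(heap)
    partitions: List[List[str]] = [[] for _ in range(num_workers)]
    for unit, cost in sorted(costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, worker = heapq.heappop(heap)
        partitions[worker].append(unit)
        heapq.heappush(heap, (load + cost, worker))
    return partitions


def makespan(costs: Dict[str, int], partitions: List[List[str]]) -> int:
    """Load of the busiest worker, i.e. when the last task finishes."""
    return max(sum(costs[u] for u in part) for part in partitions)


# Hypothetical per-tree cost estimates (e.g. number of stored patterns).
costs = {"t1": 9, "t2": 7, "t3": 6, "t4": 5, "t5": 3}
parts = balance_partitions(costs, 2)
print(parts, makespan(costs, parts))
```

LPT is a heuristic and does not guarantee the optimal split, but it only needs a cost estimate per unit, which makes it a reasonable baseline against which a more specialised strategy could be compared.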
Pages: 1218-1231
Number of pages: 13