Frequent itemsets mining algorithm based on sort tree

Authors
Wang H.-M. [1 ,2 ]
Dang Y.-Y. [1 ]
Hu M. [1 ]
Liu D.-Y. [2 ,3 ]
Affiliations
[1] School of Computer Science and Engineering, Changchun University of Technology, Changchun
[2] College of Computer Science and Technology, Jilin University, Changchun
[3] Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun
Keywords
Ancestor-brother express; Artificial intelligence; Frequent itemsets; Last term pruning; Sort tree
DOI
10.13229/j.cnki.jdxbgxb201604030
Abstract
This paper proposes the concept of a sort tree, uses it to store frequent itemsets, and proves the last-item pruning property. Joining and pruning operations, as in the Apriori algorithm, are implemented with O(1) time complexity. The sort tree is stored using the ancestor-brother express. If an ancestor does not appear in a transaction, all brother nodes that share that ancestor are skipped, which speeds up support counting. Theoretical analysis and experimental results show that the proposed algorithm greatly improves time performance compared with the Apriori algorithm. © 2016, Editorial Board of Jilin University. All rights reserved.
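The skipping idea in the abstract can be illustrated with a minimal sketch. It assumes an ordinary prefix tree of candidate itemsets, not the paper's sort tree or its ancestor-brother storage, whose details the abstract does not give. The names `Node`, `build_tree`, `count_transaction`, and `support_counts` are hypothetical. When an item on a path is missing from a transaction, the whole subtree below it is skipped, so none of the candidates sharing that ancestor are examined.

```python
from typing import Dict, FrozenSet, Iterable, Tuple


class Node:
    """One item in a prefix tree of candidate itemsets (hypothetical layout)."""

    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.is_candidate = False  # True if the path to this node is a candidate
        self.count = 0             # support count for that candidate


def build_tree(candidates: Iterable[Tuple[str, ...]]) -> Node:
    """Insert each candidate with its items in sorted order."""
    root = Node()
    for itemset in candidates:
        node = root
        for item in sorted(itemset):
            node = node.children.setdefault(item, Node())
        node.is_candidate = True
    return root


def count_transaction(node: Node, transaction: FrozenSet[str]) -> None:
    """Add one transaction to the counts, skipping subtrees of absent items."""
    for item, child in node.children.items():
        if item not in transaction:
            continue  # ancestor absent: nothing below it can match
        if child.is_candidate:
            child.count += 1
        count_transaction(child, transaction)


def support_counts(candidates, transactions) -> Dict[Tuple[str, ...], int]:
    """Return the support count of every candidate over all transactions."""
    root = build_tree(candidates)
    for t in transactions:
        count_transaction(root, frozenset(t))

    result: Dict[Tuple[str, ...], int] = {}

    def walk(node: Node, prefix: Tuple[str, ...]) -> None:
        if node.is_candidate:
            result[prefix] = node.count
        for item, child in node.children.items():
            walk(child, prefix + (item,))

    walk(root, ())
    return result


candidates = [("a", "b"), ("a", "c"), ("b", "c")]
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"c"}]
counts = support_counts(candidates, transactions)
```

For the transaction `{"b", "c"}` in this example, the branch rooted at `a` is skipped, so the candidates `("a", "b")` and `("a", "c")` are never visited.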
Pages: 1216-1221
Page count: 5