Fast algorithm for high utility pattern mining with the sum of item quantities

Cited by: 35
Authors
Ryang, Heungmo [1 ]
Yun, Unil [1 ]
Ryu, Keun Ho [2 ]
Affiliations
[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea
[2] Chungbuk Natl Univ, Dept Comp Sci, Cheongju, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Data mining; high utility patterns; single-pass tree construction; tree restructuring; utility mining; FREQUENT ITEMSETS;
DOI
10.3233/IDA-160811
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In frequent pattern mining, all items in a database are treated as equally important, and their occurrences in transactions are represented as binary values. In real-world databases, however, items differ in relative importance and appear in transactions with non-binary values. High utility pattern mining, one of the most essential issues in the pattern mining field, recently emerged to address this limitation of frequent pattern mining. Meanwhile, constructing a tree with a single database scan is significant because a database scan is a time-consuming task. In utility mining, an additional database scan is necessary to identify the actual high utility patterns among the candidates. In this paper, we propose a novel tree structure, namely SIQ-Tree (Sum of Item Quantities), which captures database information in a single pass. Moreover, a restructuring method is suggested, together with strategies for reducing overestimated utilities. The proposed algorithm can construct the SIQ-Tree with only a single scan and, using the reduced overestimated utilities, effectively decrease the number of candidate patterns, which improves mining performance. Experimental results show that our algorithm outperforms a state-of-the-art algorithm in terms of runtime and the number of generated candidates, with similar memory usage.
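To make the terms in the abstract concrete, the following is a minimal sketch of the standard utility definitions used in high utility pattern mining. It is not the SIQ-Tree algorithm: it enumerates itemsets by brute force, and the toy transactions, the `unit_profit` table, and all function names are hypothetical, chosen only for illustration. It shows how an itemset's utility combines item quantities with item importance, and how the transaction-weighted utility (TWU) serves as an overestimate when selecting candidates.

```python
from itertools import combinations

# Hypothetical toy data: relative importance (e.g. unit profit) per item.
unit_profit = {"a": 5, "b": 2, "c": 1}

# Each transaction maps an item to its non-binary quantity.
transactions = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "b": 1, "c": 1},
]

def utility(itemset, tx):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if not all(item in tx for item in itemset):
        return 0
    return sum(tx[item] * unit_profit[item] for item in itemset)

def total_utility(itemset):
    """Actual utility of an itemset over the whole database."""
    return sum(utility(itemset, tx) for tx in transactions)

def transaction_utility(tx):
    """Utility of every item in a transaction taken together."""
    return sum(qty * unit_profit[item] for item, qty in tx.items())

def twu(itemset):
    """Transaction-weighted utility: an upper bound (overestimate)."""
    return sum(
        transaction_utility(tx)
        for tx in transactions
        if all(item in tx for item in itemset)
    )

def high_utility_patterns(min_util):
    """Brute-force enumeration; real algorithms prune with overestimates."""
    items = sorted(unit_profit)
    found = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            u = total_utility(itemset)
            if u >= min_util:
                found[itemset] = u
    return found

print(high_utility_patterns(15))  # {('a',): 15, ('a', 'b'): 21}
print(twu(("a",)))                # 22
```

In this toy data, `{b}` has utility 14 and falls below the threshold of 15, while its superset `{a, b}` has utility 21. Actual utility is therefore not anti-monotone, which is why utility mining methods rely on overestimated utilities to generate candidates and then verify them against the database.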
Pages: 395-415
Number of pages: 21
Related papers
50 in total
  • [1] A Fast Algorithm for Mining High Utility Itemsets
    Shankar, S.
    Purusothaman, T.
    Jayanthi, S.
    Babu, Nishanth
    2009 IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE, VOLS 1-3, 2009, : 1459 - +
  • [2] High-utility pattern mining: A method for discovery of high-utility item sets
    Hu, Jianying
    Mojsilovic, Aleksandra
    PATTERN RECOGNITION, 2007, 40 (11) : 3317 - 3324
  • [3] An Efficient Algorithm for High Utility Sequential Pattern Mining
    Wang, Jun-Zhe
    Yang, Zong-Hua
    Huang, Jiun-Long
    FRONTIER AND INNOVATION IN FUTURE COMPUTING AND COMMUNICATIONS, 2014, 301 : 49 - 56
  • [4] Improved Strategy for High-Utility Pattern Mining Algorithm
    Wang, Le
    Wang, Shui
    Li, Haiyan
    Zhou, Chunliang
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2020, 2020
  • [5] FDHUP: Fast algorithm for mining discriminative high utility patterns
    Lin, Jerry Chun-Wei
    Gan, Wensheng
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Chao, Han-Chieh
    KNOWLEDGE AND INFORMATION SYSTEMS, 2017, 51 (03) : 873 - 909
  • [6] FDHUP: Fast algorithm for mining discriminative high utility patterns
    Jerry Chun-Wei Lin
    Wensheng Gan
    Philippe Fournier-Viger
    Tzung-Pei Hong
    Han-Chieh Chao
    Knowledge and Information Systems, 2017, 51 : 873 - 909
  • [7] A fast algorithm for mining high average-utility itemsets
    Lin, Jerry Chun-Wei
    Ren, Shifeng
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Su, Ja-Hwung
    Vo, Bay
    APPLIED INTELLIGENCE, 2017, 47 (02) : 331 - 346
  • [8] A fast algorithm for mining high average-utility itemsets
    Jerry Chun-Wei Lin
    Shifeng Ren
    Philippe Fournier-Viger
    Tzung-Pei Hong
    Ja-Hwung Su
    Bay Vo
    Applied Intelligence, 2017, 47 : 331 - 346
  • [9] Utility Pattern Mining Algorithm Based on Improved Utility Pattern Tree
    Xing, Shuning
    Liu, Fangai
    Wang, Jiwei
    Pang, Lin
    Xu, Zhenguo
    2015 8TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2015, : 258 - 261
  • [10] A New Algorithm of Mining High Utility Sequential Pattern in Streaming Data
    Tang, Huijun
    Liu, Yangguang
    Wang, Le
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2019, 12 (01) : 342 - 350