Fast algorithm for high utility pattern mining with the sum of item quantities

Cited by: 35
Authors
Ryang, Heungmo [1]
Yun, Unil [1]
Ryu, Keun Ho [2]
Affiliations
[1] Sejong Univ, Dept Comp Engn, Seoul, South Korea
[2] Chungbuk Natl Univ, Dept Comp Sci, Cheongju, South Korea
Funding
National Research Foundation of Singapore
Keywords
Data mining; high utility patterns; single-pass tree construction; tree restructuring; utility mining; FREQUENT ITEMSETS;
DOI
10.3233/IDA-160811
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In frequent pattern mining, all items in a database are treated as equally important, and their occurrences in transactions are represented as binary values. In real-world databases, however, items differ in relative importance and appear in transactions with non-binary quantities. High utility pattern mining, one of the most essential topics in the pattern mining field, emerged recently to address this limitation of frequent pattern mining. Meanwhile, constructing a tree with a single database scan is significant because a database scan is time-consuming. In utility mining, an additional database scan is necessary to identify the actual high utility patterns among the candidates. In this paper, we propose a novel tree structure, namely the SIQ-Tree (Sum of Item Quantities), which captures database information in a single pass. Moreover, a restructuring method is suggested, together with strategies for reducing overestimated utilities. The proposed algorithm can construct the SIQ-Tree with only a single scan and, using the reduced overestimated utilities, effectively decrease the number of candidate patterns, which improves mining performance. Experimental results show that our algorithm outperforms a state-of-the-art algorithm in terms of runtime and the number of generated candidates, with similar memory usage.
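The notions of utility, overestimated utility, and candidate verification mentioned in the abstract can be illustrated with a minimal sketch. It does not implement the SIQ-Tree or the authors' restructuring method; it uses the standard transaction-weighted utility (TWU) as the overestimate, and a brute-force enumeration in place of a tree. The toy database, the profit table, and all function names are assumptions made purely for illustration.

```python
from itertools import combinations

# Hypothetical unit profit per item (its relative importance).
PROFIT = {"a": 5, "b": 2, "c": 1}

# Hypothetical transactions: item -> purchased quantity (non-binary).
DATABASE = [
    {"a": 1, "b": 2},
    {"b": 4, "c": 3},
    {"a": 2, "b": 1, "c": 6},
]


def transaction_utility(tx):
    """Utility of a whole transaction: sum of quantity * unit profit."""
    return sum(qty * PROFIT[item] for item, qty in tx.items())


def itemset_utility(itemset, db):
    """Actual utility: summed over transactions that contain every item."""
    return sum(
        sum(tx[item] * PROFIT[item] for item in itemset)
        for tx in db
        if all(item in tx for item in itemset)
    )


def twu(itemset, db):
    """Transaction-weighted utility: an upper bound on the actual utility."""
    return sum(
        transaction_utility(tx)
        for tx in db
        if all(item in tx for item in itemset)
    )


def mine(db, min_util):
    """Phase 1 keeps itemsets whose overestimate reaches min_util;
    phase 2 rescans the data to keep only the truly high utility ones."""
    items = sorted({item for tx in db for item in tx})
    candidates = [
        itemset
        for size in range(1, len(items) + 1)
        for itemset in combinations(items, size)
        if twu(itemset, db) >= min_util
    ]
    high_utility = {}
    for itemset in candidates:
        utility = itemset_utility(itemset, db)
        if utility >= min_util:
            high_utility[itemset] = utility
    return candidates, high_utility


candidates, high_utility = mine(DATABASE, min_util=19)
print(len(candidates), high_utility)
```

With a threshold of 19, five itemsets pass the overestimate but only two are actual high utility patterns. Tighter overestimates would shrink the candidate set, which is the effect the abstract attributes to its reduction strategies.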
Pages: 395-415
Page count: 21