Mining High Utility Itemsets Using Prefix Trees and Utility Vectors

Cited by: 13
Authors
Qu, Jun-Feng [1]
Fournier-Viger, Philippe [2]
Liu, Mengchi [3]
Hang, Bo [1]
Hu, Chunyang [1]
Affiliations
[1] Hubei Univ Arts & Sci, Sch Comp Engn, Xiangyang 441053, Hubei, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518055, Guangdong, Peoples R China
[3] South China Normal Univ, Sch Comp Sci, Guangzhou Key Lab Big Data & Intelligent Educ, Guangzhou 510631, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
High utility itemset; mining algorithm; prefix tree; utility vector; ALGORITHM; GENERATION; DISCOVERY; PATTERNS;
DOI
10.1109/TKDE.2023.3256126
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
High utility itemsets can reveal combinations of items that have a high profit, expense, or importance. Mining high utility itemsets in a database with n items generally results in a huge search space, composed of 2^n itemsets, and heavy utility calculations for the explored itemsets. Previous algorithms using prefix tree structures perform two phases, namely candidate generation and testing. To avoid generating candidate itemsets, one-phase algorithms use list or hyper-link structures and have been proven to be superior to two-phase algorithms. However, it should be noted that a prefix tree is still an efficient structure for itemset mining problems, and especially algorithms using prefix trees such as FP-Growth have shown excellent performance for mining frequent itemsets. This paper proposes Hamm, a High-performance AlgorithM for Mining high utility itemsets. Hamm employs a novel TV (prefix Tree and utility Vector) structure and mines high utility itemsets in one phase without candidate generation. We also develop an efficient optimization which is incorporated into Hamm as a component. Using prefix trees and utility vectors, Hamm outperforms state-of-the-art algorithms on various databases in experiments. Experimental results also show that the proposed optimization remarkably reduces the search space and speeds up Hamm.
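To make the search-space and utility-calculation remarks in the abstract concrete, the following is a minimal brute-force sketch. It is not Hamm and does not use the TV structure, whose details this record does not give. It assumes the standard definition of utility (purchased quantity times unit profit, summed over every transaction that contains the whole itemset). The toy transactions, the profit table, and the names `utility` and `brute_force_huis` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 2, "d": 1},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def brute_force_huis(transactions, unit_profit, min_util):
    """Enumerate all 2^n - 1 non-empty itemsets and keep those with utility >= min_util."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            u = utility(itemset, transactions, unit_profit)
            if u >= min_util:
                result[itemset] = u
    return result


huis = brute_force_huis(transactions, unit_profit, min_util=10)
print(huis)  # {('a',): 15, ('a', 'c'): 19, ('a', 'b', 'c'): 10}
```

With four items the loop visits 2^4 - 1 = 15 itemsets, and the count doubles with each additional item. In this toy data {a, b} has utility 9 and misses the threshold, while its superset {a, b, c} reaches 10. This shows that utility is not anti-monotone, so the simple support-based pruning used in frequent itemset mining does not carry over directly.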
Pages: 10224-10236
Page count: 13
Related papers
  • [1] Utility-Oriented Gradual Itemsets Mining Using High Utility Itemsets Mining
    Fongue, Audrey
    Lonlac, Jerry
    Tsopze, Norbert
    BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2023, 2023, 14148 : 107 - 113
  • [2] Mining high utility itemsets
    Chan, R
    Yang, Q
    Shen, YD
    THIRD IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2003, : 19 - 26
  • [3] High utility itemsets mining
    Liu, Ying
    Li, Jianwei
    Liao, Wei-Keng
    Choudhary, Alok
    Shi, Yong
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY & DECISION MAKING, 2010, 9 (06) : 905 - 934
  • [4] Mining of high-utility itemsets with negative utility
    Singh, Kuldeep
    Shakya, Harish Kumar
    Singh, Abhimanyu
    Biswas, Bhaskar
    EXPERT SYSTEMS, 2018, 35 (06)
  • [5] Mining high utility itemsets using extended chain structure and utility machine
    Qu, Jun-Feng
    Fournier-Viger, Philippe
    Liu, Mengchi
    Hang, Bo
    Wang, Feng
    KNOWLEDGE-BASED SYSTEMS, 2020, 208
  • [6] Mining summarization of high utility itemsets
    Zhang, Xiong
    Deng, Zhi-Hong
    KNOWLEDGE-BASED SYSTEMS, 2015, 84 : 67 - 77
  • [7] Mining Local High Utility Itemsets
    Fournier-Viger, Philippe
    Zhang, Yimin
    Lin, Jerry Chun-Wei
    Fujita, Hamido
    Koh, Yun Sing
    DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA 2018), PT II, 2018, 11030 : 450 - 460
  • [8] Vertical Mining for High Utility Itemsets
    Song, Wei
    Liu, Yu
    Li, Jinhong
    2012 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC 2012), 2012, : 429 - 434
  • [9] High utility itemsets mining with negative utility value: A survey
    Singh, Kuldeep
    Singh, Shashank Sheshar
    Kumar, Ajay
    Biswas, Bhaskar
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2018, 35 (06) : 6551 - 6562
  • [10] Efficient mining of high-utility itemsets using multiple minimum utility thresholds
    Lin, Jerry Chun-Wei
    Gan, Wensheng
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Zhan, Justin
    KNOWLEDGE-BASED SYSTEMS, 2016, 113 : 100 - 115