Efficient mining of cross-level high-utility itemsets in taxonomy quantitative databases

Cited by: 19
Authors
Tung, N. T. [1]
Nguyen, Loan T. T. [2, 3]
Nguyen, Trinh D. D. [4]
Fournier-Viger, Philippe [5]
Nguyen, Ngoc-Thanh [6]
Vo, Bay [1]
Affiliations
[1] HUTECH Univ, Fac Informat Technol, Ho Chi Minh City, Vietnam
[2] Int Univ, Sch Comp Sci & Engn, Ho Chi Minh City, Vietnam
[3] Vietnam Natl Univ, Ho Chi Minh City, Vietnam
[4] Ind Univ Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City, Vietnam
[5] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[6] Wroclaw Univ Sci & Technol, Dept Appl Informat, Wroclaw, Poland
Keywords
Cross-level itemsets; High-utility itemsets; Taxonomy; Hierarchical database; Data mining; ASSOCIATION RULES; ALGORITHM; PATTERNS; DISCOVERY;
DOI
10.1016/j.ins.2021.12.017
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
In contrast to frequent itemset mining (FIM) algorithms, which focus on identifying itemsets with high occurrence frequency, high-utility itemset (HUI) mining algorithms can reveal the most profitable sets of items in transaction databases. Several algorithms have been proposed to perform this task efficiently. Nevertheless, most of them ignore item categorizations, even though this useful information is provided in many real-world transaction databases. Previous works, such as CLH-Miner and ML-HUI Miner, were proposed to address this limitation by discovering cross-level and multi-level HUIs. However, CLH-Miner has a long runtime and high memory usage. To address these drawbacks, this study extends tight upper bounds to propose effective pruning strategies. A novel algorithm named FEACP (Fast and Efficient Algorithm for Cross-level high-utility Pattern mining) is introduced, which adopts the proposed strategies to efficiently identify cross-level HUIs in taxonomy-based databases. A thorough performance evaluation shows that FEACP can identify useful itemsets of different abstraction levels in transaction databases with high efficiency: it is up to 8 times faster than the state-of-the-art algorithm on the tested sparse databases and up to 177 times faster on the tested dense databases. FEACP also reduces memory usage by up to half compared with the CLH-Miner algorithm. © 2021 Elsevier Inc. All rights reserved.
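To make the notion of a cross-level high-utility itemset concrete, the following is a minimal brute-force sketch in Python. It is not FEACP or CLH-Miner and uses none of their upper bounds or pruning strategies. The toy taxonomy, profits, transactions, and all function names are hypothetical. It assumes the usual definitions: the utility of a leaf item in a transaction is quantity times unit profit, the utility of a generalized item is the sum over its descendant leaves, and an itemset may not contain an item together with one of its ancestors.

```python
from itertools import combinations

# Hypothetical toy data (not taken from the paper).
PROFIT = {"cola": 1, "juice": 2, "bread": 3}            # unit profit per leaf item
PARENT = {"cola": "drink", "juice": "drink", "bread": "bakery"}  # child -> parent
TRANSACTIONS = [                                         # leaf item -> quantity
    {"cola": 2, "bread": 1},
    {"juice": 1, "bread": 2},
    {"cola": 1, "juice": 3},
]

def ancestors(item):
    """Return every generalized item above `item` in the taxonomy."""
    found = set()
    while item in PARENT:
        item = PARENT[item]
        found.add(item)
    return found

def item_utility(item, tx):
    """Utility of a leaf or generalized item in one transaction."""
    return sum(PROFIT[leaf] * qty for leaf, qty in tx.items()
               if leaf == item or item in ancestors(leaf))

def itemset_utility(itemset, transactions):
    """Sum the utility over transactions that contain every item of the set.
    Assumes positive profits and quantities, so utility > 0 means 'present'."""
    total = 0
    for tx in transactions:
        utils = [item_utility(i, tx) for i in itemset]
        if all(u > 0 for u in utils):
            total += sum(utils)
    return total

def cross_level_huis(transactions, min_util):
    """Enumerate every valid itemset (exponential; for illustration only)."""
    items = sorted(set(PROFIT) | set(PARENT.values()))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            # Skip sets that mix an item with one of its own ancestors.
            if any(a in ancestors(b) for a in combo for b in combo):
                continue
            u = itemset_utility(combo, transactions)
            if u >= min_util:
                result[frozenset(combo)] = u
    return result

huis = cross_level_huis(TRANSACTIONS, min_util=10)
```

With a minimum utility of 10 on this toy data, neither {cola, bread} nor {juice, bread} qualifies, but the cross-level itemset {drink, bread} does. This is the kind of pattern that algorithms ignoring the taxonomy would miss.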
Pages: 41-62
Page count: 22
Related papers
50 in total
  • [1] Efficient algorithms for mining high-utility itemsets in uncertain databases
    Lin, Jerry Chun-Wei
    Gan, Wensheng
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Tseng, Vincent S.
    KNOWLEDGE-BASED SYSTEMS, 2016, 96 : 171 - 187
  • [2] Mining high-utility itemsets in dynamic profit databases
    Nguyen, Loan T. T.
    Phuc Nguyen
    Nguyen, Trinh D. D.
    Vo, Bay
    Fournier-Viger, Philippe
    Tseng, Vincent S.
    KNOWLEDGE-BASED SYSTEMS, 2019, 175 : 130 - 144
  • [3] Mining Top-K constrained cross-level high-utility itemsets over data streams
    Meng Han
    Shujuan Liu
    Zhihui Gao
    Dongliang Mu
    Ang Li
    Knowledge and Information Systems, 2024, 66 : 2885 - 2924
  • [4] Mining Top-K constrained cross-level high-utility itemsets over data streams
    Han, Meng
    Liu, Shujuan
    Gao, Zhihui
    Mu, Dongliang
    Li, Ang
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (05) : 2885 - 2924
  • [5] Efficient Mining of Uncertain Data for High-Utility Itemsets
    Lin, Jerry Chun-Wei
    Gan, Wensheng
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Tseng, Vincent S.
    WEB-AGE INFORMATION MANAGEMENT, PT I, 2016, 9658 : 17 - 30
  • [6] An efficient method for mining High-Utility itemsets from unstable negative profit databases
    Tung, N. T.
    Nguyen, Trinh D. D.
    Nguyen, Loan T. T.
    Vo, Bay
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [7] Mining On-shelf High-utility Quantitative Itemsets
    Chen, Lili
    Gan, Wensheng
    Lin, Qi
    Miao, Jinbao
    Chen, Chien-Ming
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5491 - 5500
  • [8] Efficient algorithms for mining maximal high-utility itemsets
    Nguyen, Trinh D. D.
    Quoc-Bao Vu
    Nguyen, Loan T. T.
    PROCEEDINGS OF 2019 6TH NATIONAL FOUNDATION FOR SCIENCE AND TECHNOLOGY DEVELOPMENT (NAFOSTED) CONFERENCE ON INFORMATION AND COMPUTER SCIENCE (NICS), 2019, : 428 - 433
  • [9] Efficient Mining of Short Periodic High-Utility Itemsets
    Lin, Jerry Chun-Wei
    Zhang, Jiexiong
    Fournier-Viger, Philippe
    Hong, Tzung-Pei
    Chen, Chien-Ming
    Su, Ja-Hwung
    2016 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2016, : 3083 - 3088
  • [10] OHUQI: Mining on-shelf high-utility quantitative itemsets
    Chen, Lili
    Gan, Wensheng
    Lin, Qi
    Huang, Shuqiang
    Chen, Chien-Ming
    JOURNAL OF SUPERCOMPUTING, 2022, 78 (06): : 8321 - 8345