A novel approach for hiding sensitive utility and frequent itemsets

Cited by: 7
Authors
Liu, Xuan [1]
Xu, Feng [1]
Lv, Xin [1]
Affiliations
[1] Hohai Univ, Coll Comp & Informat, Nanjing 211100, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Sensitive utility and frequent itemsets; sanitization; side effects; maximum boundary value; association rules; knowledge; algorithms
DOI
10.3233/IDA-173613
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Organizations share data for mutual benefit, and data mining techniques are used to extract knowledge that supports decision-making. However, mining shared data can also disclose sensitive information, so sensitive knowledge should be concealed before the data is released. Previous work addresses either the association rule hiding problem or the utility itemset hiding problem. This paper focuses on protecting sensitive utility and frequent itemsets and presents a sanitization approach named HUFI. Sensitive itemsets are hidden by reducing their support or utility below the minimum thresholds. For each sensitive itemset, the concept of a maximum boundary value is introduced to determine the hiding strategy. Then, a transaction that supports the minimal number of non-sensitive itemsets is selected for sanitization. In that transaction, a weight is assigned to each item contained in the sensitive itemset, and the item with the highest weight is selected for modification. HUFI is compared with state-of-the-art algorithms on various databases. The experimental results show that HUFI outperforms the other algorithms in minimizing side effects on non-sensitive knowledge and in maintaining database quality after sanitization. In addition, the impact of database density on sanitization approaches is examined.
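The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the HUFI algorithm: the abstract does not define the maximum boundary value or the item weights, so both are omitted. The sketch assumes the standard definitions of support (number of transactions containing the itemset) and utility (quantity times unit profit, summed over the supporting transactions), and it reads "hidden" as "support or utility below its threshold". All function names, the toy database, and the profit table are hypothetical.

```python
from typing import Dict, FrozenSet, List

Transaction = Dict[str, int]  # item -> purchased quantity (assumed layout)


def contains(t: Transaction, itemset: FrozenSet[str]) -> bool:
    """True if every item of the itemset occurs in the transaction."""
    return all(item in t for item in itemset)


def support(db: List[Transaction], itemset: FrozenSet[str]) -> int:
    """Number of transactions that contain the itemset."""
    return sum(1 for t in db if contains(t, itemset))


def utility(db: List[Transaction], profit: Dict[str, int],
            itemset: FrozenSet[str]) -> int:
    """Sum of quantity * unit profit over all supporting transactions."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in db if contains(t, itemset)
    )


def is_hidden(db: List[Transaction], profit: Dict[str, int],
              itemset: FrozenSet[str], min_sup: int, min_util: int) -> bool:
    """Assumed reading: hidden once support OR utility drops below threshold."""
    return (support(db, itemset) < min_sup
            or utility(db, profit, itemset) < min_util)


def pick_victim(db: List[Transaction], sensitive: FrozenSet[str],
                non_sensitive: List[FrozenSet[str]]) -> int:
    """Index of the supporting transaction that contains the fewest
    non-sensitive itemsets (ties broken by lowest index)."""
    candidates = [i for i, t in enumerate(db) if contains(t, sensitive)]
    return min(
        candidates,
        key=lambda i: sum(1 for s in non_sensitive if contains(db[i], s)),
    )


# Hypothetical toy data, not taken from the paper.
db = [
    {"a": 2, "b": 1, "c": 3},
    {"a": 1, "b": 2},
    {"b": 4, "c": 1},
    {"a": 3, "b": 1, "d": 2},
]
profit = {"a": 5, "b": 2, "c": 1, "d": 3}
sensitive = frozenset({"a", "b"})
non_sensitive = [frozenset({"b", "c"}), frozenset({"a", "d"}),
                 frozenset({"a", "c"})]

victim = pick_victim(db, sensitive, non_sensitive)
```

In this toy database the second transaction is chosen because it supports the sensitive itemset and no non-sensitive itemset, so modifying it causes the fewest side effects under the assumed criterion.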
Pages: 1259-1278
Page count: 20