A novel approach for hiding sensitive utility and frequent itemsets

Cited by: 7

Authors:

Liu, Xuan [1]

Xu, Feng [1]

Lv, Xin [1]

Affiliation:

[1] Hohai Univ, Coll Comp & Informat, Nanjing 211100, Jiangsu, Peoples R China

Source:

INTELLIGENT DATA ANALYSIS | 2018, Vol. 22, No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Sensitive utility and frequent itemsets; sanitization; side effects; maximum boundary value; ASSOCIATION RULES; KNOWLEDGE; ALGORITHMS;

DOI:

10.3233/IDA-173613

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Data is shared among different organizations for mutual benefit. Data mining techniques are utilized to discover valuable knowledge for decision-making. However, data mining poses a risk of disclosing sensitive information. Thus, sensitive knowledge should be concealed before data is released. Previous works address either the association rule hiding problem or the utility itemset hiding problem. This paper focuses on preserving the sensitive utility and frequent itemsets, and a sanitization approach named HUFI is presented. The sensitive itemsets are hidden by reducing their support or utility below the minimum thresholds. For a sensitive itemset, the concept of maximum boundary value is introduced to determine the hiding strategy. Then, a transaction supporting the minimal number of non-sensitive itemsets is selected to be sanitized. In such a transaction, a weight is assigned to each item contained in the sensitive itemset, and an item with the highest weight is selected to be modified. We compared HUFI with state-of-the-art algorithms on various databases. The experimental results show that HUFI outperforms the other algorithms in minimizing the side effects on non-sensitive knowledge and maintaining the database quality after the sanitization process. In addition, the impact of database density on sanitization approaches is observed.
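The hiding criterion and the transaction selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the HUFI implementation. It uses the standard definitions of support (the number of transactions containing an itemset) and utility (quantity times unit profit, summed over the transactions containing the itemset). It also assumes that an itemset remains exposed only while both values meet their thresholds. The function names, the `Transaction` representation, and the `profit` table are hypothetical. The maximum boundary value and the item weights are not defined in the abstract, so they are left out.

```python
from typing import Dict, FrozenSet, List

# Hypothetical representation: a transaction maps each item to its quantity.
Transaction = Dict[str, int]
Itemset = FrozenSet[str]


def support(itemset: Itemset, db: List[Transaction]) -> int:
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if itemset.issubset(t))


def utility(itemset: Itemset, db: List[Transaction],
            profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in db
        if itemset.issubset(t)
    )


def is_hidden(itemset: Itemset, db: List[Transaction], profit: Dict[str, int],
              min_sup: int, min_util: int) -> bool:
    """Assumption: the itemset is hidden once support OR utility drops below its threshold."""
    return (support(itemset, db) < min_sup
            or utility(itemset, db, profit) < min_util)


def pick_transaction(sensitive: Itemset, non_sensitive: List[Itemset],
                     db: List[Transaction]) -> Transaction:
    """Among transactions containing the sensitive itemset, return the one
    that supports the fewest non-sensitive itemsets."""
    candidates = [t for t in db if sensitive.issubset(t)]
    return min(
        candidates,
        key=lambda t: sum(1 for n in non_sensitive if n.issubset(t)),
    )
```

Modifying the transaction returned by `pick_transaction` disturbs the fewest non-sensitive itemsets, which is the intuition behind the side-effect minimization the abstract describes.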


Pages: 1259-1278

Page count: 20
