A hybrid multi-group approach for privacy-preserving data mining

Cited by: 17
Authors
Teng, Zhouxuan [1]
Du, Wenliang [1]
Affiliation
[1] Syracuse Univ, Dept Elect Engn & Comp Sci, Syracuse, NY 13244 USA
Keywords
Privacy; SMC; Randomization; Hybrid
DOI
10.1007/s10115-008-0158-y
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a hybrid multi-group approach for privacy-preserving data mining and make two contributions. First, we propose a hybrid approach. Previous work has used either the randomization approach or the secure multi-party computation (SMC) approach. These two approaches have complementary strengths: randomization is much more efficient but less accurate, while SMC is less efficient but more accurate. Our hybrid approach combines the strengths of both to balance accuracy and efficiency: it achieves much better accuracy than the randomization approach at a much lower computation cost than the SMC approach. Second, we propose a multi-group scheme that lets the data miner flexibly control the balance between data mining accuracy and privacy. This scheme is motivated by the fact that existing randomization schemes, which randomize data at the level of individual attributes, can produce insufficient accuracy when the number of dimensions is high. We partition attributes into groups and develop a group-based randomization scheme to achieve better data mining accuracy. To demonstrate the effectiveness of the proposed general schemes, we have implemented them for the ID3 decision tree algorithm and the association rule mining problem, and we present experimental results.
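The abstract contrasts randomizing each attribute on its own with randomizing a group of attributes together. The following Python sketch illustrates that contrast with a generic Warner-style randomized response on binary attributes. It is not the scheme from the paper: the retention probability `theta`, the "flip the whole group" rule, and all function names are assumptions chosen for illustration only, and the SMC half of the hybrid approach is not covered.

```python
import random


def randomize_bit(bit, theta, rng):
    """Attribute-level randomization: keep the bit with probability theta, flip it otherwise."""
    return bit if rng.random() < theta else 1 - bit


def randomize_group(bits, theta, rng):
    """Group-level randomization (illustrative assumption): keep the whole
    tuple with probability theta, otherwise flip every bit in the group."""
    return tuple(bits) if rng.random() < theta else tuple(1 - b for b in bits)


def estimate_proportion(observed, theta):
    """Recover P(bit = 1) from the observed fraction of ones.

    Inverts observed = theta * p + (1 - theta) * (1 - p).
    """
    if theta == 0.5:
        raise ValueError("theta = 0.5 destroys all information")
    return (observed - (1 - theta)) / (2 * theta - 1)


def estimate_joint(obs_pattern, obs_complement, theta):
    """Recover the true frequency of a bit pattern under group-level flipping.

    obs_pattern and obs_complement are the observed frequencies of the
    pattern and of its bitwise complement in the randomized data.
    """
    if theta == 0.5:
        raise ValueError("theta = 0.5 destroys all information")
    return (theta * obs_pattern - (1 - theta) * obs_complement) / (2 * theta - 1)
```

Under these assumptions, a joint frequency such as P(a = 1, b = 1) is recovered from two observed frequencies when the pair is randomized as one group. When each attribute is flipped independently, the same joint frequency requires inverting a randomization matrix that grows with the number of attributes, which is one informal way to see why attribute-level schemes lose accuracy in high dimensions.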
Pages: 133-157
Page count: 25