Privacy-preserving data (stream) mining techniques and their impact on data mining accuracy: a systematic literature review

Cited by: 0
Authors
U. H. W. A. Hewage
R. Sinha
M. Asif Naeem
Affiliations
[1] Auckland University of Technology,School of Engineering Computer and Mathematical Sciences
[2] National University of Computer and Emerging Sciences,Department of Computer Science
Keywords
Privacy-preserving data mining; Data streams; Accuracy-privacy trade-off; Data privacy
DOI
Not available
Abstract
This study investigates existing input privacy-preserving data mining (PPDM) methods and privacy-preserving data stream mining (PPDSM) methods, including their strengths and weaknesses. A further analysis was carried out to determine the extent to which existing PPDM/PPDSM methods address the trade-off between data mining accuracy and data privacy, a significant concern in the area. The systematic literature review was conducted using data extracted from 104 primary studies from 5 reputable databases. The scope of the study was defined using three research questions and appropriate inclusion and exclusion criteria. Based on the results, we divided existing PPDM methods into four categories: perturbation, non-perturbation, secure multi-party computation, and combinations of PPDM methods. These methods have different strengths and weaknesses with respect to accuracy, privacy, time consumption, and other factors. Data stream mining faces additional challenges such as high volume, high speed, and computational complexity. Fewer techniques have been proposed for PPDSM than for PPDM. We grouped PPDSM techniques into three categories: perturbation, non-perturbation, and other. Most PPDM methods can be applied to classification, followed by clustering and association rule mining. Numerous studies have identified and discussed the accuracy-privacy trade-off. However, few studies provide solutions to the issue, especially in PPDSM.
Pages: 10427-10464
Page count: 37