A hybrid approach for scalable sub-tree anonymization over big data using Map Reduce on cloud

Cited by: 72
Authors
Zhang, Xuyun [1 ]
Liu, Chang [1 ]
Nepal, Surya [2 ]
Yang, Chi [1 ]
Dou, Wanchun [3 ]
Chen, Jinjun [1 ]
Affiliations
[1] Univ Technol Sydney, Fac Engn & Informat Technol, Broadway, NSW 2007, Australia
[2] CSIRO, Ctr Informat & Commun Technol, Marsfield, NSW 2122, Australia
[3] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210093, Jiangsu, Peoples R China
Keywords
Big data; Cloud computing; Data anonymization; Privacy preservation; MapReduce
DOI
10.1016/j.jcss.2014.02.007
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In big data applications, data privacy is one of the greatest concerns because processing large-scale privacy-sensitive data sets often requires computation resources provisioned by public cloud services. Sub-tree data anonymization is a widely adopted scheme to anonymize data sets for privacy preservation. Top-Down Specialization (TDS) and Bottom-Up Generalization (BUG) are two ways to fulfill sub-tree anonymization. However, existing approaches for sub-tree anonymization lack parallelization capability and therefore do not scale to big data in the cloud. Moreover, TDS and BUG each suffer from poor performance for certain values of the k-anonymity parameter. In this paper, we propose a hybrid approach that combines TDS and BUG for efficient sub-tree anonymization over big data. Further, we design MapReduce algorithms for the two components (TDS and BUG) to gain high scalability. Experimental evaluation demonstrates that the hybrid approach significantly improves the scalability and efficiency of the sub-tree anonymization scheme over existing approaches. © 2014 Elsevier Inc. All rights reserved.
Pages: 1008-1020
Page count: 13
Related papers
50 in total
  • [21] Addressing Big Data Problem Using Hadoop and Map Reduce
    Patel, Aditya B.
    Birla, Manashvi
    Nair, Ushma
    3RD NIRMA UNIVERSITY INTERNATIONAL CONFERENCE ON ENGINEERING (NUICONE 2012), 2012
  • [22] Privacy Preserving with Modified Grey Wolf Optimization Over Big Data Using Optimal K Anonymization Approach
    Kumar, S. Sai
    Reddy, Anumala Reethika
    Krishna, B. Sivarama
    Rao, J. Nageswara
    Kiran, Ajmeera
    JOURNAL OF INTERCONNECTION NETWORKS, 2022, 22 (SUPP01)
  • [23] Restricted sub-tree learning to estimate an optimal dynamic treatment regime using observational data
    Speth, Kelly
    Wang, Lu
    STATISTICS IN MEDICINE, 2021, 40 (26) : 5796 - 5812
  • [24] A Fast Map-Reduce Algorithm for Burst Errors in Big Data Cloud Storage
    Qin, Xue
    Kelley, Brian
    Saedy, Mahdy
    2015 10TH SYSTEM OF SYSTEMS ENGINEERING CONFERENCE (SOSE), 2015, : 398 - 403
  • [25] Implementation of Image Processing System using Handover Technique with Map Reduce Based on Big Data in the Cloud Environment
    Ali, Mehraj
    Kumar, John
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2016, 13 (02) : 326 - 331
  • [26] Data anonymisation of vertically partitioned data using Map Reduce techniques on cloud
    Kalidoss, Thangaramya
    Sannasi, Ganapathy
    Lakshmanan, Sairamesh
    Kanagasabai, Kulothungan
    Kannan, Arputharaj
    INTERNATIONAL JOURNAL OF COMMUNICATION NETWORKS AND DISTRIBUTED SYSTEMS, 2018, 20 (04) : 519 - 531
  • [27] Scalable Local-Recoding Anonymization using Locality Sensitive Hashing for Big Data Privacy Preservation
    Zhang, Xuyun
    Leckie, Christopher
    Dou, Wanchun
    Chen, Jinjun
    Kotagiri, Ramamohanarao
    Salcic, Zoran
    CIKM'16: PROCEEDINGS OF THE 2016 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2016, : 1793 - 1802
  • [28] Data Analyzing Using Map-Join-Reduce in Cloud Storage
    Bhardwaj, Ruchi
    Mishra, Neetesh
    Kumar, Rajiv
    2014 INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2014, : 370 - 373
  • [29] CloudProteoAnalyzer: scalable processing of big data from proteomics using cloud computing
    Li, Jiancheng
    Xiong, Yi
    Feng, Shichao
    Pan, Chongle
    Guo, Xuan
    BIOINFORMATICS ADVANCES, 2024, 4 (01)
  • [30] Privacy Preserving Big Data Publication On Cloud Using Mondrian Anonymization Techniques and Deep Neural Networks
    Andrew, J.
    Karthikeyan, J.
    Jebastin, Jeffy
    2019 5TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING & COMMUNICATION SYSTEMS (ICACCS), 2019, : 722 - 727