A hybrid approach for scalable sub-tree anonymization over big data using MapReduce on cloud

Cited by: 72
Authors
Zhang, Xuyun [1 ]
Liu, Chang [1 ]
Nepal, Surya [2 ]
Yang, Chi [1 ]
Dou, Wanchun [3 ]
Chen, Jinjun [1 ]
Affiliations
[1] Univ Technol Sydney, Fac Engn & Informat Technol, Broadway, NSW 2007, Australia
[2] CSIRO, Ctr Informat & Commun Technol, Marsfield, NSW 2122, Australia
[3] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210093, Jiangsu, Peoples R China
Keywords
Big data; Cloud computing; Data anonymization; Privacy preservation; MapReduce
DOI
10.1016/j.jcss.2014.02.007
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In big data applications, data privacy is one of the most pressing concerns because processing large-scale privacy-sensitive data sets often requires computation resources provisioned by public cloud services. Sub-tree data anonymization is a widely adopted scheme for anonymizing data sets to preserve privacy. Top-Down Specialization (TDS) and Bottom-Up Generalization (BUG) are two ways to achieve sub-tree anonymization. However, existing approaches to sub-tree anonymization lack parallelization capability and therefore do not scale to big data in the cloud. Moreover, TDS and BUG each perform poorly on their own for certain values of the k-anonymity parameter. In this paper, we propose a hybrid approach that combines TDS and BUG for efficient sub-tree anonymization over big data. Further, we design MapReduce algorithms for the two components (TDS and BUG) to achieve high scalability. Experimental evaluation demonstrates that the hybrid approach significantly improves the scalability and efficiency of the sub-tree anonymization scheme over existing approaches. (c) 2014 Elsevier Inc. All rights reserved.
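As a rough illustration of the bottom-up direction mentioned in the abstract, the sketch below generalizes a single quasi-identifier column until it is k-anonymous. It is not the paper's algorithm: the taxonomy `PARENT`, the function names, and the greedy "generalize the rarest value first" rule are all assumptions made for this example, and it omits both the information-loss metric and the MapReduce parallelization that the paper addresses.

```python
from collections import Counter

# Hypothetical taxonomy tree (child -> parent), invented for illustration.
PARENT = {
    "Engineer": "Professional",
    "Lawyer": "Professional",
    "Dancer": "Artist",
    "Writer": "Artist",
    "Professional": "Any",
    "Artist": "Any",
}


def ancestors(value):
    """Yield the ancestors of a value, walking up toward the root."""
    while value in PARENT:
        value = PARENT[value]
        yield value


def is_k_anonymous(values, k):
    """True if every distinct value occurs at least k times."""
    return all(count >= k for count in Counter(values).values())


def generalize_subtree(values, node):
    """Sub-tree generalization: replace every descendant of `node` with `node`."""
    return [node if node in ancestors(v) else v for v in values]


def bottom_up_generalize(values, k):
    """Greedy sketch (an assumption, not the paper's search strategy):
    generalize the sub-tree containing the rarest value until k-anonymous."""
    values = list(values)
    while not is_k_anonymous(values, k):
        counts = Counter(values)
        # Ties are broken alphabetically so the result is deterministic.
        rarest = min(counts, key=lambda v: (counts[v], v))
        if rarest not in PARENT:
            raise ValueError("cannot reach k-anonymity even at the taxonomy root")
        values = generalize_subtree(values, PARENT[rarest])
    return values


records = ["Engineer", "Engineer", "Lawyer", "Dancer", "Writer", "Writer"]
print(bottom_up_generalize(records, 2))
```

Top-down specialization would run the other way, starting from the root value for every record and splitting a node into its children only while the result stays k-anonymous.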
Pages: 1008-1020
Page count: 13
Related papers
50 in total
  • [11] A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud
    Zhang, Xuyun
    Yang, Laurence T.
    Liu, Chang
    Chen, Jinjun
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2014, 25 (02) : 363 - 373
  • [12] Unstructured Data Analysis on Big Data using Map Reduce
    Subramaniyaswamy, V
    Vijayakumar, V.
    Logesh, R.
    Indragandhi, V
    BIG DATA, CLOUD AND COMPUTING CHALLENGES, 2015, 50 : 456 - 465
  • [13] Empirical Evaluation of Map Reduce Based Hybrid Approach for Problem of Imbalanced Classification in Big Data
    Ahlawat, Khyati
    Chug, Anuradha
    Singh, Amit Prakash
    INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2019, 11 (03) : 23 - 45
  • [14] CLASSIFICATION ALGORITHMS FOR BIG DATA ANALYSIS, A MAP REDUCE APPROACH
    Ayma, V. A.
    Ferreira, R. S.
    Happ, P.
    Oliveira, D.
    Feitosaa, R.
    Costa, G.
    Plaza, A.
    Gamba, P.
    PIA15+HRIGI15 - JOINT ISPRS CONFERENCE, VOL. I, 2015, 40-3 (W2) : 17 - 21
  • [15] Improved l-diversity: Scalable anonymization approach for Privacy Preserving Big Data Publishing
    Mehta, Brijesh B.
    Rao, Udai Pratap
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (04) : 1423 - 1430
  • [16] Optimizing healthcare big data privacy with scalable subtree-based L-Anonymization in cloud environments
    Aravindhraj Natarajan
    N. Shanthi
    Wireless Networks, 2025, 31 (3) : 2727 - 2742
  • [17] Hybrid classifier model for big data by leveraging map reduce framework
    Sitharamulu, V.
    Prasad, K. Rajendra
    Reddy, K. Sudheer
    Prasad, A. V. Krishna
    Dass, M. Venkat
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2024, 16 (01) : 23 - 48
  • [18] Proximity-Aware Local-Recoding Anonymization with MapReduce for Scalable Big Data Privacy Preservation in Cloud
    Zhang, Xuyun
    Dou, Wanchun
    Pei, Jian
    Nepal, Surya
    Yang, Chi
    Liu, Chang
    Chen, Jinjun
    IEEE TRANSACTIONS ON COMPUTERS, 2015, 64 (08) : 2293 - 2307
  • [19] Privacy Preserving Big data Using Combine Anonymization and Encryption Approach
    Desai, Vidhi
    Chauhan, Gargi K.
    2019 INNOVATIONS IN POWER AND ADVANCED COMPUTING TECHNOLOGIES (I-PACT), 2019,
  • [20] Handling Big Data Efficiently by using Map Reduce Technique
    Maitrey, Seema
    Jha, C. K.
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION TECHNOLOGY CICT 2015, 2015 : 703 - 708