A hybrid approach for scalable sub-tree anonymization over big data using MapReduce on cloud

Cited by: 72
Authors
Zhang, Xuyun [1 ]
Liu, Chang [1 ]
Nepal, Surya [2 ]
Yang, Chi [1 ]
Dou, Wanchun [3 ]
Chen, Jinjun [1 ]
Affiliations
[1] Univ Technol Sydney, Fac Engn & Informat Technol, Broadway, NSW 2007, Australia
[2] CSIRO, Ctr Informat & Commun Technol, Marsfield, NSW 2122, Australia
[3] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210093, Jiangsu, Peoples R China
Keywords
Big data; Cloud computing; Data anonymization; Privacy preservation; MapReduce;
DOI
10.1016/j.jcss.2014.02.007
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
In big data applications, data privacy is one of the most pressing concerns because processing large-scale privacy-sensitive data sets often requires computation resources provisioned by public cloud services. Sub-tree data anonymization is a widely adopted scheme for anonymizing data sets to preserve privacy. Top-Down Specialization (TDS) and Bottom-Up Generalization (BUG) are two ways to achieve sub-tree anonymization. However, existing approaches to sub-tree anonymization lack parallelization capability and therefore do not scale to big data in the cloud. Moreover, TDS and BUG each perform poorly for certain values of the k-anonymity parameter. In this paper, we propose a hybrid approach that combines TDS and BUG for efficient sub-tree anonymization over big data. Further, we design MapReduce algorithms for the two components (TDS and BUG) to gain high scalability. Experimental evaluation demonstrates that the hybrid approach significantly improves the scalability and efficiency of the sub-tree anonymization scheme over existing approaches. © 2014 Elsevier Inc. All rights reserved.
Pages: 1008-1020
Number of pages: 13
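
The ingredients named in the abstract (sub-tree generalization over a taxonomy, a k-anonymity check, and a map/reduce counting step) can be illustrated with a minimal sketch. This is not the paper's algorithm: the taxonomy, the single-attribute records, and every function name below are hypothetical, and the map and reduce steps run in one process rather than on a MapReduce cluster. A "cut" is the set of taxonomy nodes that values are currently generalized to; TDS would start from the root cut and specialize, while BUG would start from the leaves and generalize.

```python
from collections import Counter

# Hypothetical taxonomy for a single quasi-identifier: child -> parent.
TAXONOMY = {
    "Engineer": "Professional",
    "Lawyer": "Professional",
    "Dancer": "Artist",
    "Writer": "Artist",
    "Professional": "Any",
    "Artist": "Any",
}


def generalize(value, cut):
    """Walk up the taxonomy until a node in the current cut is reached."""
    while value not in cut:
        value = TAXONOMY[value]  # raises KeyError if the cut does not cover value
    return value


def map_record(record, cut):
    """Map step: emit (generalized value, 1) for one record."""
    return (generalize(record, cut), 1)


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each generalized value."""
    counts = Counter()
    for key, n in pairs:
        counts[key] += n
    return counts


def is_k_anonymous(records, cut, k):
    """True if every generalized value is shared by at least k records."""
    counts = reduce_counts(map_record(r, cut) for r in records)
    return all(n >= k for n in counts.values())


if __name__ == "__main__":
    records = ["Engineer", "Engineer", "Lawyer", "Dancer", "Writer", "Writer"]
    # Only the "Artist" sub-tree is generalized; "Lawyer" still appears once.
    print(is_k_anonymous(records, {"Engineer", "Lawyer", "Artist"}, 2))
    # Both sub-trees are generalized; each group now holds three records.
    print(is_k_anonymous(records, {"Professional", "Artist"}, 2))
```

In this toy setting, a top-down search would try to specialize a cut node while the check still passes, and a bottom-up search would generalize sub-trees until it passes.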