A Top-Down k-Anonymization Implementation for Apache Spark

Cited by: 0
Authors
Sopaoglu, Ugur [1]
Abul, Osman [1]
Affiliation
[1] TOBB Univ Econ & Technol, Dept Comp Engn, Ankara, Turkey
Keywords
k-anonymity; top-down specialization; big data; Hadoop MapReduce; Apache Spark; anonymity
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data science continues to evolve and adapts to the exponentially increasing amount of data. This progress makes it easier to extract meaningful information from huge volumes of data in various domains, including personal data, public health, micro-blogging, and sensors. The ability to process huge volumes of data and extract valuable information sometimes scares people, especially when sensitive personal data is concerned. Many privacy-preserving techniques have been developed to address these fears. Over the years, these techniques have been adapted to handle emerging types and increasing volumes of data. For instance, coping with today's big data requires more scalable and efficient methods, and big data platforms such as Apache Hadoop and Apache Spark are widely used for this purpose. In this paper, we study the k-anonymization problem in the context of big data and develop a top-down specialization anonymization solution for the Apache Spark platform. An extensive experimental evaluation has been carried out, and the efficiency results are presented.
Pages: 4513-4521
Page count: 9