Big Data Anonymization with Spark

Cited by: 0
Authors
Canbay, Yavuz [1]
Sagiroglu, Seref [1]
Affiliations
[1] Gazi Univ, Fac Engn, Dept Comp Engn, Ankara, Turkey
Keywords
big data; anonymization; privacy preserving; Hadoop; Spark; model; review; PRIVACY
DOI
Not available
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Privacy is an important issue for big data that includes sensitive attributes. Sharing or publishing such data directly causes privacy breaches. To overcome this problem, previous studies focused on developing big data anonymization techniques in the Hadoop environment. Compared to Hadoop, Spark makes it possible to develop faster applications by keeping data in memory instead of on hard disk. Although a number of projects were developed on Hadoop, the trend is now shifting to Spark. In addition, the problem of anonymizing big data streams for real-time applications can be solved with Spark. In short, Spark is the main technology that facilitates developing both faster anonymization applications and big data stream anonymization solutions. In this study, anonymization techniques, big data technologies, and privacy-preserving big data publishing were reviewed, and a big data anonymization model based on Spark was proposed for the first time. The proposed model is expected to help researchers solve big data privacy issues and to provide solutions for new-generation privacy violation problems.
Pages: 833 - 838
Number of pages: 6
Related papers
50 in total
  • [1] Big Data Privacy and Anonymization
    Torra, Vicenc
    Navarro-Arribas, Guillermo
    [J]. PRIVACY AND IDENTITY MANAGEMENT: FACING UP TO NEXT STEPS, 2016, 498 : 15 - 26
  • [2] Anonymization in the Time of Big Data
    Domingo-Ferrer, Josep
    Soria-Comas, Jordi
    [J]. PRIVACY IN STATISTICAL DATABASES: UNESCO CHAIR IN DATA PRIVACY, 2016, 9867 : 57 - 68
  • [3] Anonymization in the time of big data
    Domingo-Ferrer, Josep
    Soria-Comas, Jordi
    [J]. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, 9867 LNCS : 57 - 68
  • [4] Efficient multimedia big data anonymization
    Jang, Sung-Bong
    Ko, Young-Woong
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (17) : 17855 - 17872
  • [5] In-Situ Anonymization of Big Data
    Krizan, Tomislav
    Brakus, Marko
    Vukelic, Davorin
    [J]. 2015 8TH INTERNATIONAL CONVENTION ON INFORMATION AND COMMUNICATION TECHNOLOGY, ELECTRONICS AND MICROELECTRONICS (MIPRO), 2015, : 292 - 298
  • [6] Efficient multimedia big data anonymization
    Sung-Bong Jang
    Young-Woong Ko
    [J]. Multimedia Tools and Applications, 2017, 76 : 17855 - 17872
  • [7] Personal Big Data, GDPR and Anonymization
    Domingo-Ferrer, Josep
    [J]. FLEXIBLE QUERY ANSWERING SYSTEMS, 2019, 11529 : 7 - 10
  • [8] Data anonymization evaluation for big data and IoT environment
    Ni, Chunchun
    Cang, Li Shan
    Gope, Prosanta
    Min, Geyong
    [J]. INFORMATION SCIENCES, 2022, 605 : 381 - 392
  • [9] A Study of Performance Enhancement in Big Data Anonymization
    Jang, Sung-Bong
    [J]. 2017 4TH INTERNATIONAL CONFERENCE ON COMPUTER APPLICATIONS AND INFORMATION PROCESSING TECHNOLOGY (CAIPT), 2017,
  • [10] A Clustering Based Anonymization Model for Big Data
    Canbay, Yavuz
    Kalyoncu, Aydincan
    Ercimen, Mucahid
    Dogan, Adem
    Sagiroglu, Seref
    [J]. 2019 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK), 2019, : 720 - 725