Private Search Over Big Data Leveraging Distributed File System and Parallel Processing

Cited by: 0
Authors
Selcuk, Ayse [1]
Orencik, Cengiz [1]
Savas, Erkay [1]
Affiliations
[1] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
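Keywords
Cloud computing; Big Data; Keyword Search; Privacy; Hadoop; Encryption; MapReduce
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812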
Abstract
In this work, we identify the security and privacy problems associated with a particular Big Data application, namely secure keyword-based search over encrypted cloud data, and emphasize the actual challenges and technical difficulties in the Big Data setting. More specifically, we provide definitions from which privacy requirements can be derived. In addition, we adapt an existing privacy-preserving keyword-based search method to the Big Data setting, in which the data is not only huge but also changing and accumulating very fast. Our proposal is scalable in the sense that it can leverage distributed file systems and parallel programming techniques, such as the Hadoop Distributed File System (HDFS) and the MapReduce programming model, to work with very large data sets. We also propose a lazy method for updating inverse document frequency (idf) values that can efficiently handle the relevancy scores of the documents in a dynamically changing, large data set. We empirically show the efficiency and accuracy of the method through an extensive set of experiments on real data.
Pages: 116-121
Number of pages: 6