Private Search Over Big Data Leveraging Distributed File System and Parallel Processing

Cited by: 0
Authors
Selcuk, Ayse [1]
Orencik, Cengiz [1]
Savas, Erkay [1]
Affiliations
[1] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
Keywords
Cloud computing; Big Data; Keyword Search; Privacy; Hadoop; Encryption; MapReduce
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this work, we identify the security and privacy problems associated with a particular Big Data application, namely secure keyword-based search over encrypted cloud data, and emphasize the actual challenges and technical difficulties in the Big Data setting. More specifically, we provide definitions from which privacy requirements can be derived. In addition, we adapt an existing privacy-preserving keyword-based search method to the Big Data setting, in which the data is not only huge but also changing and accumulating very fast. Our proposal is scalable in the sense that it can leverage distributed file systems and parallel programming techniques, such as the Hadoop Distributed File System (HDFS) and the MapReduce programming model, to work with very large data sets. We also propose a lazy idf-updating method that can efficiently handle the relevancy scores of the documents in a dynamically changing, large data set. We empirically show the efficiency and accuracy of the method through an extensive set of experiments on real data.
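The abstract does not spell out how the lazy idf-updating method works, so the following is only an illustrative sketch of the general idea of deferring idf recomputation, not the authors' scheme. It assumes the common definition idf(k) = log(N / df(k)) and a hypothetical rule that cached idf values are refreshed only once the collection has grown by a chosen fraction since the last refresh. The class name LazyIdfIndex, the refresh_ratio parameter, and the scoring function are all assumptions made for this example, and the encryption layer of the actual search scheme is omitted entirely.

```python
import math
from collections import Counter


class LazyIdfIndex:
    """Hypothetical plaintext index that refreshes idf values lazily."""

    def __init__(self, refresh_ratio=0.1):
        # Assumed threshold: refresh once the corpus has grown by this fraction.
        self.refresh_ratio = refresh_ratio
        self.doc_count = 0
        self.doc_freq = Counter()  # keyword -> number of documents containing it
        self.idf = {}              # cached idf values, possibly stale
        self.docs_at_refresh = 0   # corpus size at the last idf recomputation

    def add_document(self, keywords):
        """Register a document; recompute idf only if enough documents arrived."""
        self.doc_count += 1
        self.doc_freq.update(set(keywords))
        grown = self.doc_count - self.docs_at_refresh
        if self.docs_at_refresh == 0 or grown >= self.refresh_ratio * self.docs_at_refresh:
            self.refresh()

    def refresh(self):
        """Recompute every cached idf value from the current document frequencies."""
        self.idf = {k: math.log(self.doc_count / df) for k, df in self.doc_freq.items()}
        self.docs_at_refresh = self.doc_count

    def score(self, term_freqs, query):
        """tf-idf relevancy of one document, using the cached idf values."""
        return sum(term_freqs.get(k, 0) * self.idf.get(k, 0.0) for k in query)
```

With a scheme of this kind, most insertions only touch the document-frequency counters, and relevancy scores are computed from slightly stale idf values until the next refresh. The trade-off between staleness and recomputation cost is governed here by the assumed refresh_ratio.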
Pages: 116-121
Number of pages: 6