A Study of Resilient Distributed Datasets for Big Data System

Authors
Kim, Da-yeon [1]
Shin, Dong-ryeol [1]
Affiliation
[1] Sungkyunkwan Univ, Coll Informat & Commun Engn, Suwon, South Korea
Keywords
Big data software platform; Hadoop ecosystem; Bigdata service;
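DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract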
In this paper, we present the Resilient Distributed Dataset (RDD) abstraction, on which the rest of the dissertation builds a general-purpose cluster computing stack. RDDs extend the data flow programming model introduced by MapReduce, which is the most widely used model for large-scale data analysis today. We propose a new abstraction, called resilient distributed datasets, that gives users direct control of data sharing. RDDs are fault-tolerant, parallel data structures that let users explicitly store data on disk or in memory, control its partitioning, and manipulate it using a rich set of operators. They offer a simple and efficient programming interface that can capture both current specialized models and new applications.
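The properties named in the abstract (explicit persistence, user-controlled partitioning, a set of operators, and fault tolerance) can be illustrated with a minimal single-process sketch. Everything here is hypothetical: `ToyRDD`, `parallelize`, `persist`, and `drop_partition` are illustrative names, not the paper's implementation or any library's API. The sketch assumes fault tolerance is achieved by recomputing a lost partition from the function that produced it, and it runs on one machine rather than a cluster.

```python
from typing import Callable, Iterable, List, Optional


class ToyRDD:
    """Toy stand-in for an RDD: a partitioned, lazily evaluated dataset.

    Hypothetical illustration only; this is not a real cluster API.
    """

    def __init__(self, compute: Callable[[int], List], num_partitions: int):
        self._compute = compute              # lineage: how to rebuild partition i
        self.num_partitions = num_partitions
        self._cache: Optional[dict] = None   # used only after persist()

    @staticmethod
    def parallelize(data: Iterable, num_partitions: int) -> "ToyRDD":
        items = list(data)
        # Round-robin split, so the caller decides the partition count.
        parts = [items[i::num_partitions] for i in range(num_partitions)]
        return ToyRDD(lambda i: list(parts[i]), num_partitions)

    def map(self, fn: Callable) -> "ToyRDD":
        # Lazy: nothing is computed until a partition is requested.
        return ToyRDD(lambda i: [fn(x) for x in self.partition(i)],
                      self.num_partitions)

    def filter(self, pred: Callable) -> "ToyRDD":
        return ToyRDD(lambda i: [x for x in self.partition(i) if pred(x)],
                      self.num_partitions)

    def persist(self) -> "ToyRDD":
        # Explicitly ask for computed partitions to be kept in memory.
        if self._cache is None:
            self._cache = {}
        return self

    def partition(self, i: int) -> List:
        if self._cache is None:
            return self._compute(i)
        if i not in self._cache:
            # Cache miss (first use or lost partition): rebuild from lineage.
            self._cache[i] = self._compute(i)
        return self._cache[i]

    def drop_partition(self, i: int) -> None:
        """Simulate losing one cached partition, e.g. after a node failure."""
        if self._cache is not None:
            self._cache.pop(i, None)

    def collect(self) -> List:
        return [x for i in range(self.num_partitions) for x in self.partition(i)]


squares = ToyRDD.parallelize(range(10), num_partitions=3).map(lambda x: x * x).persist()
evens = squares.filter(lambda x: x % 2 == 0)

before = evens.collect()
squares.drop_partition(1)   # lose a cached partition
after = evens.collect()     # the lost partition is recomputed on demand
```

In this sketch, `before` and `after` hold the same elements even though a cached partition was discarded in between, because each dataset remembers how to recompute its partitions rather than relying on a stored copy.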
Pages: 290-293
Page count: 4