Integration of Large-Scale Data Processing Systems and Traditional Parallel Database Technology

Authors
Abouzied, Azza [1]
Abadi, Daniel J. [2]
Bajda-Pawlikowski, Kamil [3]
Silberschatz, Avi [4]
Affiliations
[1] New York Univ Abu Dhabi, Abu Dhabi, U Arab Emirates
[2] Univ Maryland, College Pk, MD 20742 USA
[3] Starburst Data, Boston, MA USA
[4] Yale Univ, New Haven, CT 06520 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2019 / Volume 12 / Issue 12
Funding
U.S. National Science Foundation
DOI
10.14778/3352063.3352145
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In 2009, we explored the feasibility of building a hybrid SQL data analysis system that takes the best features from two competing technologies: large-scale data processing systems (such as Google MapReduce and Apache Hadoop) and parallel database management systems (such as Greenplum and Vertica). We built a prototype, HadoopDB, and demonstrated that it can deliver the high SQL query performance and efficiency of parallel database management systems while still providing the scalability, fault tolerance, and flexibility of large-scale data processing systems. Subsequently, HadoopDB grew into a commercial product, Hadapt, whose technology was eventually acquired by Teradata. In this paper, we provide an overview of HadoopDB's original design and its evolution over the subsequent ten years of research and development. We describe how the project innovated both in the research lab and as a commercial product at Hadapt and Teradata. We then discuss the current vibrant ecosystem of software projects (most of which are open source) that continued HadoopDB's legacy of implementing a systems-level integration of large-scale data processing systems and parallel database technology.
Pages: 2290-2299
Page count: 10