BOOM Analytics: Exploring Data-Centric, Declarative Programming for the Cloud

Authors
Alvaro, Peter [1]
Condie, Tyson [1]
Conway, Neil [1]
Elmeleegy, Khaled
Hellerstein, Joseph M. [1]
Sears, Russell [1]
Affiliations
[1] UC Berkeley, Berkeley, CA 94720 USA
Keywords
Cloud Computing; Datalog; MapReduce
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Building and debugging distributed software remains extremely difficult. We conjecture that by adopting a data-centric approach to system design and by employing declarative programming languages, a broad range of distributed software can be recast naturally in a data-parallel programming model. Our hope is that this model can significantly raise the level of abstraction for programmers, improving code simplicity, speed of development, ease of software evolution, and program correctness. This paper presents our experience with an initial large-scale experiment in this direction. First, we used the Overlog language to implement a "Big Data" analytics stack that is API-compatible with Hadoop and HDFS and provides comparable performance. Second, we extended the system with complex distributed features not yet available in Hadoop, including high availability, scalability, and unique monitoring and debugging facilities. We present both quantitative and anecdotal results from our experience, providing some concrete evidence that both data-centric design and declarative languages can substantially simplify distributed systems programming.
Pages: 223-236
Number of pages: 14