Composable and Efficient Functional Big Data Processing Framework

Authors
Wu, Dongyao [1 ,2 ]
Sakr, Sherif [1 ,3 ]
Zhu, Liming [1 ,2 ]
Lu, Qinghua [1 ,4 ]
Affiliations
[1] NICTA, Software Syst Res Grp, Sydney, NSW, Australia
[2] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[3] King Saud bin Abdulaziz Univ Hlth Sci, Riyadh, Saudi Arabia
[4] China Univ Petr, Coll Comp & Commun Engn, Qingdao, Peoples R China
Keywords
big data processing; parallel programming; functional programming; distributed systems; system architecture
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Over the past years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, jobs in these frameworks are coarsely defined and packaged as executable jars, without any of their functionality being exposed or described. As a result, deployed jobs are not natively composable or reusable for subsequent development. This also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations improve Job-Completion-Time by between 10% and 60% for different types of operation sequences when compared with the current state of the art, Apache Spark.
Pages: 279-286
Number of pages: 8
Related papers
  • [1] HDM: A Composable Framework for Big Data Processing
    Wu, Dongyao
    Zhu, Liming
    Lu, Qinghua
    Sakr, Sherif
    IEEE TRANSACTIONS ON BIG DATA, 2018, 4 (02) : 150 - 163
  • [2] Efficient, Problem Tailored Big Data Processing Using Framework Delegation
    Davis, Nickolas
    Broomfield, Matthew
    Rezgui, Abdelmounaam
    2016 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATION (ISCC), 2016, : 1297 - 1299
  • [3] Efficient and Customizable Data Partitioning Framework for Distributed Big RDF Data Processing in the Cloud
    Lee, Kisung
    Liu, Ling
    Tang, Yuzhe
    Zhang, Qi
    Zhou, Yang
    2013 IEEE SIXTH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING (CLOUD 2013), 2013, : 327 - 334
  • [4] Big data processing framework for manufacturing
    Ye, Yinghao
    Wang, Meilin
    Yao, Shuhong
    Jiang, Jarvis N.
    Liu, Qing
    11TH CIRP CONFERENCE ON INDUSTRIAL PRODUCT-SERVICE SYSTEMS, 2019, 83 : 661 - 664
  • [5] An efficient framework for processing big data in internet of things enabled cloud environments
    Lohitha, Sai N.
    Kumar, Pounambal Muthu
    INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2022, 35 (10)
  • [6] A framework for efficient and composable oblivious transfer
    Peikert, Chris
    Vaikuntanathan, Vinod
    Waters, Brent
    ADVANCES IN CRYPTOLOGY - CRYPTO 2008, PROCEEDINGS, 2008, 5157 : 554 - 571
  • [7] An Efficient and Scalable Framework for Processing Remotely Sensed Big Data in Cloud Computing Environments
    Sun, Jin
    Zhang, Yi
    Wu, Zebin
    Zhu, Yaoqin
    Yin, Xianliang
    Ding, Zhongzheng
    Wei, Zhihui
    Plaza, Javier
    Plaza, Antonio
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, 57 (07) : 4294 - 4308
  • [8] Efficient Spark-Based Framework for Big Geospatial Data Query Processing and Analysis
    Aljawarneh, Isam Mashhour
    Bellavista, Paolo
    Corradi, Antonio
    Montanari, Rebecca
    Foschini, Luca
    Zanotti, Andrea
    2017 IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS (ISCC), 2017, : 851 - 856
  • [9] Efficient Big Data Processing in Hadoop MapReduce
    Dittrich, Jens
    Quiane-Ruiz, Jorge-Arnulfo
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (12) : 2014 - 2015
  • [10] An Efficient Distributed Algorithm for Big Data Processing
    Al-kahtani, Mohammed S.
    Karim, Lutful
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2017, 42 (08) : 3149 - 3157