Composable and Efficient Functional Big Data Processing Framework

Authors
Wu, Dongyao [1, 2]
Sakr, Sherif [1, 3]
Zhu, Liming [1, 2]
Lu, Qinghua [1, 4]
Affiliations
[1] NICTA, Software Syst Res Grp, Sydney, NSW, Australia
[2] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW, Australia
[3] King Saud bin Abdulaziz Univ Hlth Sci, Riyadh, Saudi Arabia
[4] China Univ Petr, Coll Comp & Commun Engn, Qingdao, Peoples R China
Keywords
big data processing; parallel programming; functional programming; distributed systems; system architecture
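DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract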
In recent years, frameworks such as MapReduce and Spark have been introduced to ease the task of developing big data programs and applications. However, the jobs in these frameworks are coarsely defined and packaged as executable jars without any of their functionality being exposed or described. As a result, deployed jobs are not natively composable or reusable in subsequent development. This also hampers the ability to apply optimizations to the data flow of job sequences and pipelines. In this paper, we present the Hierarchically Distributed Data Matrix (HDM), a functional, strongly-typed data representation for writing composable big data applications. Along with HDM, a runtime framework is provided to support the execution of HDM applications on distributed infrastructures. Based on the functional data dependency graph of HDM, multiple optimizations are applied to improve the performance of executing HDM jobs. The experimental results show that our optimizations improve Job-Completion-Time by between 10% and 60% for different types of operation sequences when compared with the current state of the art, Apache Spark.
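The abstract's point about exposed, composable job functions can be illustrated with a minimal sketch. This is not the HDM API: the names `Stage`, `then`, and `run` are hypothetical, the code runs in a single local process rather than on a distributed runtime, and it shows only one simple optimization (fusing two consecutive element-wise functions), assuming that is representative of what a functional dependency graph makes possible.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


@dataclass(frozen=True)
class Stage(Generic[A, B]):
    """Hypothetical job stage that exposes its function instead of hiding it in a jar."""

    name: str
    fn: Callable[[A], B]

    def then(self, other: "Stage[B, C]") -> "Stage[A, C]":
        # Fuse two stages into one so the data is traversed only once.
        return Stage(f"{self.name}+{other.name}", lambda x: other.fn(self.fn(x)))


def run(stage: Stage[A, B], data: Iterable[A]) -> List[B]:
    # Local stand-in for a distributed executor (assumption for illustration).
    return [stage.fn(x) for x in data]


# Two independently defined stages are reused and composed into one pipeline.
parse = Stage("parse", int)
double = Stage("double", lambda n: n * 2)
fused = parse.then(double)
result = run(fused, ["1", "2", "3"])
```

Because each stage carries its function as a value, a later pipeline can reuse `parse` or `double` without repackaging, and a planner can inspect and rewrite the composition before execution.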
Pages: 279-286
Page count: 8