Survey of Distributed Computing Frameworks for Supporting Big Data Analysis

Cited by: 15
Authors
Sun, Xudong [1]
He, Yulin [1, 2]
Wu, Dingming [1]
Huang, Joshua Zhexue [1, 2]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[2] Guangdong Lab Artificial Intelligence & Digital Ec, Shenzhen 518107, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Analytical models; Costs; Computational modeling; Clustering algorithms; Distributed databases; Big Data; Programming; distributed computing frameworks; big data analysis; approximate computing; MapReduce computing model; MAP-REDUCE; MAPREDUCE; PERFORMANCE; MANAGEMENT; HADOOP; TAXONOMY; SYSTEMS;
DOI
10.26599/BDMA.2022.9020014
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Distributed computing frameworks are the fundamental component of distributed computing systems. They provide an essential way to support the efficient processing of big data on clusters or in the cloud. The size of big data is growing faster than the big data processing capacity of clusters. Thus, distributed computing frameworks based on the MapReduce computing model are not adequate to support big data analysis tasks, which often require running complex analytical algorithms on extremely large data sets of terabyte scale. In performing such tasks, these frameworks face three challenges: computational inefficiency due to high I/O and communication costs, non-scalability to big data due to memory limits, and a limited range of analytical algorithms, because many serial algorithms cannot be implemented in the MapReduce programming model. New distributed computing frameworks need to be developed to overcome these challenges. In this paper, we review MapReduce-type distributed computing frameworks that are currently used to handle big data and discuss their problems when conducting big data analysis. In addition, we present a non-MapReduce distributed computing framework that has the potential to overcome big data analysis challenges.
Pages: 154 - 169
Page count: 16
Related papers
50 in total
  • [21] Quantum Computing in Big Data Analytics: A Survey
    Shaikh, Tawseef Ayoub
    Ali, Rashid
    2016 IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (CIT), 2016, : 112 - 115
  • [22] A Survey on Emerging Computing Paradigms for Big Data
    Zhang Yaoxue
    Ren Ju
    Liu Jiagang
    Xu Chugui
    Guo Hui
    Liu Yaping
    CHINESE JOURNAL OF ELECTRONICS, 2017, 26 (01) : 1 - 12
  • [23] A Survey on Emerging Computing Paradigms for Big Data
    ZHANG Yaoxue
    REN Ju
    LIU Jiagang
    XU Chugui
    GUO Hui
    LIU Yaping
    Chinese Journal of Electronics, 2017, 26 (01) : 1 - 12
  • [24] Big Data with Cloud Computing: an insight on the computing environment, MapReduce, and programming frameworks
    Fernandez, Alberto
    del Rio, Sara
    Lopez, Victoria
    Bawakid, Abdullah
    del Jesus, Maria J.
    Benitez, Jose M.
    Herrera, Francisco
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2014, 4 (05) : 380 - 409
  • [25] A Survey on Big Data Processing Frameworks for Mobility Analytics
    Doulkeridis C.
    Vlachou A.
    Pelekis N.
    Theodoridis Y.
    SIGMOD Record, 2021, 50 (02) : 18 - 29
  • [26] Resilient Distributed Computing Platforms for Big Data Analysis Using Spark and Hadoop
    Chang, Bao Rong
    Tsai, Hsiu-Fen
    Wang, Yo-Ai
    Huang, Chien-Feng
    PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON APPLIED SYSTEM INNOVATION (ICASI), 2016,
  • [27] BAYESIAN ANALYSIS OF BIG DATA IN INSURANCE PREDICTIVE MODELING USING DISTRIBUTED COMPUTING
    Zhang, Yanwei
    ASTIN BULLETIN, 2017, 47 (03) : 943 - 961
  • [28] Survey on JVM Optimization for Big Data Processing Frameworks
    Wang, Yi-Cheng
    Zeng, Hong-Bin
    Xu, Li-Jie
    Wang, Wei
    Wei, Jun
    Huang, Tao
    Ruan Jian Xue Bao/Journal of Software, 2023, 34 (01) : 463 - 488
  • [29] A Survey on Big Data Processing Frameworks for Mobility Analytics
    Doulkeridis, Christos
    Vlachou, Akrivi
    Pelekis, Nikos
    Theodoridis, Yannis
    SIGMOD RECORD, 2021, 50 (02) : 18 - 30
  • [30] Big Data Security in Healthcare Survey on Frameworks and Algorithms
    Chandra, Sudipta
    Ray, Soumya
    Goswami, R. T.
    2017 7TH IEEE INTERNATIONAL ADVANCE COMPUTING CONFERENCE (IACC), 2017, : 89 - 94