Apache Spark and Apache Ignite Performance Analysis

Cited by: 6
Authors
Stan, Cristiana-Stefania [1]
Pandelica, Adrian-Eduard [1]
Zamfir, Vlad-Andrei [1]
Stan, Roxana Gabriela [1]
Negru, Catalin [1]
Affiliations
[1] Univ Politehn Bucuresti, Dept Comp Sci, Bucharest, Romania
Keywords
Big Data; Apache Spark; Ignite; performance evaluation
DOI
10.1109/CSCS.2019.00129
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Big Data is a current research topic. It is increasingly part of people's lives through applications used daily, such as stock exchanges, news, social media, and health care. All these applications rely on Big Data technologies for storing and processing information. Numerous technologies have been developed to meet Big Data requirements, and it is worth examining their strengths and weaknesses, when to use one over another, and how well they perform in different situations. In this paper, we compare two data processing frameworks, Apache Spark and Apache Ignite. The comparison considers the following aspects: features, implementation, architecture, and performance metrics. To test performance, we used two popular applications: word count and k-means clustering. Results show that Spark achieved better performance than Ignite.
Pages: 726-733
Page count: 8