GPU in-memory processing using Spark for iterative computation

Cited by: 17
Authors
Hong, Sumin [1 ]
Choi, Woohyuk [1 ]
Jeong, Won-Ki [1 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Sch Elect & Comp Engn, Ulsan, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Spark; MapReduce; GPU; In-memory Computing; Framework;
DOI
10.1109/CCGRID.2017.41
CLC Number
TP3 [Computing technology, computer technology];
Discipline Code
0812;
Abstract
Due to its simplicity and scalability, MapReduce has become a de facto standard computing model for big data processing. Since the original MapReduce model was only appropriate for embarrassingly parallel batch processing, many follow-up studies have focused on improving the efficiency and performance of the model. Spark follows one of these recent trends by providing in-memory processing capability to reduce slow disk I/O for iterative computing tasks. However, the acceleration of Spark's in-memory processing using graphics processing units (GPUs) is challenging due to its deep memory hierarchy and host-to-GPU communication overhead. In this paper, we introduce a novel GPU-accelerated MapReduce framework that extends Spark's in-memory processing so that iterative computing is performed only in the GPU memory. Having discovered that the main bottleneck in the current Spark system for GPU computing is data communication on a Java virtual machine, we propose a modification of the current Spark implementation to bypass expensive data management for iterative task offloading to GPUs. We also propose a novel GPU in-memory processing and caching framework that minimizes host-to-GPU communication via lazy evaluation and reuses GPU memory over multiple mapper executions. The proposed system employs message-passing interface (MPI)-based data synchronization for inter-worker communication so that more complicated iterative computing tasks, such as iterative numerical solvers, can be efficiently handled. We demonstrate the performance of our system in terms of several iterative computing tasks in big data processing applications, including machine learning and scientific computing. We achieved up to a 50x speedup over conventional Spark and about a 10x speedup over GPU-accelerated Spark.
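The iterative in-memory pattern the abstract describes — partitions cached in memory and repeatedly mapped and reduced, rather than re-read from disk each iteration — can be sketched in plain Python. This is a minimal illustrative stand-in only, not the paper's framework or Spark's API: all names (`map_partition`, `run`) are hypothetical, and the "partitions" are ordinary in-memory lists playing the role of cached RDD partitions.

```python
from functools import reduce

def map_partition(partition, theta):
    """Per-partition 'map' step: gradient of a 1-D least-squares
    fit y ~ theta * x, plus the partition size for averaging."""
    g = sum(2.0 * (theta * x - y) * x for x, y in partition)
    return g, len(partition)

def run(partitions, theta=0.0, lr=0.01, iters=100):
    """Iterative map-reduce: partitions stay resident in memory
    across iterations (the property Spark's caching provides,
    and which the paper moves into GPU memory)."""
    for _ in range(iters):
        mapped = [map_partition(p, theta) for p in partitions]   # map
        grad, n = reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]),
                         mapped)                                  # reduce
        theta -= lr * grad / n
    return theta

data = [(x, 3.0 * x) for x in range(1, 9)]   # exact line y = 3x
partitions = [data[:4], data[4:]]            # two in-memory "partitions"
print(round(run(partitions), 3))             # converges to 3.0
```

Each iteration touches every cached partition; in disk-based MapReduce that would mean a full re-read per iteration, which is exactly the cost Spark's in-memory model (and, in this paper, GPU-resident memory) eliminates.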
Pages: 31-41
Number of pages: 11
Related Papers
50 records in total
  • [1] Spark-GPU: An Accelerated In-Memory Data Processing Engine on Clusters
    Yuan, Yuan
    Salmi, Meisam Fathi
    Huai, Yin
    Wang, Kaibo
    Lee, Rubao
    Zhang, Xiaodong
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 273 - 283
  • [2] In-memory Distributed Matrix Computation Processing and Optimization
    Yu, Yongyang
    Tang, Mingjie
    Aref, Walid G.
    Malluhi, Qutaibah M.
    Abbas, Mostafa M.
    Ouzzani, Mourad
    [J]. 2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017), 2017, : 1047 - 1058
  • [3] An Adaptive Tuning Strategy on Spark Based on In-memory Computation Characteristics
    Zhao, Yao
    Hu, Fei
    Chen, Haopeng
    [J]. 2016 18TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATIONS TECHNOLOGY (ICACT) - INFORMATION AND COMMUNICATIONS FOR SAFE AND SECURE LIFE, 2016, : 484 - 488
  • [4] Performance enhancement for iterative data computing with in-memory concurrent processing
    Wen, Yean-Fu
    Chen, Yu-Fang
    Chiu, Tse Kai
    Chen, Yen-Chou
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, 32 (07):
  • [5] In-memory k Nearest Neighbor GPU-based Query Processing
    Velentzas, Polychronis
    Vassilakopoulos, Michael
    Corral, Antonio
    [J]. PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON GEOGRAPHICAL INFORMATION SYSTEMS THEORY, APPLICATIONS AND MANAGEMENT (GISTAM), 2020, : 310 - 317
  • [6] Towards Memory and Computation Efficient Graph Processing on Spark
    Tian, Xinhui
    Guo, Yuanqing
    Zhan, Jianfeng
    Wang, Lei
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 375 - 382
  • [7] Processing data where it makes sense: Enabling in-memory computation
    Mutlu, Onur
    Ghose, Saugata
    Gomez-Luna, Juan
    Ausavarungnirun, Rachata
    [J]. MICROPROCESSORS AND MICROSYSTEMS, 2019, 67 : 28 - 41
  • [8] Efficient In-Memory Processing Using Spintronics
    Chowdhury, Zamshed
    Harms, Jonathan D.
    Khatamifard, S. Karen
    Zabihi, Masoud
    Lv, Yang
    Lyle, Andrew P.
    Sapatnekar, Sachin S.
    Karpuzcu, Ulya R.
    Wang, Jian-Ping
    [J]. IEEE COMPUTER ARCHITECTURE LETTERS, 2018, 17 (01) : 42 - 46
  • [9] Memory Processing Unit for In-Memory Processing
    Ben Hur, Rotem
    Kvatinsky, Shahar
    [J]. PROCEEDINGS OF THE 2016 IEEE/ACM INTERNATIONAL SYMPOSIUM ON NANOSCALE ARCHITECTURES (NANOARCH), 2016, : 171 - 172
  • [10] Inner Product Computation In-Memory Using Distributed Arithmetic
    Lakshmi, Vijaya
    Pudi, Vikramkumar
    Reuben, John
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2022, 69 (11) : 4546 - 4557