Lit: A High Performance Massive Data Computing Framework Based on CPU/GPU Cluster

Citations: 0
Authors
Zhai, Yanlong [1]
Mbarushimana, Emmanuel [1]
Li, Wei [2]
Zhang, Jing [2]
Guo, Ying [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Engn Res Ctr Mass Language Informat Proc, Beijing 100081, Peoples R China
[2] Sci & Technol Complex Syst Simulat, Beijing, Peoples R China
Keywords
MapReduce
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Big data processing is receiving a significant amount of interest as an important technology for revealing the information behind data, such as trends and characteristics. MapReduce is considered the most efficient distributed parallel data processing framework. However, some high-end applications, especially scientific analyses, are both data-intensive and computation-intensive. Current big data processing systems such as Hadoop are not designed for computation-intensive applications and therefore provide insufficient computational power. In this paper, we present Lit, a high-performance massive data computing framework based on a CPU/GPU cluster. Lit integrates GPUs with Hadoop to improve the computational power of each node in the cluster. Because the architecture and programming model of the GPU differ from those of the CPU, Lit provides an annotation-based approach that automatically generates CUDA code from Hadoop code. Lit hides the complexity of programming on a CPU/GPU cluster by providing an extended compiler and optimizer. To combine the simplified programming, scalability, and fault tolerance of Hadoop with the high-performance computation power of the GPU, Lit extends Hadoop with a GPUClassloader that detects the GPU, generates and compiles CUDA code, and invokes the shared library. Our experimental results show that Lit achieves an average speedup of 1x to 3x over Hadoop on three typical applications.
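The annotation-and-dispatch idea described in the abstract can be illustrated with a minimal sketch. This is not Lit's API or implementation: Lit targets Hadoop (Java) and generates CUDA code, whereas the Python below only mimics the control flow. The names `gpu_candidate`, `gpu_available`, `choose_backend`, and `run_job` are hypothetical, the device probe is a placeholder that always reports no GPU, and only the CPU path is implemented.

```python
from collections import defaultdict


def gpu_candidate(func):
    """Hypothetical marker: flags a map function as eligible for GPU offload."""
    func.gpu_candidate = True
    return func


def gpu_available():
    """Placeholder device probe (assumption): always False so the sketch runs anywhere."""
    return False


@gpu_candidate
def square_map(key, value):
    # Stand-in for computation-intensive per-record work.
    yield key, value * value


def sum_reduce(key, values):
    # Combine all mapped values that share a key.
    return key, sum(values)


def choose_backend(map_func):
    """Select 'gpu' only for annotated functions when a device is detected."""
    if getattr(map_func, "gpu_candidate", False) and gpu_available():
        return "gpu"
    return "cpu"


def run_job(records, map_func, reduce_func):
    backend = choose_backend(map_func)
    # A real system would run generated GPU code for the 'gpu' backend;
    # this sketch always executes the map function on the CPU.
    grouped = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_func(key, value):
            grouped[out_key].append(out_value)
    results = dict(reduce_func(k, v) for k, v in grouped.items())
    return backend, results
```

Calling `run_job([("a", 2), ("a", 3), ("b", 4)], square_map, sum_reduce)` selects the CPU backend here, because the placeholder probe reports no GPU, and groups the squared values by key before reducing them.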
Pages: 8