Lit: A High Performance Massive Data Computing Framework Based on CPU/GPU Cluster

Cited by: 0
Authors
Zhai, Yanlong [1]
Mbarushimana, Emmanuel [1]
Li, Wei [2]
Zhang, Jing [2]
Guo, Ying [1]
Affiliations
[1] Beijing Inst Technol, Sch Comp Sci, Beijing Engn Res Ctr Mass Language Informat Proc, Beijing 100081, Peoples R China
[2] Sci & Technol Complex Syst Simulat, Beijing, Peoples R China
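Keywords
MapReduce
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract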
Big data processing is receiving a significant amount of interest as an important technology for revealing the information behind data, such as trends and characteristics. MapReduce is considered the most efficient distributed parallel data processing framework. However, some high-end applications, especially scientific analyses, are both data-intensive and computation-intensive. Current big data processing techniques such as Hadoop are not designed for computation-intensive applications and thus have insufficient computational power. In this paper, we present Lit, a high-performance massive data computing framework based on a CPU/GPU cluster. Lit integrates GPUs with Hadoop to improve the computational power of each node in the cluster. Since the architecture and programming model of the GPU differ from those of the CPU, Lit provides an annotation-based approach that automatically generates CUDA code from Hadoop code. Lit hides the complexity of programming on a CPU/GPU cluster by providing an extended compiler and optimizer. To retain the simplified programming, scalability, and fault tolerance of Hadoop and combine them with the high-performance computation power of the GPU, Lit extends Hadoop with a GPUClassloader that detects the GPU, generates and compiles CUDA code, and invokes the resulting shared library. Our experimental results show that Lit achieves an average speedup of 1x to 3x over Hadoop on three typical applications.
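The dispatch idea in the abstract (mark a map function, detect a GPU at load time, otherwise stay on the CPU path) can be illustrated with a minimal Python sketch. This is not Lit's implementation or API. The names `gpu_candidate`, `SketchLoader`, `fake_gpu_backend`, and `run_map` are hypothetical, GPU detection is simulated with a boolean, and no CUDA code is generated or compiled here.

```python
from typing import Callable, List, Optional, Tuple


def gpu_candidate(func: Callable) -> Callable:
    """Hypothetical marker standing in for an annotation on a map function."""
    func.gpu_candidate = True
    return func


def fake_gpu_backend(func: Callable) -> Callable:
    """Placeholder for 'generate, compile, and load' (assumption).

    A real system would return a wrapper around a compiled kernel;
    this sketch simply returns the original function unchanged.
    """
    return func


class SketchLoader:
    """Illustrative stand-in for the loader role described in the abstract."""

    def __init__(self, gpu_available: bool,
                 gpu_backend: Optional[Callable] = None) -> None:
        self.gpu_available = gpu_available  # simulated GPU detection
        self.gpu_backend = gpu_backend

    def resolve(self, func: Callable) -> Tuple[Callable, str]:
        """Pick the GPU path only for marked functions when a GPU exists."""
        marked = getattr(func, "gpu_candidate", False)
        if marked and self.gpu_available and self.gpu_backend is not None:
            return self.gpu_backend(func), "gpu"
        return func, "cpu"  # fall back to the ordinary CPU path


def run_map(loader: SketchLoader, func: Callable,
            records: List[int]) -> Tuple[List[int], str]:
    """Apply the resolved implementation to every record."""
    impl, path = loader.resolve(func)
    return [impl(r) for r in records], path


@gpu_candidate
def square(x: int) -> int:
    return x * x


def plain_double(x: int) -> int:
    return 2 * x
```

In this sketch an unmarked function such as `plain_double` always takes the CPU path, and a marked function also falls back to the CPU when no GPU is detected, so the same job definition runs on either kind of node.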
Pages: 8
Related papers
50 in total
  • [31] RETRACTED ARTICLE: Implementation of MapReduce parallel computing framework based on multi-data fusion sensors and GPU cluster
    Dajun Chang
    Li Li
    Ying Chang
    Zhangquan Qiao
    EURASIP Journal on Advances in Signal Processing, 2021
  • [32] High performance CPU/GPU multiresolution Poisson solver
    Van Rees, Wim M.
    Rossinelli, Diego
    Hadjidoukas, Panagiotis
    Koumoutsakos, Petros
    PARALLEL COMPUTING: ACCELERATING COMPUTATIONAL SCIENCE AND ENGINEERING (CSE), 2014, 25 : 481 - 490
  • [33] High Dimensional Pricing of Exotic European Contracts on a GPU Cluster, and Comparison to a CPU Cluster
    Abbas-Turki, Lokman A.
    Vialle, Stephane
    Lapeyre, Bernard
    Mercier, Patrick
    2009 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-5, 2009, : 2414 - +
  • [34] Upgrading a high performance computing environment for massive data processing
    Ponce, Lucas M.
    dos Santos, Walter
    Meira, Wagner, Jr.
    Guedes, Dorgival
    Lezzi, Daniele
    Badia, Rosa M.
    JOURNAL OF INTERNET SERVICES AND APPLICATIONS, 2019, 10 (01)
  • [35] The High Performance Computing for 3D Dynamic Holographic Simulation Based on Multi-GPU Cluster
    Zhang Yingxi
    Lin Tingyu
    Guo Liqin
    THEORY, METHODOLOGY, TOOLS AND APPLICATIONS FOR MODELING AND SIMULATION OF COMPLEX SYSTEMS, PT I, 2016, 643 : 431 - 441
  • [36] A CPU-GPU HYBRID COMPUTING FRAMEWORK FOR REAL-TIME CLOTHING ANIMATION
    Li, Hanwen
    Wan, Yi
    Ma, Guanghui
    2011 IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND INTELLIGENCE SYSTEMS, 2011, : 391 - 396
  • [37] A Real-Time Big Data Analysis Framework on a CPU/GPU Heterogeneous Cluster: A Meteorological Application Case Study
    Hassaan, Mohamed
    Elghandour, Iman
    2016 3RD IEEE/ACM INTERNATIONAL CONFERENCE ON BIG DATA COMPUTING, APPLICATIONS AND TECHNOLOGIES (BDCAT), 2016, : 168 - 177
  • [38] A unified schedule policy of distributed machine learning framework for CPU-GPU cluster
    Zhu, Ziyu
    Tang, Xiaochun
    Zhao, Quan
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2021, 39 (03): 529 - 538
  • [39] Molecular Docking Simulation Based on CPU-GPU Heterogeneous Computing
    Xu, Jinyan
    Li, Jianhua
    Cai, Yining
    ADVANCED PARALLEL PROCESSING TECHNOLOGIES, 2017, 10561 : 27 - 37
  • [40] HPCGCN: A Predictive Framework on High Performance Computing Cluster Log Data Using Graph Convolutional Networks
    Bose, Avishek
    Yang, Huichen
    Hsu, William H.
    Andresen, Daniel
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 4113 - 4118