GPU-Accelerated Parallel Hierarchical Extreme Learning Machine on Flink for Big Data

Cited by: 81
Authors
Chen, Cen [1 ,2 ]
Li, Kenli [1 ,2 ]
Ouyang, Aijia [1 ,2 ,3 ]
Tang, Zhuo [1 ,2 ]
Li, Keqin [1 ,2 ,4 ]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomp Ctr, Changsha 410082, Hunan, Peoples R China
[3] Zunyi Normal Coll, Dept Informat Engn, Zunyi 563006, Peoples R China
[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
National Natural Science Foundation of China;
Keywords
Big data; deep learning (DL); Flink; GPGPU; hierarchical extreme learning machine (H-ELM); parallel; FEEDFORWARD NETWORKS; HIDDEN NODES; MAPREDUCE; APPROXIMATION; CLASSIFICATION; OPTIMIZATION; REGRESSION; ALGORITHM; SPMV;
DOI
10.1109/TSMC.2017.2690673
CLC Classification Number
TP [Automation & Computer Technology];
Subject Classification Code
0812;
Abstract
The extreme learning machine (ELM) has become one of the most important and popular machine learning algorithms because of its extremely fast training speed, good generalization, and universal approximation/classification capability. The hierarchical ELM (H-ELM) extends ELM from single-hidden-layer feedforward networks to multilayer perceptrons, greatly broadening the applicability of ELM. Training an H-ELM generally requires large-scale datasets (DSTs), so how to apply the H-ELM framework to big data merits further exploration. This paper proposes a parallel H-ELM algorithm based on Flink, an in-memory cluster computing platform, and graphics processing units (GPUs). Several optimizations are adopted to improve performance: a cache-based scheme, a reasonable partitioning strategy, and a memory-mapping scheme that maps specific Java virtual machine objects to buffers. Most importantly, our proposed framework for using GPUs to accelerate Flink on big data is general: it can be applied to accelerate many other ELM variants and other machine learning algorithms. To the best of our knowledge, this is the first library that combines in-memory cluster computing with GPUs to parallelize H-ELM. The experimental results demonstrate that our GPU-accelerated parallel H-ELM, named GPH-ELM, can efficiently process large-scale DSTs with good speedup and scalability, leveraging the computing power of both CPUs and GPUs in the cluster.
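The "extremely fast training" the abstract attributes to ELM comes from its closed-form training step: input weights are drawn at random and only the output weights are solved by least squares via the Moore-Penrose pseudoinverse. A minimal NumPy sketch of that single-layer step follows; this is an illustration of the underlying ELM math, not the paper's Flink/GPU implementation, and the function names and toy dataset are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, hidden=32):
    """Single-hidden-layer ELM: random input weights and biases,
    output weights solved in closed form (no iterative tuning)."""
    W = rng.standard_normal((X.shape[1], hidden))  # random, never trained
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                         # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                   # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: fit y = sin(x) on [-3, 3].
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_train(X, T)
err = np.mean((elm_predict(X, W, b, beta) - T) ** 2)
```

H-ELM stacks such layers, feeding each layer's learned features to the next; the paper parallelizes this by partitioning the data across a Flink cluster and offloading the dense matrix work (the `H` and pseudoinverse computations above) to GPUs.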
Pages: 2740-2753 (14 pages)
Related Papers
50 items in total
  • [41] GPU-Accelerated Data-driven Framework of Hybrid ReaxFF
    Wang, Xing Quan
    Lau, Denvid
    [J]. NONDESTRUCTIVE CHARACTERIZATION AND MONITORING OF ADVANCED MATERIALS, AEROSPACE, CIVIL INFRASTRUCTURE, AND TRANSPORTATION XVIII, 2024, 12950
  • [42] Impact of data layouts on the efficiency of GPU-accelerated IDW interpolation
    Mei, Gang
    Tian, Hong
    [J]. SPRINGERPLUS, 2016, 5 : 1 - 18
  • [43] Improvements of classification accuracy of film defects by using GPU-accelerated image processing and machine learning frameworks
    Ando, Hidetoshi
    Niitsu, Yuki
    Hirasawa, Masaki
    Teduka, Hiroaki
    Yajima, Masao
    [J]. PROCEEDINGS NICOGRAPH INTERNATIONAL 2016, 2016, : 83 - 87
  • [44] HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics
    Goetz, Markus
    Debus, Charlotte
    Coquelin, Daniel
    Krajsek, Kai
    Comito, Claudia
    Knechtges, Philipp
    Hagemeier, Bjorn
    Tarnawa, Michael
    Hanselmann, Simon
    Siggel, Martin
    Basermann, Achim
    Streit, Achim
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 276 - 287
  • [45] Distributed Weighted Extreme Learning Machine for Big Imbalanced Data Learning
    Wang, Zhiqiong
    Xin, Junchang
    Tian, Shuo
    Yu, Ge
    [J]. PROCEEDINGS OF ELM-2015, VOL 1: THEORY, ALGORITHMS AND APPLICATIONS (I), 2016, 6 : 319 - 332
  • [46] Distributed and Weighted Extreme Learning Machine for Imbalanced Big Data Learning
    Wang, Zhiqiong
    Xin, Junchang
    Yang, Hongxu
    Tian, Shuo
    Yu, Ge
    Xu, Chenren
    Yao, Yudong
    [J]. TSINGHUA SCIENCE AND TECHNOLOGY, 2017, 22 (02) : 160 - 173
  • [48] A GPU-accelerated adaptive kernel density estimation approach for efficient point pattern analysis on spatial big data
    Zhang, Guiming
    Zhu, A-Xing
    Huang, Qunying
    [J]. INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2017, 31 (10) : 2068 - 2097
  • [49] THE FAILURE ANALYSIS OF EXTREME LEARNING MACHINE ON BIG DATA AND THE COUNTERMEASURE
    Zhang, Pei-Zhou
    Zhao, Shi-Xin
    Wang, Xi-Zhao
    [J]. PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOL. 2, 2015, : 849 - 853
  • [50] Porting and scaling OpenACC applications on massively-parallel, GPU-accelerated supercomputers
    A. Hart
    R. Ansaloni
    A. Gray
    [J]. The European Physical Journal Special Topics, 2012, 210 : 5 - 16