GPU-Accelerated Parallel Hierarchical Extreme Learning Machine on Flink for Big Data

Cited by: 81
Authors
Chen, Cen [1, 2]
Li, Kenli [1, 2]
Ouyang, Aijia [1, 2, 3]
Tang, Zhuo [1, 2]
Li, Keqin [1, 2, 4]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomp Ctr, Changsha 410082, Hunan, Peoples R China
[3] Zunyi Normal Coll, Dept Informat Engn, Zunyi 563006, Peoples R China
[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
National Natural Science Foundation of China;
Keywords
Big data; deep learning (DL); Flink; GPGPU; hierarchical extreme learning machine (H-ELM); parallel; FEEDFORWARD NETWORKS; HIDDEN NODES; MAPREDUCE; APPROXIMATION; CLASSIFICATION; OPTIMIZATION; REGRESSION; ALGORITHM; SPMV;
DOI
10.1109/TSMC.2017.2690673
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The extreme learning machine (ELM) has become one of the most important and popular machine learning algorithms because of its extremely fast training speed, good generalization, and universal approximation/classification capability. The hierarchical ELM (H-ELM) extends ELM from single-hidden-layer feedforward networks to multilayer perceptrons, greatly broadening the applicability of ELM. Training H-ELM generally requires large-scale datasets (DSTs), so how to apply the H-ELM framework to big data is worth further exploration. This paper proposes a parallel H-ELM algorithm based on Flink, an in-memory cluster computing platform, and graphics processing units (GPUs). Several optimizations are adopted to improve performance, including a cache-based scheme, a reasonable partitioning strategy, and a memory mapping scheme that maps specific Java virtual machine objects to buffers. Most importantly, the proposed framework for using GPUs to accelerate Flink for big data is general: it can be used to accelerate many other variants of ELM and other machine learning algorithms. To the best of our knowledge, it is the first library of its kind to combine in-memory cluster computing with GPUs to parallelize H-ELM. The experimental results demonstrate that the proposed GPU-accelerated parallel H-ELM, named GPH-ELM, can efficiently process large-scale DSTs with good speedup and scalability, leveraging the computing power of both CPUs and GPUs in the cluster.
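For readers unfamiliar with ELM, the following is a minimal single-machine sketch of a basic ELM, not the GPH-ELM described in the abstract. Hidden-layer weights are drawn at random and left fixed, and only the output weights are solved in closed form by regularized least squares, which is where the fast training comes from. The function names, the sigmoid activation, the weight range, and the ridge value are illustrative assumptions; no Flink or GPU code is involved.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination
    with partial pivoting. Inputs are lists of rows."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def hidden_layer(x_rows, weights, biases):
    """Sigmoid activations of the random hidden layer, one row per sample."""
    return [
        [1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(w_j, row)) + b_j)))
         for w_j, b_j in zip(weights, biases)]
        for row in x_rows
    ]


def train_elm(x_rows, t_rows, n_hidden=20, ridge=1e-8, seed=0):
    """Train a basic ELM. Hyperparameter defaults are assumptions."""
    rng = random.Random(seed)
    n_in, n_out = len(x_rows[0]), len(t_rows[0])
    # Random input weights and biases; these are never trained.
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = hidden_layer(x_rows, weights, biases)
    # Output weights from the normal equations: (H^T H + ridge*I) beta = H^T T.
    hth = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
            for j in range(n_hidden)] for i in range(n_hidden)]
    htt = [[sum(h[s][i] * t_rows[s][k] for s in range(len(h)))
            for k in range(n_out)] for i in range(n_hidden)]
    beta = solve_linear_system(hth, htt)
    return weights, biases, beta


def predict_elm(model, x_rows):
    """Apply the fixed hidden layer, then the learned output weights."""
    weights, biases, beta = model
    h = hidden_layer(x_rows, weights, biases)
    return [[sum(h_i * beta[i][k] for i, h_i in enumerate(row))
             for k in range(len(beta[0]))] for row in h]
```

A model is obtained with `train_elm(x_rows, t_rows)` and applied with `predict_elm(model, x_rows)`, where both arguments are lists of numeric rows. Training involves no iterative gradient descent, only one linear solve.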
Pages: 2740-2753
Page count: 14
Related papers
50 in total
  • [11] GPU-Accelerated Machine Learning Inference as a Service for Computing in Neutrino Experiments
    Wang, Michael
    Yang, Tingjun
    Flechas, Maria Acosta
    Harris, Philip
    Hawks, Benjamin
    Holzman, Burt
    Knoepfel, Kyle
    Krupa, Jeffrey
    Pedro, Kevin
    Tran, Nhan
    [J]. FRONTIERS IN BIG DATA, 2021, 3
  • [12] GPU-Accelerated Machine Learning in Non-Orthogonal Multiple Access
    Schaeufele, Daniel
    Marcus, Guillermo
    Binder, Nikolaus
    Mehlhose, Matthias
    Keller, Alexander
    Stanczak, Slawomir
    [J]. 2022 30TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2022), 2022, : 667 - 671
  • [13] A Parallel Multiclassification Algorithm for Big Data Using an Extreme Learning Machine
    Duan, Mingxing
    Li, Kenli
    Liao, Xiangke
    Li, Keqin
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) : 2337 - 2351
  • [14] Model sharing for GPU-accelerated DNN inference in big data processing systems
    Ding G.
    Chen Q.
    Xu C.
    Qian W.
    Zhou A.
    [J]. Qinghua Daxue Xuebao/Journal of Tsinghua University, 2022, 62 (09) : 1435 - 1441
  • [15] GPU-Accelerated Parallel FDTD on Distributed Heterogeneous Platform
    Jiang, Ronglin
    Jiang, Shugang
    Zhang, Yu
    Xu, Ying
    Xu, Lei
    Zhang, Dandan
    [J]. INTERNATIONAL JOURNAL OF ANTENNAS AND PROPAGATION, 2014, 2014
  • [16] A GPU-accelerated parallel K-means algorithm
    Cuomo, S.
    De Angelis, V.
    Farina, G.
    Marcellino, L.
    Toraldo, G.
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2019, 75 : 262 - 274
  • [17] Measuring GPU-Accelerated Parallel SVM Performance Using Large Datasets for Multi-Class Machine Learning Problem
    Bin Sulaiman, Muhamad Abdul Hay
    Suliman, Azizah
    Ahmad, Abdul Rahim
    [J]. PROCEEDINGS OF THE 2014 6TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND MULTIMEDIA (ICIM), 2014, : 299 - 302
  • [18] GPU-Accelerated RDP Algorithm for Data Segmentation
    Cebrian, Pau
    Moure, Juan Carlos
    [J]. COMPUTATIONAL SCIENCE - ICCS 2020, PT I, 2020, 12137 : 234 - 247
  • [19] GPU-Accelerated Visualization of Scattered Point Data
    Falch, Thomas L.
    Floystad, Jostein Bo
    Breiby, Dag W.
    Elster, Anne C.
    [J]. IEEE ACCESS, 2013, 1 : 564 - 576
  • [20] GPU-Accelerated Mahalanobis-Average Hierarchical Clustering Analysis
    Smelko, Adam
    Kratochvil, Miroslav
    Krulis, Martin
    Sieger, Tomas
    [J]. EURO-PAR 2021: PARALLEL PROCESSING, 2021, 12820 : 580 - 595