GPU-Accelerated Parallel Hierarchical Extreme Learning Machine on Flink for Big Data

Cited by: 81

Authors:

Chen, Cen [1,2]

Li, Kenli [1,2]

Ouyang, Aijia [1,2,3]

Tang, Zhuo [1,2]

Li, Keqin [1,2,4]

Affiliations:

[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China

[2] Natl Supercomp Ctr, Changsha 410082, Hunan, Peoples R China

[3] Zunyi Normal Coll, Dept Informat Engn, Zunyi 563006, Peoples R China

[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA

Source:

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2017 / Vol. 47 / No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Big data; deep learning (DL); Flink; GPGPU; hierarchical extreme learning machine (H-ELM); parallel; FEEDFORWARD NETWORKS; HIDDEN NODES; MAPREDUCE; APPROXIMATION; CLASSIFICATION; OPTIMIZATION; REGRESSION; ALGORITHM; SPMV;

DOI:

10.1109/TSMC.2017.2690673

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

The extreme learning machine (ELM) has become one of the most important and popular machine learning algorithms because of its extremely fast training speed, good generalization, and universal approximation/classification capability. Hierarchical ELM (H-ELM) extends ELM from single-hidden-layer feedforward networks to multilayer perceptrons, greatly broadening the applicability of ELM. Training H-ELM generally requires large-scale datasets (DSTs), so how to use the H-ELM framework to process big data is worth further exploration. This paper proposes a parallel H-ELM algorithm based on Flink, an in-memory cluster computing platform, and graphics processing units (GPUs). Several optimizations are adopted to improve performance, such as a cache-based scheme, a reasonable partitioning strategy, and a memory mapping scheme that maps specific Java virtual machine objects to buffers. Most importantly, our proposed framework for using GPUs to accelerate Flink for big data is general and can be used to accelerate many other variants of ELM and other machine learning algorithms. To the best of our knowledge, it is the first library that combines in-memory cluster computing with GPUs to parallelize H-ELM. The experimental results demonstrate that our proposed GPU-accelerated parallel H-ELM, named GPH-ELM, can efficiently process large-scale DSTs with good speedup and scalability, leveraging the computing power of both CPUs and GPUs in the cluster.

Pages: 2740-2753

Page count: 14
