GPU-Accelerated Parallel Hierarchical Extreme Learning Machine on Flink for Big Data

Cited by: 81
Authors
Chen, Cen [1 ,2 ]
Li, Kenli [1 ,2 ]
Ouyang, Aijia [1 ,2 ,3 ]
Tang, Zhuo [1 ,2 ]
Li, Keqin [1 ,2 ,4 ]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomp Ctr, Changsha 410082, Hunan, Peoples R China
[3] Zunyi Normal Coll, Dept Informat Engn, Zunyi 563006, Peoples R China
[4] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
Funding
National Natural Science Foundation of China;
Keywords
Big data; deep learning (DL); Flink; GPGPU; hierarchical extreme learning machine (H-ELM); parallel; FEEDFORWARD NETWORKS; HIDDEN NODES; MAPREDUCE; APPROXIMATION; CLASSIFICATION; OPTIMIZATION; REGRESSION; ALGORITHM; SPMV;
DOI
10.1109/TSMC.2017.2690673
CLC Classification Number
TP [Automation & Computer Technology];
Subject Classification Code
0812;
Abstract
The extreme learning machine (ELM) has become one of the most important and popular machine learning algorithms because of its extremely fast training speed, good generalization, and universal approximation/classification capability. The hierarchical ELM (H-ELM) extends ELM from single-hidden-layer feedforward networks to multilayer perceptrons, greatly broadening the applicability of ELM. Training an H-ELM generally requires large-scale datasets (DSTs), so how to apply the H-ELM framework to big data merits further exploration. This paper proposes a parallel H-ELM algorithm based on Flink, an in-memory cluster computing platform, and graphics processing units (GPUs). Several optimizations are adopted to improve performance: a cache-based scheme, a reasonable partitioning strategy, and a memory-mapping scheme that maps specific Java virtual machine objects to buffers. Most importantly, our proposed framework for using GPUs to accelerate Flink on big data is general: it can be applied to accelerate many other ELM variants and other machine learning algorithms. To the best of our knowledge, this is the first library that combines in-memory cluster computing with GPUs to parallelize H-ELM. The experimental results demonstrate that our GPU-accelerated parallel H-ELM, named GPH-ELM, can efficiently process large-scale DSTs with good speedup and scalability, leveraging the computing power of both CPUs and GPUs in the cluster.
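The "extremely fast training" the abstract attributes to ELM comes from its closed-form training step: input weights are drawn at random and only the output weights are solved by least squares via the Moore-Penrose pseudoinverse. A minimal NumPy sketch of that single-layer step follows; this is an illustration of the underlying ELM math, not the paper's Flink/GPU implementation, and the function names and toy dataset are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, T, hidden=32):
    """Single-hidden-layer ELM: random input weights and biases,
    output weights solved in closed form (no iterative tuning)."""
    W = rng.standard_normal((X.shape[1], hidden))  # random, never trained
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                         # hidden-layer activations
    beta = np.linalg.pinv(H) @ T                   # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: fit y = sin(x) on [-3, 3].
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_train(X, T)
err = np.mean((elm_predict(X, W, b, beta) - T) ** 2)
```

H-ELM stacks such layers, feeding each layer's learned features to the next; the paper parallelizes this by partitioning the data across a Flink cluster and offloading the dense matrix work (the `H` and pseudoinverse computations above) to GPUs.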
Pages: 2740-2753 (14 pages)
Related Papers
50 items in total
  • [41] GPU-Accelerated Data-driven Framework of Hybrid ReaxFF
    Wang, Xing Quan
    Lau, Denvid
    [J]. NONDESTRUCTIVE CHARACTERIZATION AND MONITORING OF ADVANCED MATERIALS, AEROSPACE, CIVIL INFRASTRUCTURE, AND TRANSPORTATION XVIII, 2024, 12950
  • [42] Impact of data layouts on the efficiency of GPU-accelerated IDW interpolation
    Mei, Gang
    Tian, Hong
    [J]. SPRINGERPLUS, 2016, 5 : 1 - 18
  • [43] Improvements of classification accuracy of film defects by using GPU-accelerated image processing and machine learning frameworks
    Ando, Hidetoshi
    Niitsu, Yuki
    Hirasawa, Masaki
    Teduka, Hiroaki
    Yajima, Masao
    [J]. PROCEEDINGS NICOGRAPH INTERNATIONAL 2016, 2016, : 83 - 87
  • [44] HeAT - a Distributed and GPU-accelerated Tensor Framework for Data Analytics
    Goetz, Markus
    Debus, Charlotte
    Coquelin, Daniel
    Krajsek, Kai
    Comito, Claudia
    Knechtges, Philipp
    Hagemeier, Bjorn
    Tarnawa, Michael
    Hanselmann, Simon
    Siggel, Martin
    Basermann, Achim
    Streit, Achim
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 276 - 287
  • [45] Distributed Weighted Extreme Learning Machine for Big Imbalanced Data Learning
    Wang, Zhiqiong
    Xin, Junchang
    Tian, Shuo
    Yu, Ge
    [J]. PROCEEDINGS OF ELM-2015, VOL 1: THEORY, ALGORITHMS AND APPLICATIONS (I), 2016, 6 : 319 - 332
  • [46] Distributed and Weighted Extreme Learning Machine for Imbalanced Big Data Learning
    Wang, Zhiqiong
    Xin, Junchang
    Yang, Hongxu
    Tian, Shuo
    Yu, Ge
    Xu, Chenren
    Yao, Yudong
    [J]. TSINGHUA SCIENCE AND TECHNOLOGY, 2017, 22 (02) : 160 - 173
  • [48] A GPU-accelerated adaptive kernel density estimation approach for efficient point pattern analysis on spatial big data
    Zhang, Guiming
    Zhu, A-Xing
    Huang, Qunying
    [J]. INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2017, 31 (10) : 2068 - 2097
  • [49] THE FAILURE ANALYSIS OF EXTREME LEARNING MACHINE ON BIG DATA AND THE COUNTERMEASURE
    Zhang, Pei-Zhou
    Zhao, Shi-Xin
    Wang, Xi-Zhao
    [J]. PROCEEDINGS OF 2015 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOL. 2, 2015, : 849 - 853
  • [50] Porting and scaling OpenACC applications on massively-parallel, GPU-accelerated supercomputers
    A. Hart
    R. Ansaloni
    A. Gray
    [J]. The European Physical Journal Special Topics, 2012, 210 : 5 - 16