FTSGD: An Adaptive Stochastic Gradient Descent Algorithm for Spark MLlib

Cited by: 2
Authors
Zhang, Hong [1 ]
Liu, Zixia [1 ]
Huang, Hai [2 ]
Wang, Liqiang [1 ]
Affiliations
[1] Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA
[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA
Keywords
Spark; MLlib; Asynchronous Stochastic Gradient Descent; Adaptive Iterative Learning
DOI
10.1109/DASC/PiCom/DataCom/CyberSciTec.2018.00-22
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
The proliferation of massive datasets and the surge of interest in big data analytics have popularized a number of novel distributed data processing platforms such as Hadoop and Spark. Their large and growing ecosystems of libraries enable even novices to take advantage of the latest data analytics and machine learning algorithms. However, time-consuming data synchronization and communication in iterative algorithms on large-scale distributed platforms can cause significant performance inefficiency. MLlib is Spark's scalable machine learning library, many of whose algorithms employ Stochastic Gradient Descent (SGD) to find minima or maxima iteratively. Convergence can be very slow, however, if gradient data are synchronized on every iteration. In this work, we optimize the current implementation of SGD in Spark's MLlib by reusing each data partition multiple times within a single iteration, finding better candidate weights more efficiently. Whether to run multiple local iterations within each partition is decided dynamically by the 68-95-99.7 rule. We also design a variant of the momentum algorithm to optimize the step size on every iteration, using a new adaptive rule that decreases the step size whenever neighboring gradients point in significantly different directions. Experiments show that our adaptive algorithm is more efficient and can be up to 7 times faster than the original SGD in MLlib.
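The abstract describes the partition-reuse mechanism only at a high level, so the Scala sketch below is one plausible reading, not the paper's actual FTSGD implementation: each partition is swept several times locally, and the 68-95-99.7 (sigma) rule, applied here to the recent local losses, decides when a further pass would no longer make significant progress. The least-squares gradient, the one-sigma stopping threshold, and all names (`sgdPass`, `localIterations`, `maxLocalIters`) are assumptions for illustration; in Spark, `localIterations` would plausibly run inside a `mapPartitions` call, with the driver combining the per-partition weights.

```scala
import scala.math.{abs, sqrt}
import scala.collection.mutable.ArrayBuffer

object LocalIterationSketch {

  // One SGD pass over a partition's data for least-squares regression.
  // Returns the updated weights and the mean squared error observed
  // during the pass. (The linear model is illustrative only.)
  def sgdPass(
      data: Array[(Double, Array[Double])], // (label, features) pairs
      weights: Array[Double],
      stepSize: Double): (Array[Double], Double) = {
    val w = weights.clone()
    var lossSum = 0.0
    for ((y, x) <- data) {
      var pred = 0.0
      for (j <- x.indices) pred += w(j) * x(j)
      val err = pred - y
      lossSum += err * err
      for (j <- w.indices) w(j) -= stepSize * err * x(j)
    }
    (w, lossSum / data.length)
  }

  // Reuse one partition for several local passes. The 68-95-99.7 rule is
  // applied to the recent losses: while the newest loss still lies outside
  // one standard deviation of their mean, progress is deemed significant
  // and another local pass is worthwhile; otherwise we stop early.
  def localIterations(
      data: Array[(Double, Array[Double])],
      init: Array[Double],
      stepSize: Double,
      maxLocalIters: Int): Array[Double] = {
    var weights = init
    val losses = ArrayBuffer.empty[Double]
    var continue = true
    var iter = 0
    while (iter < maxLocalIters && continue) {
      val (updated, loss) = sgdPass(data, weights, stepSize)
      weights = updated
      losses += loss
      if (losses.length >= 3) {
        val mean = losses.sum / losses.length
        val std  = sqrt(losses.map(l => (l - mean) * (l - mean)).sum / losses.length)
        if (abs(loss - mean) <= std) continue = false // within 1 sigma: stop locally
      }
      iter += 1
    }
    weights
  }
}
```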
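The adaptive step-size rule is likewise only outlined in the abstract ("decreases the step size whenever neighboring gradients point in significantly different directions"). One natural reading, sketched below as an assumption rather than the paper's exact rule, tests the sign of the dot product between consecutive gradients, in the spirit of Rprop-style sign tests: a negative product signals disagreement and shrinks the step. The momentum coefficient 0.9, the decay factor 0.5, and the `step` helper are all illustrative.

```scala
object AdaptiveStepSketch {

  // One momentum update with an adaptive step size. `prevGrad` is the
  // gradient from the previous iteration (None on the first call); the
  // momentum coefficient and decay factor are illustrative defaults.
  def step(
      weights: Array[Double],
      velocity: Array[Double],
      grad: Array[Double],
      prevGrad: Option[Array[Double]],
      stepSize: Double,
      momentum: Double = 0.9,
      decay: Double = 0.5): (Array[Double], Array[Double], Double) = {

    // Dot product of neighboring gradients: a negative value means they
    // point in significantly different directions, so shrink the step.
    val agreement = prevGrad.fold(0.0) { g0 =>
      var d = 0.0
      for (j <- grad.indices) d += g0(j) * grad(j)
      d
    }
    val eta = if (agreement < 0.0) stepSize * decay else stepSize

    val w = weights.clone()
    val v = velocity.clone()
    for (j <- w.indices) {
      v(j) = momentum * v(j) - eta * grad(j) // momentum accumulation
      w(j) += v(j)                           // weight update
    }
    (w, v, eta) // return the (possibly reduced) step size for the next call
  }
}
```

A caller would feed the returned step size and the current gradient back into the next call, so repeated direction flips drive the step size down geometrically while stretches of agreement leave it unchanged.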
Pages: 828-835
Number of pages: 8