Large-Scale Learning with AdaGrad on Spark

Cited by: 0
Authors:
Hadgu, Asmelash Teka [1 ]
Nigam, Aastha [2 ]
Diaz-Aviles, Ernesto [3 ]
Affiliations:
[1] L3S Res Ctr, Hannover, Germany
[2] Univ Notre Dame, Notre Dame, IN USA
[3] IBM Res, Dublin, Ireland
Keywords:
Distributed machine learning; Adaptive gradient; Spark
DOI: not available
Chinese Library Classification: TP [Automation Technology; Computer Technology]
Discipline Code: 0812
Abstract:
Stochastic Gradient Descent (SGD) is a simple yet very efficient online learning algorithm for optimizing convex (and often non-convex) functions, and one of the most popular stochastic optimization methods in machine learning today. One drawback of SGD is its sensitivity to the learning-rate hyperparameter. The Adaptive Subgradient method, AdaGrad, dynamically incorporates knowledge of the geometry of the data observed in earlier iterations to compute a separate learning rate for every feature. In this work, we implement a distributed version of AdaGrad for large-scale machine learning tasks using Apache Spark. Apache Spark is a fast cluster-computing engine that provides scalability and fault-tolerance properties similar to MapReduce; in contrast to Hadoop's two-stage, disk-based MapReduce paradigm, however, Spark's multi-stage in-memory primitives allow user programs to load data into a cluster's memory and query it repeatedly, which makes it well suited to building scalable machine learning applications. We empirically evaluate our implementation on large-scale real-world problems in the canonical machine learning tasks of classification and regression. Comparing our implementation of AdaGrad with the SGD scheduler currently available in Spark's Machine Learning Library (MLlib), we show experimentally that AdaGrad saves time by avoiding manual tuning of the learning-rate hyperparameter, converges quickly, and can even achieve better generalization error.
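For concreteness, a minimal Scala sketch of the technique follows. AdaGrad keeps a running sum of squared gradients per feature and scales a base step size eta by 1/(sqrt(sum) + eps) for each coordinate, i.e. w_i <- w_i - eta * g_i / (sqrt(sum_t g_{t,i}^2) + eps); on Spark, the mini-batch gradient can be summed across the cluster with RDD.treeAggregate (the aggregation pattern MLlib's GradientDescent also uses) and the update applied on the driver. This is a sketch under stated assumptions, not the paper's implementation: the names AdaGradSketch, adagradStep, and train are hypothetical, and a least-squares gradient is assumed for illustration.

```scala
import org.apache.spark.rdd.RDD

object AdaGradSketch {
  // One AdaGrad update (illustrative): hist accumulates squared gradients per
  // feature; each coordinate gets its own step size eta / (sqrt(hist) + eps).
  def adagradStep(w: Array[Double], g: Array[Double], hist: Array[Double],
                  eta: Double = 0.1, eps: Double = 1e-8): Unit = {
    var i = 0
    while (i < w.length) {
      hist(i) += g(i) * g(i)
      w(i) -= eta / (math.sqrt(hist(i)) + eps) * g(i)
      i += 1
    }
  }

  // Hypothetical driver loop for least-squares regression on an RDD of
  // (label, features) pairs: broadcast the weights, sum the per-example
  // gradients (w.x - y) * x across partitions, then update on the driver.
  def train(data: RDD[(Double, Array[Double])], dim: Int, iters: Int): Array[Double] = {
    val w = Array.fill(dim)(0.0)
    val hist = Array.fill(dim)(0.0)
    for (_ <- 1 to iters) {
      val bw = data.context.broadcast(w)
      val grad = data.treeAggregate(Array.fill(dim)(0.0))(
        (acc, point) => {
          val (y, x) = point
          var dot = 0.0
          var i = 0
          while (i < dim) { dot += bw.value(i) * x(i); i += 1 }
          val err = dot - y // residual of the linear prediction
          i = 0
          while (i < dim) { acc(i) += err * x(i); i += 1 }
          acc
        },
        (a, b) => {
          var i = 0
          while (i < dim) { a(i) += b(i); i += 1 }
          a
        }
      )
      adagradStep(w, grad, hist)
      bw.destroy()
    }
    w
  }
}
```

In practice one would also sample a mini-batch per iteration (e.g. with RDD.sample) and normalize the summed gradient by the batch size; both are omitted here for brevity.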
Pages: 2828-2830
Page count: 3
Related Papers (50 in total)
  • [21] Thuong-Cang Phan; Anh-Cang Phan; Thi-To-Quyen Tran; Ngoan-Thanh Trieu. Efficient Processing of Recursive Joins on Large-Scale Datasets in Spark. Advanced Computational Methods for Knowledge Engineering (ICCSAMA 2019), 2020, 1121: 391-402.
  • [22] Sherar, Matthew; Zulkernine, Farhana. Particle Swarm Optimization for Large-Scale Clustering on Apache Spark. 2017 IEEE Symposium Series on Computational Intelligence (SSCI), 2017: 801-808.
  • [23] Phan, A.-C.; Phan, T.-C.; Trieu, T.-N.; Tran, T.-T.-Q. A Theoretical and Experimental Comparison of Large-Scale Join Algorithms in Spark. SN Computer Science, 2021, 2(5).
  • [24] Liu, Fang; Zhong, Hao; Li, Si-Han. Accelerating Relevance Vector Machine for Large-Scale Data on Spark. 4th Annual International Conference on Information Technology and Applications (ITA 2017), 2017, 12.
  • [25] Zeidan, Ayman; Lagerspetz, Eemil; Zhao, Kai; Nurmi, Petteri; Tarkoma, Sasu; Vo, Huy T. GeoMatch: Efficient Large-scale Map Matching on Apache Spark. ACM/IMS Transactions on Data Science, 2020, 1(3).
  • [26] Barbosa, BNF; Yamim, AC; Augustin, I; da Silva, LC; Geyer, CFR; Barbosa, JLV. Learning in a large-scale pervasive environment. Fourth Annual IEEE International Conference on Pervasive Computing and Communications Workshops, Proceedings, 2006: 226+.
  • [27] Leskovec, Jure. Large-scale Graph Representation Learning. 2017 IEEE International Conference on Big Data (Big Data), 2017: 4.
  • [28] Korattikara, Anoop; Chen, Yutian; Welling, Max. Sequential Tests for Large-Scale Learning. Neural Computation, 2016, 28(1): 45-70.
  • [29] Branson, Steve; Beijbom, Oscar; Belongie, Serge. Efficient Large-Scale Structured Learning. 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013: 1806-1813.
  • [30] Large-scale SVD and manifold learning. Microtome Publishing, 14.