A Survey on Large-Scale Machine Learning

Cited by: 50
Authors
Wang, Meng [1, 2]
Fu, Weijie [1, 2]
He, Xiangnan [3]
Hao, Shijie [1, 2]
Wu, Xindong [1, 2]
Affiliations
[1] Hefei Univ Technol, Key Lab Knowledge Engn Big Data, Minist Educ, Hefei 230601, Anhui, Peoples R China
[2] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230601, Anhui, Peoples R China
[3] Univ Sci & Technol China, Hefei 230031, Anhui, Peoples R China
Keywords
Machine learning; Computational modeling; Optimization; Predictive models; Big Data; Computational complexity; Large-scale machine learning; efficient machine learning; big data analysis; efficiency; survey; GRAPH CONSTRUCTION; BIG DATA; OPTIMIZATION; ALGORITHMS;
DOI
10.1109/TKDE.2020.3015777
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions, and has been widely used in real-world applications such as text mining, visual classification, and recommender systems. However, most sophisticated machine learning approaches incur huge time costs when operating on large-scale data. This issue calls for Large-scale Machine Learning (LML), which aims to learn patterns from big data efficiently while retaining comparable performance. In this paper, we offer a systematic survey of existing LML methods to provide a blueprint for future developments in this area. We first divide these LML methods according to how they improve scalability: 1) model simplification to reduce computational complexity, 2) optimization approximation to improve computational efficiency, and 3) computation parallelism to improve computational capability. Then we categorize the methods in each perspective according to their targeted scenarios and introduce representative methods in line with their intrinsic strategies. Lastly, we analyze their limitations and discuss promising future directions and open issues.
Pages: 2574 - 2594
Page count: 21
Related papers
50 in total
  • [21] Introduction to Special Issue on Large-Scale Machine Learning
    Hsu, Chun-Nan
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2011, 2 (03)
  • [22] A review of Nystrom methods for large-scale machine learning
    Sun, Shiliang
    Zhao, Jing
    Zhu, Jiang
    INFORMATION FUSION, 2015, 26 : 36 - 48
  • [23] Large-scale machine learning for metagenomics sequence classification
    Vervier, Kevin
    Mahe, Pierre
    Tournoud, Maud
    Veyrieras, Jean-Baptiste
    Vert, Jean-Philippe
    BIOINFORMATICS, 2016, 32 (07) : 1023 - 1032
  • [24] Large-Scale Strategic Games and Adversarial Machine Learning
    Alpcan, Tansu
    Rubinstein, Benjamin I. P.
    Leckie, Christopher
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 4420 - 4426
  • [25] Dynamic Control Flow in Large-Scale Machine Learning
    Yu, Yuan
    Abadi, Martin
    Barham, Paul
    Brevdo, Eugene
    Burrows, Mike
    Davis, Andy
    Dean, Jeff
    Ghemawat, Sanjay
    Harley, Tim
    Hawkins, Peter
    Isard, Michael
    Kudlur, Manjunath
    Monga, Rajat
    Murray, Derek
    Zheng, Xiaoqiang
    EUROSYS '18: PROCEEDINGS OF THE THIRTEENTH EUROSYS CONFERENCE, 2018,
  • [26] Large-Scale Machine Learning with Stochastic Gradient Descent
    Bottou, Leon
    COMPSTAT'2010: 19TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL STATISTICS, 2010, : 177 - 186
  • [27] Large-Scale Machine Learning Approaches for Molecular Biophysics
    Ramanathan, Arvind
    Chennubhotla, Chakra S.
    Agarwal, Pratul K.
    Stanley, Christopher B.
    BIOPHYSICAL JOURNAL, 2015, 108 (02) : 370A - 370A
  • [28] Large-Scale Machine Learning at Verizon: Theory and Applications
    Srivastava, Ashok
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, : 417 - 417
  • [29] Compressed linear algebra for large-scale machine learning
    Elgohary, Ahmed
    Boehm, Matthias
    Haas, Peter J.
    Reiss, Frederick R.
    Reinwald, Berthold
    VLDB JOURNAL, 2018, 27 (05) : 719 - 744
  • [30] Quick extreme learning machine for large-scale classification
    Albtoush, Audi
    Fernandez-Delgado, Manuel
    Cernadas, Eva
    Barro, Senen
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (08) : 5923 - 5938