Cerebro: A Data System for Optimized Deep Learning Model Selection

Cited by: 35
Authors
Nakandala, Supun [1]
Zhang, Yuhao [1]
Kumar, Arun [1]
Affiliations
[1] Univ Calif San Diego, San Diego, CA 92103 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2020 / Vol. 13 / No. 11
Funding
U.S. National Science Foundation;
Keywords
INFERENCE;
DOI
10.14778/3407790.3407816
Chinese Library Classification
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Deep neural networks (deep nets) are revolutionizing many machine learning (ML) applications. But there is a major bottleneck to wider adoption: the pain and resource intensiveness of model selection. This empirical process involves exploring deep net architectures and hyper-parameters, often requiring hundreds of trials. Alas, most ML systems focus on training one model at a time, reducing throughput and raising overall resource costs; some also sacrifice reproducibility. We present Cerebro, a new data system to raise deep net model selection throughput at scale without raising resource costs and without sacrificing reproducibility or accuracy. Cerebro uses a new parallel SGD execution strategy we call model hopper parallelism that hybridizes task- and data-parallelism to mitigate the cons of these prior paradigms and offer the best of both worlds. Experiments on large ML benchmark datasets show that Cerebro offers 3x to 10x runtime savings relative to data-parallel systems like Horovod and Parameter Server and up to 8x memory/storage savings or up to 100x network savings relative to task-parallel systems. Cerebro also supports heterogeneous resources and fault tolerance.
Pages: 2159 - 2173
Page count: 15
Related papers
50 in total
  • [1] Cerebro: A Data System for Optimized Deep Learning Model Selection (vol 13, pg 2159, 2020)
    Nakandala, Supun
    Zhang, Yuhao
    Kumar, Arun
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (06) : 863 - 863
  • [2] CEREBRO: Efficient and Reproducible Model Selection on Deep Learning Systems
    Nakandala, Supun
    Zhang, Yuhao
    Kumar, Arun
    PROCEEDINGS OF THE 3RD INTERNATIONAL WORKSHOP ON DATA MANAGEMENT FOR END-TO-END MACHINE LEARNING, DEEM 2019, 2019,
  • [3] A Feature Optimized Deep Learning Model for Clinical Data Mining
    Wu Tianshu
    Chen Shuyu
    Tian Yingming
    Wu Peng
    CHINESE JOURNAL OF ELECTRONICS, 2020, 29 (03) : 476 - 481
  • [4] A Feature Optimized Deep Learning Model for Clinical Data Mining
    WU Tianshu
    CHEN Shuyu
    TIAN Yingming
    WU Peng
    CHINESE JOURNAL OF ELECTRONICS, 2020, 29 (03) : 476 - 481
  • [5] Deep Optimized Broad Learning System for Applications in Tabular Data Recognition
    Zhang, Wandong
    Yang, Yimin
    Wu, Q. M. Jonathan
    Liu, Tianlong
    IEEE TRANSACTIONS ON CYBERNETICS, 2024, 54 (12) : 7119 - 7132
  • [6] Deep Optimized Broad Learning System for Applications in Tabular Data Recognition
    Zhang, Wandong
    Yang, Yimin
    Wu, Q. M. Jonathan
    Liu, Tianlong
    IEEE TRANSACTIONS ON CYBERNETICS, 2024,
  • [7] Deep Learning Intrusion Detection Model Based on Optimized Imbalanced Network Data
    Zhang, Yan
    Zhang, Hongmei
    Zhang, Xiangli
    Qi, Dongsheng
    2018 IEEE 18TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT), 2018, : 1128 - 1132
  • [8] Study on Feature Selection and Feature Deep Learning Model For Big Data
    Yu, Ping
    Yan, Hui
    2018 3RD INTERNATIONAL CONFERENCE ON SMART CITY AND SYSTEMS ENGINEERING (ICSCSE), 2018, : 792 - 795
  • [9] A weight optimized deep learning model for cluster based intrusion detection system
    Godala, Sravanthi
    Kumar, M. Sunil
    OPTICAL AND QUANTUM ELECTRONICS, 2023, 55 (14)
  • [10] Heart Disease Prediction Model Using Feature Selection and Ensemble Deep Learning with Optimized Weight
    Al-Mahdi, Iman S.
    Darwish, Saad M.
    Madbouly, Magda M.
    CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2025,