Accelerating DNN Architecture Search at Scale Using Selective Weight Transfer

Cited by: 4
Authors
Liu, Hongyuan [1 ]
Nicolae, Bogdan [2 ]
Di, Sheng [2 ]
Cappello, Franck [2 ]
Jog, Adwait [1 ]
Affiliations
[1] William & Mary, Williamsburg, VA 23185 USA
[2] Argonne Natl Lab, Lemont, IL USA
Keywords
Deep Learning; Neural Architecture Search; Checkpointing;
DOI
10.1109/Cluster48925.2021.00051
CLC classification
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
Deep learning applications are rapidly gaining traction both in industry and in scientific computing. Unsurprisingly, there has been significant interest in adopting deep learning at very large scale on supercomputing infrastructures for a variety of scientific applications. A key issue in this context is how to find a model architecture suitable for solving the problem; we call this the neural architecture search (NAS) problem. Over time, many automated approaches have been proposed that can explore a large number of candidate models. However, this remains a time-consuming and resource-intensive process: the candidates are often trained from scratch for a small number of epochs in order to obtain a set of top-K best performers, which are then fully trained in a second phase. To address this problem, we propose a novel method that leverages checkpoints of previously discovered candidates to accelerate NAS. Based on the observation that the candidates feature high structural similarity, we propose that new candidates need not be trained starting from random weights, but rather from the weights of similar layers of previously evaluated candidates. Thanks to this approach, the convergence of the candidate models can be significantly accelerated, producing candidates that are statistically better on the objective metrics. Furthermore, once the top-K models are identified, our approach provides a significant speed-up (1.4x to 1.5x on average) for the full training.
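The warm-start idea described above can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: the function names, the dict-based weight representation, and the name-plus-shape matching heuristic are all assumptions for the sake of the example. Layers of a new NAS candidate that match a previously trained (donor) candidate inherit its checkpointed weights; all other layers fall back to random initialization.

```python
import random

def init_weights(shape):
    # Hypothetical random initializer standing in for a framework's default init.
    n = 1
    for d in shape:
        n *= d
    return [random.gauss(0.0, 0.1) for _ in range(n)]

def selective_transfer(new_arch, donor_weights, donor_shapes):
    """Warm-start a new NAS candidate from a previously evaluated one.

    new_arch:      dict layer_name -> weight shape of the new candidate
    donor_weights: dict layer_name -> flat weight list from the donor checkpoint
    donor_shapes:  dict layer_name -> weight shape of the donor
    Layers whose name and shape both match the donor reuse its trained
    weights; every other layer is randomly initialized as usual.
    """
    weights, transferred = {}, []
    for name, shape in new_arch.items():
        if name in donor_weights and donor_shapes.get(name) == shape:
            weights[name] = list(donor_weights[name])  # copy trained weights
            transferred.append(name)
        else:
            weights[name] = init_weights(shape)        # train from scratch
    return weights, transferred

# Example: the new candidate shares conv1 with the donor but widens the rest.
donor_shapes = {"conv1": (3, 16), "fc": (16, 10)}
donor_weights = {"conv1": [0.5] * 48, "fc": [0.1] * 160}
new_arch = {"conv1": (3, 16), "conv2": (16, 32), "fc": (32, 10)}
weights, transferred = selective_transfer(new_arch, donor_weights, donor_shapes)
# transferred -> ["conv1"]; conv2 and fc start from random weights.
```

In a real framework this corresponds to partially loading a checkpoint into a structurally similar model, so only the mismatched layers pay the cost of training from random initialization.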
Pages: 82-93
Page count: 12
Related papers
50 records in total
  • [1] EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search
    Jiemin FANG
    Yukang CHEN
    Xinbang ZHANG
    Qian ZHANG
    Chang HUANG
    Gaofeng MENG
    Wenyu LIU
    Xinggang WANG
    Science China (Information Sciences), 2021, 64 (09) : 103 - 115
  • [2] EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search
    Fang, Jiemin
    Chen, Yukang
    Zhang, Xinbang
    Zhang, Qian
    Huang, Chang
    Meng, Gaofeng
    Liu, Wenyu
    Wang, Xinggang
    SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (09)
  • [3] EAT-NAS: elastic architecture transfer for accelerating large-scale neural architecture search
    Jiemin Fang
    Yukang Chen
    Xinbang Zhang
    Qian Zhang
    Chang Huang
    Gaofeng Meng
    Wenyu Liu
    Xinggang Wang
    Science China Information Sciences, 2021, 64
  • [4] Accelerating DNN Training Through Selective Localized Learning
    Krithivasan, Sarada
    Sen, Sanchari
    Venkataramani, Swagath
    Raghunathan, Anand
    FRONTIERS IN NEUROSCIENCE, 2022, 15
  • [5] Optimal DNN architecture search using Bayesian Optimization Hyperband for arrhythmia detection
    Han, Seungwoo
    Eom, Heesang
    Kim, Juhyeong
    Park, Cheolsoo
    2020 IEEE WIRELESS POWER TRANSFER CONFERENCE (WPTC), 2020, : 357 - 360
  • [6] Accelerating multi-objective neural architecture search by random-weight evaluation
    Shengran Hu
    Ran Cheng
    Cheng He
    Zhichao Lu
    Jing Wang
    Miao Zhang
    Complex & Intelligent Systems, 2023, 9 : 1183 - 1192
  • [7] Accelerating multi-objective neural architecture search by random-weight evaluation
    Hu, Shengran
    Cheng, Ran
    He, Cheng
    Lu, Zhichao
    Wang, Jing
    Zhang, Miao
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (02) : 1183 - 1192
  • [8] TENG: A General-Purpose and Efficient Processor Architecture for Accelerating DNN
    Zhang, Zekun
    Cai, Yujie
    Liao, Tianjiao
    Xu, Chengyu
    Jiao, Xin
    2024 IEEE 6TH INTERNATIONAL CONFERENCE ON AI CIRCUITS AND SYSTEMS, AICAS 2024, 2024, : 149 - 153
  • [9] Composable Workflow for Accelerating Neural Architecture Search Using In Situ Analytics for Protein Classification
    Channing, Georgia
    Patel, Ria
    Olaya, Paula
    Rorabaugh, Ariel Keller
    Miyashita, Osamu
    Caino-Lores, Silvina
    Schuman, Catherine
    Tama, Florence
    Taufer, Michela
    PROCEEDINGS OF THE 52ND INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2023, 2023, : 756 - 765
  • [10] A Unified Architecture for Accelerating Distributed DNN Training in Heterogeneous GPU/CPU Clusters
    Jiang, Yimin
    Zhu, Yibo
    Lan, Chang
    Yi, Bairen
    Cui, Yong
    Guo, Chuanxiong
    PROCEEDINGS OF THE 14TH USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDI '20), 2020, : 463 - 479