CURIOUS: Efficient Neural Architecture Search Based on a Performance Predictor and Evolutionary Search

Cited by: 4
Authors
Hassantabar, Shayan [1 ]
Dai, Xiaoliang [2 ]
Jha, Niraj K. [1 ]
Affiliations
[1] Princeton Univ, Dept Elect & Comp Engn, Princeton, NJ 08544 USA
[2] Facebook Mobile Vis, Campton, CA USA
Keywords
Compression; convolutional neural network (CNN); deep learning; dimensionality reduction; long short-term memory (LSTM); neural architecture search (NAS); transformer;
DOI
10.1109/TCAD.2022.3148202
CLC Number
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Neural networks (NNs) have been successfully deployed in various applications of artificial intelligence. However, the architectural design of NNs remains a challenging problem because an architecture is specified by a large number of hyperparameters, and the space of possible architectures therefore grows exponentially. A trial-and-error design approach is very time consuming and leads to suboptimal architectures. In addition, approaches such as reinforcement-learning-based neural architecture search and differentiable gradient-based architecture search often incur huge computational costs or significant memory requirements. To address these challenges, we propose the CURIOUS NN synthesis methodology. It uses a performance predictor to efficiently navigate the architectural search space with an evolutionary search process. The predictor is built using quasi-Monte Carlo sampling, boosted decision tree regression, and an intelligent iterative sampling method, and is designed to be sample efficient. CURIOUS starts from a base architecture and explores the architectural search space to obtain the highest-performing variant of the base architecture. This search framework is general and covers all important NN architecture types, e.g., feedforward NNs (FFNNs), convolutional NNs (CNNs), recurrent NNs (RNNs), and transformers. We evaluate the performance of CURIOUS on various datasets and base architectures and demonstrate significant performance improvements over the baseline architectures. For the MNIST dataset, our CNN architecture achieves an error rate of 0.66% with 8.6x fewer parameters than the LeNet-5 baseline. For the CIFAR-10 dataset, we use the ResNet architectures and residual networks with Shake-Shake regularization as the baselines. Our synthesized ResNet-18 improves accuracy by 2.52% over the original ResNet-18, 1.74% over ResNet-101, and 0.16% over ResNet-1001, while requiring a comparable number of parameters and floating-point operations to the original ResNet-18. This result shows that, instead of simply adding layers to increase accuracy, an alternative is to use a better NN architecture with a small number of layers. In addition, CURIOUS achieves an error rate of just 2.69% with a variant of the residual architecture with Shake-Shake regularization. We also use the set of optimized hyperparameters found for ResNet-18 on the CIFAR-10 dataset to train and evaluate the model on the ImageNet dataset, and show a 3.43% (1.83%) improvement in the top-1 (top-5) error rate relative to the original ResNet-18 model. CURIOUS also obtains the highest accuracy for various other FFNNs that are geared toward edge devices and IoT sensors. In addition, we use CURIOUS to search for deep RNN architectures for the SICK dataset for sentence-similarity evaluation; it achieves a mean-squared error of only 0.2060, improving upon the base network's performance without the need to stack multiple long short-term memories (LSTMs). Finally, we use CURIOUS to search for a better NN classifier for the sentiment analysis task on the Stanford Sentiment Treebank dataset using a pretrained BERT model and again demonstrate improvements in performance.
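The abstract describes the CURIOUS loop only at a high level. The toy sketch below (Python, using scipy's Sobol sampler and scikit-learn's GradientBoostingRegressor) illustrates the general pattern it names: quasi-Monte Carlo seeding of the search space, a boosted-decision-tree performance predictor, and predictor-ranked evolutionary mutation with iterative relabeling. Every concrete detail here (the bounds, the mutate helper, the toy objective standing in for actual candidate training) is an illustrative assumption, not the paper's implementation.

import numpy as np
from scipy.stats import qmc
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
# Hypothetical hyperparameter ranges (channels, depth, kernel-size index).
BOUNDS = np.array([[16.0, 256.0],
                   [2.0, 20.0],
                   [1.0, 8.0]])

def evaluate_architecture(x):
    # Stand-in for training the candidate NN and measuring validation accuracy.
    return -np.sum((x - BOUNDS.mean(axis=1)) ** 2)

# 1) Quasi-Monte Carlo (Sobol) sampling of the search space for the initial set.
sampler = qmc.Sobol(d=len(BOUNDS), seed=0)
X = qmc.scale(sampler.random_base2(m=5), BOUNDS[:, 0], BOUNDS[:, 1])
y = np.array([evaluate_architecture(x) for x in X])

# 2) Boosted decision-tree regressor as the performance predictor.
predictor = GradientBoostingRegressor().fit(X, y)

def mutate(x):
    # Perturb one hyperparameter, clipped to the search-space bounds.
    child = x.copy()
    i = rng.integers(len(BOUNDS))
    step = rng.normal(scale=0.1 * (BOUNDS[i, 1] - BOUNDS[i, 0]))
    child[i] = np.clip(child[i] + step, BOUNDS[i, 0], BOUNDS[i, 1])
    return child

# 3) Evolutionary search: mutate promising parents, rank children with the
#    predictor, evaluate only the top candidates, and retrain the predictor
#    on the newly labeled samples (the iterative-sampling step).
population = X[np.argsort(y)[-8:]]
for _ in range(10):
    children = np.array([mutate(p) for p in population for _ in range(4)])
    top = children[np.argsort(predictor.predict(children))[-8:]]
    y_top = np.array([evaluate_architecture(x) for x in top])
    X, y = np.vstack([X, top]), np.concatenate([y, y_top])
    predictor.fit(X, y)
    population = X[np.argsort(y)[-8:]]

print("best hyperparameters:", X[np.argmax(y)], "score:", y.max())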
Pages: 4975 - 4990
Page count: 16
Related Papers (50 total)
  • [1] Efficient evolutionary neural architecture search based on hybrid search space
    Gong, Tao
    Ma, Yongjie
    Xu, Yang
    Song, Changwei
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (08) : 3313 - 3326
  • [2] Guided evolutionary neural architecture search with efficient performance estimation
    Lopes, Vasco
    Santos, Miguel
    Degardin, Bruno
    Alexandre, Luis A.
    NEUROCOMPUTING, 2024, 584
  • [3] Evolutionary Neural Architecture Search with Predictor of Ranking-Based Score
    Jiang, Peng-Cheng
    Xue, Yu
    Jisuanji Xuebao/Chinese Journal of Computers, 2024, 47 (11) : 2522 - 2535
  • [4] Score Predictor-Assisted Evolutionary Neural Architecture Search
    Jiang, Pengcheng
    Xue, Yu
    Neri, Ferrante
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2025
  • [5] PRE-NAS: Evolutionary Neural Architecture Search With Predictor
    Peng, Yameng
    Song, Andy
    Ciesielski, Vic
    Fayek, Haytham M.
    Chang, Xiaojun
    IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2023, 27 (01) : 26 - 36
  • [6] Neural predictor based quantum architecture search
    Zhang, Shi-Xin
    Hsieh, Chang-Yu
    Zhang, Shengyu
    Yao, Hong
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2021, 2 (04)
  • [7] Evolutionary Neural Architecture Search with Semi-supervised Accuracy Predictor
    Xiao, Songyi
    Zhao, Bo
    Liu, Derong
    2022 4TH INTERNATIONAL CONFERENCE ON CONTROL AND ROBOTICS, ICCR, 2022, : 28 - 32
  • [8] Fast Evolutionary Neural Architecture Search by Contrastive Predictor with Linear Regions
    Peng, Yameng
    Song, Andy
    Ciesielski, Vic
    Fayek, Haytham M.
    Chang, Xiaojun
    PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, GECCO 2023, 2023, : 1257 - 1266
  • [9] Efficient evolutionary neural architecture search by modular inheritable crossover
    He, Cheng
    Tan, Hao
    Huang, Shihua
    Cheng, Ran
    SWARM AND EVOLUTIONARY COMPUTATION, 2021, 64
  • [10] Efficient Self-learning Evolutionary Neural Architecture Search
    Qiu, Zhengzhong
    Bi, Wei
    Xu, Dong
    Guo, Hua
    Ge, Hongwei
    Liang, Yanchun
    Lee, Heow Pueh
    Wu, Chunguo
    APPLIED SOFT COMPUTING, 2023, 146