Neural architecture search via standard machine learning methodologies

Cited by: 11
Authors
Franchini, Giorgia [1 ,2 ]
Ruggiero, Valeria [1 ]
Porta, Federica [2 ]
Zanni, Luca [2 ]
Affiliations
[1] Univ Ferrara, Dept Math & Comp Sci, Via Machiavelli 30, I-44121 Ferrara, FE, Italy
[2] Univ Modena & Reggio Emilia, Dept Phys Informat & Math, Via Campi 213-B, I-41121 Modena, MO, Italy
Source
MATHEMATICS IN ENGINEERING | 2022, Vol. 5, Issue 1
Keywords
convolutional neural network; neural architecture search; automated machine learning; support vector machines for regression; ensemble methods; NAS-Bench dataset;
DOI
10.3934/mine.2023012
Chinese Library Classification
O1 [Mathematics]
Discipline Classification Code
0701; 070101
Abstract
In the context of deep learning, the most expensive computational phase is the full training of the learning methodology. Indeed, its effectiveness depends on the choice of proper values for the so-called hyperparameters, namely the parameters that are not trained during the learning process, and such a selection typically requires an extensive numerical investigation involving a significant number of experimental trials. The aim of the paper is to investigate how to choose the hyperparameters related both to the architecture of a Convolutional Neural Network (CNN), such as the number of filters and the kernel size at each convolutional layer, and to the optimisation algorithm employed to train the CNN itself, such as the steplength, the mini-batch size and the potential adoption of variance reduction techniques. The main contribution of the paper is the introduction of an automatic Machine Learning technique to set these hyperparameters so that a measure of the CNN performance can be optimised. In particular, given a set of values for the hyperparameters, we propose a low-cost strategy to predict the performance of the corresponding CNN, based on its behaviour after only a few steps of the training process. To achieve this goal, we generate a dataset whose input samples are a limited number of hyperparameter configurations, together with the corresponding CNN performance measures obtained after only a few steps of the CNN training process, while the label of each input sample is the performance corresponding to a complete training of the CNN. This dataset is used as a training set for Support Vector Machines for Regression and/or Random Forest techniques, which predict the performance of the considered learning methodology given its performance at the initial iterations of its learning process.
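The predictor idea described above can be sketched as follows. This is not the authors' code: the synthetic data, the feature layout, and all variable names are illustrative assumptions; only the overall scheme (hyperparameters plus early-training accuracy as input, fully trained accuracy as label, SVR and Random Forest as regressors) follows the abstract.

```python
# Sketch: regress the fully trained CNN performance on the hyperparameter
# configuration plus the accuracy observed after a few training steps.
# All data below is synthetic and purely illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_configs = 200

# Columns: number of filters, kernel size, steplength, mini-batch size
# (all rescaled to [0, 1] for simplicity).
hyperparams = rng.uniform(0.0, 1.0, size=(n_configs, 4))
# Accuracy after only a few training steps (synthetic stand-in).
early_acc = 0.3 + 0.4 * hyperparams[:, 2] + 0.05 * rng.standard_normal(n_configs)
X = np.column_stack([hyperparams, early_acc])
# Label: accuracy after the complete training (synthetic stand-in).
y = np.clip(early_acc + 0.2 * hyperparams[:, 0], 0.0, 1.0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X_tr, y_tr)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print(f"SVR R^2: {svr.score(X_te, y_te):.3f}")
print(f"Random Forest R^2: {rf.score(X_te, y_te):.3f}")
```

In practice each row would come from an actual short training run of a CNN, and each label from one complete training run, so building the dataset is the dominant cost of the approach.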
Furthermore, by a probabilistic exploration of the hyperparameter space, we are able to find, at quite low cost, the CNN hyperparameter setting which provides the optimal performance. The results of an extensive numerical experimentation carried out on CNNs, together with the use of our performance predictor on NAS-Bench-101, show that the proposed methodology for hyperparameter setting is very promising.
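The low-cost exploration step can be sketched as a random search scored by a cheap predictor. Again this is an illustrative assumption, not the paper's implementation: `predicted_performance` is a toy stand-in for the trained SVR/Random Forest regressor, and the hyperparameter ranges are made up.

```python
# Sketch: sample many hyperparameter configurations, score each one with a
# cheap predictor instead of a full training run, and keep the best.
import random

def predicted_performance(config):
    # Toy stand-in for the trained regressor: peaks at steplength 0.01,
    # with a mild penalty for large mini-batches.
    return 1.0 - (config["steplength"] - 0.01) ** 2 \
               - 0.001 * config["batch_size"] / 256

def random_search(n_samples=1000, seed=0):
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_samples):
        config = {
            "n_filters": rng.choice([16, 32, 64, 128]),
            "kernel_size": rng.choice([3, 5, 7]),
            "steplength": 10 ** rng.uniform(-4, 0),  # log-uniform sampling
            "batch_size": rng.choice([32, 64, 128, 256]),
        }
        score = predicted_performance(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

best, score = random_search()
print(best, score)
```

Because each candidate is scored by the predictor rather than by a complete training run, thousands of configurations can be screened for roughly the cost of the few short runs used to build the prediction dataset.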
Pages: 1-21 (21 pages)