Efficient Resource-Aware Convolutional Neural Architecture Search for Edge Computing with Pareto-Bayesian Optimization

Cited by: 0
Authors
Yang, Zhao [1]
Zhang, Shengbing [1]
Li, Ruxu [1]
Li, Chuxi [1]
Wang, Miao [1]
Wang, Danghui [1]
Zhang, Meng [1]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
edge computing; neural architecture search; latency profiling model; Pareto-Bayesian optimization;
DOI
10.3390/s21020444
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
With the development of deep learning and edge computing, combining the two can make artificial intelligence ubiquitous. Because edge devices have constrained computation resources, research on on-device deep learning focuses not only on model accuracy but also on model efficiency, for example, inference latency. There have been many attempts to optimize existing deep learning models so that they can be deployed on edge devices and meet specific application requirements while maintaining high accuracy. Such work requires professional knowledge and a large number of experiments, which limits the customization of neural networks for varied devices and application scenarios. To reduce human intervention in designing and optimizing neural network structures, multi-objective neural architecture search methods have been proposed that automatically search for networks with high accuracy that also satisfy given hardware performance requirements. However, current methods commonly use accuracy and inference latency as performance indicators during the search and sample numerous network structures to obtain the required network. Without the search objectives regulating the search direction, a large number of useless networks are generated during the search, which greatly reduces search efficiency. Therefore, this paper proposes an efficient resource-aware search method. First, a network inference consumption profiling model is established for any specific device; it directly provides the resource consumption of each operation in the network structure and the inference latency of the entire sampled network. Next, building on Bayesian search, a resource-aware Pareto Bayesian search is proposed, in which accuracy and inference latency are set as constraints to regulate the search direction. With a clearer search direction, overall search efficiency improves. Furthermore, a cell-based structure and lightweight operations are applied to optimize the search space and further enhance search efficiency. The experimental results demonstrate that, with this method, the inference latency of the searched network structure is reduced by 94.71% without sacrificing accuracy, while search efficiency increases by 18.18%.
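As a rough illustration of two ideas named in the abstract, the sketch below estimates a sampled network's latency from a per-operation lookup table and then keeps only the candidates that are Pareto-optimal in accuracy and latency. It is not the authors' implementation: the operation names, the latency values, the additive latency assumption, and the helpers `network_latency`, `dominates`, and `pareto_front` are all hypothetical, and the Bayesian surrogate that would propose new candidates is omitted.

```python
from typing import Dict, List, NamedTuple, Tuple


class Candidate(NamedTuple):
    name: str
    ops: List[str]    # operations in the sampled network (hypothetical names)
    accuracy: float   # validation accuracy, higher is better


# Hypothetical per-operation latencies (ms) profiled on one target device.
OP_LATENCY_MS: Dict[str, float] = {
    "conv3x3": 1.8,
    "sep_conv3x3": 0.6,
    "max_pool3x3": 0.2,
    "skip": 0.0,
}


def network_latency(ops: List[str], table: Dict[str, float] = OP_LATENCY_MS) -> float:
    """Estimate network latency as the sum of per-operation latencies.

    Assumption: latency is additive across operations.
    """
    return sum(table[op] for op in ops)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """True if score a = (accuracy, latency) Pareto-dominates score b."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate dominates."""
    scored = [(c, (c.accuracy, network_latency(c.ops))) for c in candidates]
    return [c for c, s in scored if not any(dominates(t, s) for _, t in scored)]


CANDIDATES = [
    Candidate("A", ["conv3x3", "conv3x3"], 0.94),
    Candidate("B", ["sep_conv3x3", "sep_conv3x3"], 0.93),
    Candidate("C", ["conv3x3", "max_pool3x3"], 0.90),
    Candidate("D", ["skip", "max_pool3x3"], 0.80),
]

if __name__ == "__main__":
    for c in pareto_front(CANDIDATES):
        print(c.name, c.accuracy, network_latency(c.ops))
```

In this toy data, candidate C is both less accurate and slower than B, so it is dropped; a search that only samples near the remaining front would avoid spending evaluations on such dominated networks.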
Pages: 1-20
Page count: 20
Related papers
50 in total
  • [1] Efficient resource-aware convolutional neural architecture search for edge computing with Pareto-Bayesian optimization
    Yang, Zhao
    Zhang, Shengbing
    Li, Ruxu
    Li, Chuxi
    Wang, Miao
    Wang, Danghui
    Zhang, Meng
    [J]. Sensors (Switzerland), 2021, 21 (02): 1-20
  • [2] Efficient Resource-Aware Neural Architecture Search with a Neuro-Symbolic Approach
    Bellodi, Elena
    Bertozzi, Davide
    Bizzarri, Alice
    Favalli, Michele
    Fraccaroli, Michele
    Zese, Riccardo
    [J]. 2023 IEEE 16TH INTERNATIONAL SYMPOSIUM ON EMBEDDED MULTICORE/MANY-CORE SYSTEMS-ON-CHIP, MCSOC, 2023: 171-178
  • [3] Efficient Resource-aware Neural Architecture Search with Dynamic Adaptive Network Sampling
    Yang, Zhao
    Sun, Qingshuang
    [J]. 2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021
  • [4] Resource-Aware Workload Orchestration for Edge Computing
    Babirye, Susan
    Serugunda, Jonathan
    Okello, Dorothy
    Mwanje, Stephen
    [J]. 2020 28TH TELECOMMUNICATIONS FORUM (TELFOR), 2020: 117-120
  • [5] Resource-Aware Feature Extraction in Mobile Edge Computing
    Ding, Chuntao
    Zhou, Ao
    Liu, Xiulong
    Ma, Xiao
    Wang, Shangguang
    [J]. IEEE TRANSACTIONS ON MOBILE COMPUTING, 2022, 21 (01): 321-331
  • [6] IoT Resource-aware Orchestration Framework for Edge Computing
    Agrawal, Niket
    Rellermeyer, Jan
    Ding, Aaron Yi
    [J]. CONEXT'19 COMPANION: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON EMERGING NETWORKING EXPERIMENTS AND TECHNOLOGIES, 2019: 62-64
  • [7] MCEA: A Resource-Aware Multicore CGRA Architecture for the Edge
    Korol, Guilherme
    Jordan, Michael Guilherme
    Brandalero, Marcelo
    Huebner, Michael
    Rutzig, Mateus Beck
    Schneider Beck, Antonio Carlos
    [J]. 2020 30TH INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL), 2020: 33-39
  • [8] RAPDARTS: Resource-Aware Progressive Differentiable Architecture Search
    Green, Sam
    Vineyard, Craig M.
    Helinski, Ryan
    Koc, Cetin Kaya
    [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
  • [9] MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search
    Cioflan, Cristian
    Timofte, Radu
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021: 14353-14359
  • [10] RAMOS: A Resource-Aware Multi-Objective System for Edge Computing
    Gedawy, Hend
    Habak, Karim
    Harras, Khaled A.
    Hamdi, Mounir
    [J]. IEEE TRANSACTIONS ON MOBILE COMPUTING, 2021, 20 (08): 2654-2670