Multihardware Adaptive Latency Prediction for Neural Architecture Search

Cited: 0
Authors
Lin, Chengmin [1 ]
Yang, Pengfei [1 ]
Wang, Quan [1 ]
Guo, Yitong [1 ]
Wang, Zhenyi [1 ]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian 710126, Peoples R China
Source
IEEE INTERNET OF THINGS JOURNAL, 2025, Vol. 12, No. 3
Keywords
Hardware; Predictive models; Adaptation models; Training; Accuracy; Network architecture; Computer architecture; Optimization; Performance evaluation; Data models; Dynamic sample allocation; few-shot learning; hardware-aware; latency predictor; neural architecture search (NAS); representative sample sampling; networks
DOI
10.1109/JIOT.2024.3480990
CLC classification
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
In hardware-aware neural architecture search (NAS), accurately assessing a model's inference efficiency is crucial for guiding the search. Traditional approaches, which measure large numbers of samples to train proxy models, are impractical across varied platforms because of the extensive resources needed to remeasure latencies and rebuild models for each platform. To address this challenge, we propose a multihardware-aware NAS method that improves the generalizability of proxy models across platforms while reducing the required sample size. Our method introduces a multihardware adaptive latency prediction (MHLP) model that combines one-hot encoding of hardware parameters with multihead attention to capture the intricate interplay between hardware attributes and network-architecture features. In addition, we implement a two-stage sampling mechanism based on probability-density weighting to ensure the representativeness and diversity of the sample set. A dynamic sample allocation mechanism adjusts the sample size according to the initial model state, providing stronger data support for devices with large prediction deviations. Evaluations on NAS benchmarks demonstrate the MHLP predictor's excellent generalization accuracy using only 10 samples, guiding the NAS search process to identify optimal network architectures.
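The core MHLP idea in the abstract, one-hot hardware encoding fused with architecture features through multihead attention, can be illustrated with a minimal numpy sketch. The dimensions, token layout, and linear readout head below are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

rng = np.random.default_rng(0)

def one_hot(index, num_classes):
    """One-hot encode a categorical hardware parameter (e.g., platform ID)."""
    v = np.zeros(num_classes)
    v[index] = 1.0
    return v

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multihead_attention(tokens, num_heads, W_q, W_k, W_v):
    """Minimal scaled dot-product multihead attention over a token sequence.

    tokens: (seq_len, d_model); each weight matrix: (d_model, d_model).
    """
    seq_len, d_model = tokens.shape
    d_head = d_model // num_heads
    # Project and split into heads: (num_heads, seq_len, d_head).
    Q = (tokens @ W_q).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    K = (tokens @ W_k).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    V = (tokens @ W_v).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)
    scores = softmax(Q @ K.transpose(0, 2, 1) / np.sqrt(d_head), axis=-1)
    # Merge heads back to (seq_len, d_model).
    return (scores @ V).transpose(1, 0, 2).reshape(seq_len, d_model)

# Hypothetical layout: 3 hardware platform classes, 8-dim feature space.
d_model = 8
hw_token = np.concatenate([one_hot(1, 3), np.zeros(d_model - 3)])  # pad to d_model
arch_tokens = rng.normal(size=(4, d_model))  # e.g. per-layer architecture features
tokens = np.vstack([hw_token, arch_tokens])  # hardware token attends to arch tokens

W_q, W_k, W_v = (rng.normal(scale=0.1, size=(d_model, d_model)) for _ in range(3))
fused = multihead_attention(tokens, num_heads=2, W_q=W_q, W_k=W_k, W_v=W_v)
latency_pred = fused.mean(axis=0) @ rng.normal(size=d_model)  # toy linear head
print(fused.shape)  # (5, 8)
```

In the full method the attention weights would be learned from measured latencies; here they are random, serving only to show how a shared predictor can condition on the hardware encoding rather than being rebuilt per platform.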
Pages: 3385-3398
Page count: 14
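The two-stage, probability-density-weighted sampling described in the abstract can be sketched as follows. The kernel-density estimator, bandwidth, and stage sizes are illustrative assumptions; the idea is that a first uniform stage seeds a density estimate, and a second stage up-weights low-density regions so the small sample set stays diverse and representative:

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_kde(points, queries, bandwidth=0.5):
    """Estimate the density of each query under a Gaussian KDE on `points`."""
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)

def two_stage_sample(features, n_total, n_stage1):
    """Stage 1: uniform seed sample. Stage 2: density-weighted sampling
    that favors under-represented (low-density) candidates for diversity."""
    n = len(features)
    stage1 = rng.choice(n, size=n_stage1, replace=False)
    density = gaussian_kde(features[stage1], features)
    weights = 1.0 / (density + 1e-9)   # inverse-density weighting
    weights[stage1] = 0.0              # never resample stage-1 picks
    weights /= weights.sum()
    stage2 = rng.choice(n, size=n_total - n_stage1, replace=False, p=weights)
    return np.concatenate([stage1, stage2])

pool = rng.normal(size=(200, 4))       # candidate architecture encodings
picked = two_stage_sample(pool, n_total=10, n_stage1=4)
print(picked)
```

A dynamic allocation step, as in the abstract, would then grow `n_total` for devices where the initial predictor's deviation is large; that policy is not reproduced here.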
Related papers
50 in total
  • [1] Reducing energy consumption of Neural Architecture Search: An inference latency prediction framework
    Lu, Longfei
    Lyu, Bo
    SUSTAINABLE CITIES AND SOCIETY, 2021, 67
  • [2] Prediction of dMRI signals with neural architecture search
    Chen, Haoze
    Zhang, Zhijie
    Jin, Mingwu
    Wang, Fengxiang
    JOURNAL OF NEUROSCIENCE METHODS, 2022, 365
  • [3] TrajectoryNAS: A Neural Architecture Search for Trajectory Prediction
    Sharifi, Ali Asghar
    Zoljodi, Ali
    Daneshtalab, Masoud
    SENSORS, 2024, 24 (17)
  • [4] LATENCY-CONTROLLED NEURAL ARCHITECTURE SEARCH FOR STREAMING SPEECH RECOGNITION
    He, Liqiang
    Feng, Shulin
    Su, Dan
    Yu, Dong
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 62 - 67
  • [5] STOCHASTIC ADAPTIVE NEURAL ARCHITECTURE SEARCH FOR KEYWORD SPOTTING
    Veniat, Tom
    Schwander, Olivier
    Denoyer, Ludovic
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 2842 - 2846
  • [6] Neural architecture search via similarity adaptive guidance
    Xue, Yu
    Zha, Jiajie
    Wahib, Mohamed
    Ouyang, Tinghui
    Wang, Xiao
    APPLIED SOFT COMPUTING, 2024, 162
  • [7] EDANAS: Adaptive Neural Architecture Search for Early Exit Neural Networks
    Gambella, Matteo
    Roveri, Manuel
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [8] Evolutionary neural architecture search on transformers for RUL prediction
    Mo, Hyunho
    Iacca, Giovanni
    MATERIALS AND MANUFACTURING PROCESSES, 2023, 38 (15) : 1881 - 1898
  • [9] Encodings for Prediction-based Neural Architecture Search
    Akhauri, Yash
    Abdelfattah, Mohamed S.
    PROCEEDINGS OF MACHINE LEARNING RESEARCH, 2024, 235 : 740 - 759
  • [10] DGL: Device Generic Latency Model for Neural Architecture Search on Mobile Devices
    Wang, Qinsi
    Zhang, Sihai
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (02) : 1954 - 1967