Mayo: A Framework for Auto-generating Hardware Friendly Deep Neural Networks

Cited by: 7
Authors
Zhao, Yiren [1 ]
Gao, Xitong [2 ]
Mullins, Robert [1 ]
Xu, Chengzhong [2 ]
Affiliations
[1] Univ Cambridge, Cambridge, England
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council (EPSRC); National Natural Science Foundation of China (NSFC);
Keywords
Deep Neural Network; Pruning; Quantization; Automated Hyperparameter Optimization;
DOI
10.1145/3212725.3212726
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep Neural Networks (DNNs) have proved to be a convenient and powerful tool for a wide range of problems. However, their extensive computational and memory resource requirements hinder the adoption of DNNs in resource-constrained scenarios. Existing compression methods have been shown to significantly reduce the computation and memory requirements of many popular DNNs. These methods, however, remain elusive to non-experts, as they demand extensive manual tuning of hyperparameters. Moreover, the effects of combining various compression techniques remain largely unexplored because of the large design space. To alleviate these challenges, this paper proposes an automated framework, Mayo, which is built on top of TensorFlow and can compress DNNs with minimal human intervention. First, we present overriders, which are recursively compositional and can be configured to effectively compress individual components (e.g. weights, biases, layer computations and gradients) in a DNN. Second, we introduce novel heuristics and a global search algorithm to efficiently optimize hyperparameters. We demonstrate that without any manual tuning, Mayo generates a sparse ResNet-18 that is 5.13x smaller than the baseline with no loss in test accuracy. By composing multiple overriders, our tool produces a sparse 6-bit CIFAR-10 classifier with only 0.16% top-1 accuracy loss and a 34x compression rate. Mayo and all compressed models are publicly available. To our knowledge, Mayo is the first framework that supports overlapping multiple compression techniques and automatically optimizes hyperparameters in them.
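The recursively compositional "overrider" idea from the abstract can be sketched in plain Python. This is an illustrative reconstruction under stated assumptions, not Mayo's actual TensorFlow API: the class names (Pruner, Quantizer, Composed) and parameters (density, bits) are hypothetical, and real overriders operate on TensorFlow tensors rather than Python lists.

```python
class Overrider:
    """Hypothetical sketch of an 'overrider': a transform applied to one
    component of a DNN (e.g. its weights). Not Mayo's real interface."""
    def apply(self, values):
        raise NotImplementedError

class Pruner(Overrider):
    """Magnitude pruning: keep only the largest-magnitude fraction of weights."""
    def __init__(self, density):
        self.density = density  # fraction of weights to keep
    def apply(self, values):
        k = max(1, int(self.density * len(values)))
        threshold = sorted(abs(v) for v in values)[-k]
        return [v if abs(v) >= threshold else 0.0 for v in values]

class Quantizer(Overrider):
    """Fixed-point quantization to a given bit width."""
    def __init__(self, bits):
        self.scale = 2 ** (bits - 1)
    def apply(self, values):
        return [round(v * self.scale) / self.scale for v in values]

class Composed(Overrider):
    """A composition of overriders is itself an overrider, which is how
    techniques such as pruning and quantization can overlap."""
    def __init__(self, *overriders):
        self.overriders = overriders
    def apply(self, values):
        for o in self.overriders:
            values = o.apply(values)
        return values

# Toy weight vector: prune to 60% density, then quantize to 6 bits.
weights = [0.9, -0.05, 0.4, 0.01, -0.7]
compressed = Composed(Pruner(density=0.6), Quantizer(bits=6)).apply(weights)
```

Because a composition is itself an overrider, compositions nest arbitrarily, which is what lets an automated search explore combined compression techniques over a single uniform interface.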
Pages: 25 - 30 (6 pages)
Related Papers
50 records in total
  • [41] VLSI-Friendly Filtering Algorithms for Deep Neural Networks
    Cariow, Aleksandr
    Paplinski, Janusz P.
    Makowska, Marta
    APPLIED SCIENCES-BASEL, 2023, 13 (15):
  • [42] Generating Fake but Realistic Headlines Using Deep Neural Networks
    Dandekar, Ashish
    Zen, Remmy A. M.
    Bressan, Stephane
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2017, PT II, 2017, 10439 : 427 - 440
  • [43] Towards Debugging Deep Neural Networks by Generating Speech Utterances
    Soomro, Bilal
    Kanervisto, Anssi
    Trung Ngo Trong
    Hautamaki, Ville
    INTERSPEECH 2019, 2019, : 3213 - 3217
  • [45] HFPQ: deep neural network compression by hardware-friendly pruning-quantization
    Fan, YingBo
    Pang, Wei
    Lu, ShengLi
    APPLIED INTELLIGENCE, 2021, 51 (10) : 7016 - 7028
  • [46] Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions
    Jagtap, Ameya D.
    Shin, Yeonjong
    Kawaguchi, Kenji
    Karniadakis, George Em
    NEUROCOMPUTING, 2022, 468 : 165 - 180
  • [48] Efficient and hardware-friendly methods to implement competitive learning for spiking neural networks
    Qu, Lianhua
    Zhao, Zhenyu
    Wang, Lei
    Wang, Yong
    NEURAL COMPUTING & APPLICATIONS, 2020, 32 (17): : 13479 - 13490
  • [49] A probabilistic framework for mutation testing in deep neural networks
    Tambon, Florian
    Khomh, Foutse
    Antoniol, Giuliano
    INFORMATION AND SOFTWARE TECHNOLOGY, 2023, 155
  • [50] Optimus: An Operator Fusion Framework for Deep Neural Networks
    Cai, Xuyi
    Wang, Ying
    Zhang, Lei
    ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS, 2023, 22 (01)