QLP: Deep Q-Learning for Pruning Deep Neural Networks

Cited by: 10
Authors
Camci, Efe [1 ]
Gupta, Manas [1 ]
Wu, Min [1 ]
Lin, Jie [1 ]
Affiliations
[1] ASTAR, Inst Infocomm Res I2R, Singapore 138632, Singapore
Keywords
Training; Neural networks; Indexes; Computer architecture; Deep learning; Biological neural networks; Task analysis; Deep neural network compression; pruning; deep reinforcement learning; MODEL COMPRESSION; SPARSITY;
DOI
10.1109/TCSVT.2022.3167951
Chinese Library Classification
TM (Electrical Engineering); TN (Electronic Technology, Communication Technology)
Discipline Codes
0808; 0809
Abstract
We present QLP, a novel deep Q-learning based method for pruning deep neural networks (DNNs). Given a DNN, our method intelligently determines favorable layer-wise sparsity ratios, which are then implemented via unstructured, magnitude-based weight pruning. In contrast to previous reinforcement learning (RL) based pruning methods, our method is not forced to prune a DNN within a single, sequential pass from the first layer to the last. It visits each layer multiple times and prunes it little by little at each visit, achieving superior granular pruning. Moreover, our method is not restricted to a subset of actions within the feasible action space. It has the flexibility to execute the whole range of sparsity ratios (0%-100%) for each layer. This enables aggressive pruning without compromising accuracy. Furthermore, our method does not require a complex state definition; it features a simple, generic definition composed of only the index and the density of the layers, which reduces the computational demand of observing the state at each interaction. Lastly, our method utilizes a carefully designed curriculum that enables learning targeted policies for each sparsity regime, which helps to deliver better accuracy, especially at high sparsity levels. We conduct batched performance tests at compelling sparsity levels (up to 98%), present extensive ablation studies to justify our RL-related design choices, and compare our method with the state of the art, including RL-based and other pruning methods. Our method sets new state-of-the-art results in most of the experiments with ResNet-32 and ResNet-56 on the CIFAR-10 dataset, as well as ResNet-50 and MobileNet-v1 on the ILSVRC2012 (ImageNet) dataset.
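The two building blocks the abstract names, unstructured magnitude-based pruning at a given layer-wise sparsity ratio and the simple (layer index, layer density) state, can be sketched as follows. This is a hypothetical illustration under our own naming (`magnitude_prune`, `layer_state`), not the authors' implementation:

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Unstructured magnitude pruning: zero out the `sparsity` fraction of
    entries with the smallest absolute value (ties may prune slightly more).
    `sparsity` is the action the RL agent would choose for this layer."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    mask = np.abs(weights) > threshold
    return weights * mask

def layer_state(layer_index: int, weights: np.ndarray) -> np.ndarray:
    """The paper's simple, generic state: only the layer's index and its
    current density (fraction of nonzero weights)."""
    density = np.count_nonzero(weights) / weights.size
    return np.array([layer_index, density], dtype=float)
```

Because the state is just two scalars per layer, observing it after each pruning step is cheap, which is consistent with the abstract's claim of low computational demand per interaction.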
Pages: 6488-6501
Page count: 14
Related Papers (50 in total; items [31]-[40] shown)
  • [31] Deep Q-learning with hybrid quantum neural network on solving maze problems
    Chen, Hao-Yuan
    Chang, Yen-Jui
    Liao, Shih-Wei
    Chang, Ching-Ray
    QUANTUM MACHINE INTELLIGENCE, 2024, 6 (01)
  • [32] Double Deep Q-Learning Based Channel Estimation for Industrial Wireless Networks
    Bhardwaj, Sanjay
    Lee, Jae-Min
    Kim, Dong-Seong
    11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1318 - 1320
  • [33] Deep Q-Learning for Chunk-based Caching in Data Processing Networks
    Wang, Yimeng
    Li, Yongbo
    Lan, Tian
    Aggarwal, Vaneet
    2019 57TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2019, : 910 - 916
  • [34] Deep Q-learning based resource allocation in industrial wireless networks for URLLC
    Bhardwaj, Sanjay
    Ginanjar, Rizki Rivai
    Kim, Dong-Seong
    IET COMMUNICATIONS, 2020, 14 (06) : 1022 - 1027
  • [35] Hierarchical Deep Q-Learning Based Handover in Wireless Networks with Dual Connectivity
    Iturria-Rivera, Pedro Enrique
    Elsayed, Medhat
    Bavand, Majid
    Gaigalas, Raimundas
    Furr, Steve
    Erol-Kantarci, Melike
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 6553 - 6558
  • [36] Distributed Caching Popular Services by Using Deep Q-Learning in Converged Networks
    Fang, Yuzhe
    Xiong, Jian
    Cheng, Peng
    Zhang, Wei
    2019 IEEE 90TH VEHICULAR TECHNOLOGY CONFERENCE (VTC2019-FALL), 2019
  • [37] Maximizing Opinion Polarization Using Double Deep Q-Learning in Social Networks
    Zareer, Mohamed N.
    Selmic, Rastko R.
    IEEE ACCESS, 2025, 13 : 57398 - 57412
  • [38] Deep Cross-Check Q-Learning for Jamming Mitigation in Wireless Networks
    Elleuch, Ibrahim
    Pourranjbar, Ali
    Kaddoum, Georges
    IEEE WIRELESS COMMUNICATIONS LETTERS, 2024, 13 (05) : 1448 - 1452
  • [39] Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
    Hoefler, Torsten
    Alistarh, Dan
    Ben-Nun, Tal
    Dryden, Nikoli
    Peste, Alexandra
    JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22
  • [40] Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care
    Shirali, Ali
    Schubert, Alexander
    Alaa, Ahmed
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (10) : 6268 - 6279