Training Compact DNNs with l1/2 Regularization

Cited by: 3
Authors
Tang, Anda [1 ]
Niu, Lingfeng [2 ,3 ]
Miao, Jianyu [4 ]
Zhang, Peng [5 ]
Affiliations
[1] Univ Chinese Acad Sci, Sch Math Sci, Beijing 100190, Peoples R China
[2] Chinese Acad Sci, Res Ctr Fictitious Econ & Data Sci, Beijing 100190, Peoples R China
[3] Univ Chinese Acad Sci, Sch Econ & Management, Beijing 100190, Peoples R China
[4] Henan Univ Technol, Sch Artificial Intelligence & Big Data, Zhengzhou 450001, Peoples R China
[5] Guangzhou Univ, Cyberspace Inst Adv Technol, Guangzhou 511442, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep neural networks; Model compression; ℓ1/2 quasi-norm; Non-Lipschitz regularization; Sparse optimization; L-1/2 REGULARIZATION; VARIABLE SELECTION; NEURAL-NETWORKS; REPRESENTATION; MINIMIZATION; DROPOUT; MODEL;
DOI
10.1016/j.patcog.2022.109206
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Deep neural networks (DNNs) have achieved unprecedented success in many fields. However, their large numbers of parameters place a heavy burden on storage and computation, which hinders the development and application of DNNs. It is therefore worthwhile to compress the model and reduce the complexity of the DNN. Sparsity-inducing regularizers are among the most common tools for compression. In this paper, we propose utilizing the ℓ1/2 quasi-norm to zero out weights of neural networks and compress the networks automatically during the learning process. To our knowledge, this is the first work applying a non-Lipschitz continuous regularizer to the compression of DNNs. The resulting sparse optimization problem is solved by a stochastic proximal gradient algorithm. For further convenience of calculation, an approximation of the threshold-form solution to the proximal operator of the ℓ1/2 quasi-norm is given at the same time. Extensive experiments with various datasets and baselines demonstrate the advantages of our new method. (c) 2022 Elsevier Ltd. All rights reserved.
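The abstract refers to a threshold-form solution of the ℓ1/2 proximal operator used inside a stochastic proximal gradient loop. The paper's own approximation is not reproduced here; the sketch below illustrates the standard half-thresholding operator of Xu et al. (2012) for the element-wise problem min_w 0.5*(w - y)^2 + μλ|w|^(1/2), and how such an operator would plug into a prox-SGD weight update. The function names (`half_threshold`, `prox_sgd_step`) and the NumPy implementation are illustrative assumptions, not the authors' code.

```python
import numpy as np

def half_threshold(y, lam, mu=1.0):
    """Half-thresholding operator (Xu et al., 2012), applied element-wise.

    A standard stand-in for the threshold-form proximal map of the l1/2
    quasi-norm mentioned in the abstract; the paper's approximation may differ.
    `y`: weights after a gradient step, `lam`: regularization strength,
    `mu`: step size."""
    t = mu * lam
    y = np.asarray(y, dtype=float)
    abs_y = np.abs(y)
    # Entries with |y| at or below this threshold are set exactly to zero,
    # which is what sparsifies (prunes) the network weights.
    thresh = (54.0 ** (1.0 / 3.0) / 4.0) * t ** (2.0 / 3.0)
    # The angle phi is only well defined above the threshold; clamp elsewhere.
    safe = np.maximum(abs_y, thresh + 1e-12)
    phi = np.arccos((t / 8.0) * (safe / 3.0) ** (-1.5))
    shrunk = (2.0 / 3.0) * y * (1.0 + np.cos((2.0 / 3.0) * (np.pi - phi)))
    return np.where(abs_y > thresh, shrunk, 0.0)

def prox_sgd_step(w, grad, lr, lam):
    """One stochastic proximal gradient update: a plain SGD step on the data
    loss followed by the l1/2 proximal (half-thresholding) map."""
    return half_threshold(w - lr * grad, lam, mu=lr)

# Toy usage: small weights are zeroed out, large ones are mildly shrunk.
w = np.array([0.02, -0.5, 1.5, -3.0])
print(prox_sgd_step(w, grad=np.zeros_like(w), lr=0.1, lam=1.0))
```

Unlike soft thresholding for the ℓ1 norm, the half-thresholding map has a jump at the threshold, which tends to drive more weights exactly to zero for the same regularization strength.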
Citation
Pages: 12
Related papers
50 records
  • [11] Relating lp regularization and reweighted l1 regularization
    Wang, Hao
    Zeng, Hao
    Wang, Jiashan
    Wu, Qiong
    OPTIMIZATION LETTERS, 2021, 15 (08) : 2639 - 2660
  • [12] Sparse SAR imaging based on L1/2 regularization
    Zeng, JinShan
    Science China (Information Sciences), 2012, 55 (08) : 1755 - 1775
  • [13] Sparse SAR imaging based on L1/2 regularization
    JinShan Zeng
    Jian Fang
    ZongBen Xu
    Science China Information Sciences, 2012, 55 : 1755 - 1775
  • [14] A CT Reconstruction Algorithm Based on L1/2 Regularization
    Chen, Mianyi
    Mi, Deling
    He, Peng
    Deng, Luzhen
    Wei, Biao
    COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2014, 2014
  • [15] NONCONVEX L1/2 REGULARIZATION FOR SPARSE PORTFOLIO SELECTION
    Xu, Fengmin
    Wang, Guan
    Gao, Yuelin
    PACIFIC JOURNAL OF OPTIMIZATION, 2014, 10 (01): : 163 - 176
  • [16] Collaborative Spectrum Sensing via L1/2 Regularization
    Liu, Zhe
    Li, Feng
    Duan, WenLei
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2015, E98A (01): : 445 - 449
  • [17] Hyperspectral Unmixing Based on Weighted L1/2 Regularization
    Li, Yan
    Li, Kai
    2016 3RD INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CONTROL ENGINEERING (ICISCE), 2016, : 400 - 404
  • [18] An l2/l1 regularization framework for diverse learning tasks
    Wang, Shengzheng
    Peng, Jing
    Liu, Wei
    SIGNAL PROCESSING, 2015, 109 : 206 - 211
  • [19] DENSITY MATRIX MINIMIZATION WITH l1 REGULARIZATION
    Lai, Rongjie
    Lu, Jianfeng
    Osher, Stanley
    COMMUNICATIONS IN MATHEMATICAL SCIENCES, 2015, 13 (08) : 2097 - 2117
  • [20] Robust point matching by l1 regularization
    Yi, Jianbing
    Li, Yan-Ran
    Yang, Xuan
    He, Tiancheng
    Chen, Guoliang
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 369 - 374