HyConv: Accelerating Multi-Phase CNN Computation by Fine-Grained Policy Selection

Cited by: 5

Authors:

Li, Xiaqing [1]

Zhang, Guangyan [1,2]

Wang, Zhufan [1]

Zheng, Weimin [1]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China

[2] Jilin Univ, Minist Educ, Key Lab Symbol Computat & Knowledge Engn, Changchun 130012, Jilin, Peoples R China

Source:

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2019 / Vol. 30 / Issue 02

Funding:

National Natural Science Foundation of China;

Keywords:

Convolution policy; convolutional neural network; deep learning; general-purpose GPU; parallel computing;

DOI:

10.1109/TPDS.2018.2864299

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Existing GPU-based approaches cannot yet meet the performance requirements for training very large convolutional neural networks (CNNs), where convolutional layers (Conv-layers) dominate the training time. In this paper, we find that no single convolution policy is always the fastest across all the computing phases. We therefore propose an approach called HyConv that accelerates multi-phase CNN computation by fine-grained policy selection. HyConv encapsulates existing convolution policies into a set of modules, and selects the fastest policy (the winner policy) for computing each phase via one-round runtime measurement. Furthermore, HyConv uses a winner database to record the current winner policies, avoiding duplicate measurement later for the same parameter configuration. Our experimental results indicate that, over all the real-world CNNs used, HyConv consistently outperforms existing approaches on both a single GPU and four GPUs, with speedups over cuDNN-MM of up to 3.3x and up to 1.6x, respectively. This improvement is explained by our result that HyConv delivers clearly better performance for most individual Conv-layers. Furthermore, HyConv works with any parameter configuration and thus offers better usability.
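The selection scheme described in the abstract (measure each candidate policy once, keep the fastest, and cache the result per parameter configuration) can be illustrated with a minimal Python sketch. This is not HyConv's implementation: the class name WinnerDatabase, the method select, the phase labels, and the shape of the configuration key are all hypothetical, and the policies are stand-in callables rather than real GPU convolution kernels.

```python
import time
from typing import Callable, Dict, Hashable, Tuple


class WinnerDatabase:
    """Hypothetical cache of the fastest policy per (phase, configuration)."""

    def __init__(self, timer: Callable[[], float] = time.perf_counter):
        self._winners: Dict[Tuple[str, Hashable], str] = {}
        self._timer = timer
        self.measurements = 0  # number of timed policy runs so far

    def select(self, phase: str, config: Hashable,
               policies: Dict[str, Callable[[], None]]) -> str:
        """Return the name of the winner policy for this phase and config."""
        if not policies:
            raise ValueError("at least one candidate policy is required")

        key = (phase, config)
        if key in self._winners:
            # Same configuration seen before: reuse the recorded winner.
            return self._winners[key]

        # One-round measurement: run every candidate exactly once.
        best_name, best_time = None, float("inf")
        for name, run in policies.items():
            start = self._timer()
            run()
            elapsed = self._timer() - start
            self.measurements += 1
            if elapsed < best_time:
                best_name, best_time = name, elapsed

        self._winners[key] = best_name
        return best_name
```

A caller would invoke select once per phase of each layer and then dispatch to the returned policy. On a real GPU the timed region would also need device synchronization before reading the clock, which this sketch leaves out.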


Pages: 388-399

Page count: 12
