In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

Cited by: 172
Authors: Bulo, Samuel Rota [1]; Porzi, Lorenzo [1]; Kontschieder, Peter [1]
Affiliations: [1] Mapillary Research, Graz, Austria
DOI: 10.1109/CVPR.2018.00591
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
In this work we present In-Place Activated Batch Normalization (INPLACE-ABN) - a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering required information during the backward pass through the inversion of stored forward results, with only a minor increase (0.8-2%) in computation time. Also, we demonstrate how frequently used checkpointing approaches can be made computationally as efficient as INPLACE-ABN. In our experiments on image classification, we demonstrate on-par results on ImageNet-1k with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report competitive results for COCO-Stuff and set new state-of-the-art results for Cityscapes and Mapillary Vistas. Code can be found at https://github.com/mapillary/inplace_abn.
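The memory saving described above hinges on the activation being invertible: because the paper pairs BatchNorm with a leaky ReLU, the layer's input can be recomputed from its stored output during the backward pass, so the intermediate result need not be kept. A minimal NumPy sketch of that inversion idea (function names are illustrative, not taken from the released code):

```python
import numpy as np

def leaky_relu(z, slope=0.01):
    # Forward activation: identity for z >= 0, scaled by `slope` otherwise.
    return np.where(z >= 0, z, slope * z)

def inv_leaky_relu(y, slope=0.01):
    # Leaky ReLU is bijective for slope > 0, so the pre-activation can be
    # recovered from the stored output instead of storing both tensors.
    return np.where(y >= 0, y, y / slope)

z = np.random.randn(4, 3)          # stand-in for a BatchNorm output
y = leaky_relu(z)                  # only y is kept for the backward pass
recovered = inv_leaky_relu(y)      # z is reconstructed when needed
assert np.allclose(z, recovered)
```

This is why a plain ReLU (slope 0) would not work here: its negative branch collapses to zero and cannot be inverted, whereas any strictly positive slope keeps the map one-to-one.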
Pages: 5639-5647 (9 pages)