In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

Cited by: 172
Authors: Bulo, Samuel Rota [1]; Porzi, Lorenzo [1]; Kontschieder, Peter [1]
Affiliations: [1] Mapillary Research, Graz, Austria
DOI: 10.1109/CVPR.2018.00591
CLC classification: TP18 [Artificial Intelligence Theory]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
In this work we present In-Place Activated Batch Normalization (INPLACE-ABN) - a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straightforward applicability for existing deep learning frameworks. We obtain memory savings of up to 50% by dropping intermediate results and by recovering required information during the backward pass through the inversion of stored forward results, with only a minor increase (0.8-2%) in computation time. Also, we demonstrate how frequently used checkpointing approaches can be made computationally as efficient as INPLACE-ABN. In our experiments on image classification, we demonstrate on-par results on ImageNet-1k with state-of-the-art approaches. On the memory-demanding task of semantic segmentation, we report competitive results for COCO-Stuff and set new state-of-the-art results for Cityscapes and Mapillary Vistas. Code can be found at https://github.com/mapillary/inplace_abn.
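The memory saving described above hinges on the activation being invertible: because the paper pairs BatchNorm with a leaky ReLU, the layer's input can be recomputed from its stored output during the backward pass, so the intermediate result need not be kept. A minimal NumPy sketch of that inversion idea (function names are illustrative, not taken from the released code):

```python
import numpy as np

def leaky_relu(z, slope=0.01):
    # Forward activation: identity for z >= 0, scaled by `slope` otherwise.
    return np.where(z >= 0, z, slope * z)

def inv_leaky_relu(y, slope=0.01):
    # Leaky ReLU is bijective for slope > 0, so the pre-activation can be
    # recovered from the stored output instead of storing both tensors.
    return np.where(y >= 0, y, y / slope)

z = np.random.randn(4, 3)          # stand-in for a BatchNorm output
y = leaky_relu(z)                  # only y is kept for the backward pass
recovered = inv_leaky_relu(y)      # z is reconstructed when needed
assert np.allclose(z, recovered)
```

This is why a plain ReLU (slope 0) would not work here: its negative branch collapses to zero and cannot be inverted, whereas any strictly positive slope keeps the map one-to-one.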
Pages: 5639-5647 (9 pages)