Tetris: A Heuristic Static Memory Management Framework for Uniform Memory Multicore Neural Network Accelerators

Authors
Xiao-Bing Chen
Hao Qi
Shao-Hui Peng
Yi-Min Zhuang
Tian Zhi
Yun-Ji Chen
Affiliations
[1] State Key Laboratory of Computer Architecture
[2] Institute of Computing Technology
[3] Chinese Academy of Sciences
[4] University of Chinese Academy of Sciences
[5] School of Computer Science and Technology
[6] University of Science and Technology of China
[7] Chinese Academy of Sciences Center for Excellence in Brain Science and Intelligence Technology
Keywords
multicore neural network accelerator; liveness analysis; static memory management; memory reuse; genetic algorithm
DOI
Not available
Abstract
Uniform memory multicore neural network accelerators (UNNAs) furnish huge computing power to emerging neural network applications. Meanwhile, as neural network architectures grow deeper and wider, limited memory capacity has become a constraint on deploying models on UNNA platforms. Efficiently managing memory space and reducing workload footprints have therefore become urgent problems. In this paper, we propose Tetris: a heuristic static memory management framework for UNNA platforms. Tetris reconstructs execution flows and synchronization relationships among cores to analyze each tensor's liveness interval. The memory management problem is then converted to a sequence permutation problem. Tetris uses a genetic algorithm to explore the permutation space, optimizing the memory management strategy and reducing memory footprints. We evaluate several typical neural networks, and the experimental results demonstrate that Tetris outperforms the state-of-the-art memory allocation methods, achieving an average memory reduction ratio of 91.9% and 87.9% for a quad-core and a 16-core Cambricon-X platform, respectively.
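To make the idea in the abstract concrete, the following is a minimal Python sketch of liveness-based static allocation driven by a permutation search. It is not the authors' implementation: the tensor tuples, the first-fit placement rule, the order crossover, the swap mutation, and every function name and parameter (`footprint`, `order_crossover`, `search`, `generations`, `pop_size`) are assumptions chosen for illustration. Each tensor is given as `(size, first_use, last_use)`; a permutation decides the order in which tensors are placed, and two tensors may share addresses only if their liveness intervals do not overlap.

```python
import random
from typing import List, Tuple

Tensor = Tuple[int, int, int]  # (size, first_use, last_use), interval is inclusive


def footprint(order: List[int], tensors: List[Tensor]) -> int:
    """Place tensors in the given order with first-fit; return peak memory."""
    placed = []  # (offset, size, first_use, last_use) of tensors placed so far
    peak = 0
    for idx in order:
        size, start, end = tensors[idx]
        # Address ranges held by tensors that are live at the same time.
        busy = sorted((off, off + sz) for off, sz, s, e in placed
                      if s <= end and start <= e)
        offset = 0
        for lo, hi in busy:
            if offset + size <= lo:  # the gap before this range is big enough
                break
            offset = max(offset, hi)
        placed.append((offset, size, start, end))
        peak = max(peak, offset + size)
    return peak


def order_crossover(a: List[int], b: List[int], rng: random.Random) -> List[int]:
    """Keep a slice of parent a in place, fill the rest in parent b's order."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    middle = a[i:j]
    taken = set(middle)
    rest = [g for g in b if g not in taken]
    return rest[:i] + middle + rest[i:]


def search(tensors: List[Tensor], generations: int = 50,
           pop_size: int = 20, seed: int = 0) -> Tuple[List[int], int]:
    """Toy genetic search over placement orders (illustrative parameters)."""
    rng = random.Random(seed)
    n = len(tensors)
    population = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda o: footprint(o, tensors))
        elite = population[: pop_size // 2]  # the better half survives
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = order_crossover(a, b, rng)
            if rng.random() < 0.3:  # occasional swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        population = elite + children
    best = min(population, key=lambda o: footprint(o, tensors))
    return best, footprint(best, tensors)


# Hypothetical workload: five tensors with made-up sizes and liveness intervals.
tensors = [(4, 0, 1), (2, 1, 2), (4, 2, 3), (2, 3, 4), (3, 0, 4)]
best, peak = search(tensors)
naive = sum(size for size, _, _ in tensors)  # no reuse: every tensor gets its own space
print(f"order={best} peak={peak} naive={naive}")
```

The sketch separates the placement rule from the search so that either can be swapped out; the quality of the result depends on both, and the figures reported in the abstract come from the authors' framework, not from this toy.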
Pages: 1255-1270
Number of pages: 15