Tetris: A Heuristic Static Memory Management Framework for Uniform Memory Multicore Neural Network Accelerators

Cited by: 0
Authors
Xiao-Bing Chen
Hao Qi
Shao-Hui Peng
Yi-Min Zhuang
Tian Zhi
Yun-Ji Chen
Affiliations
[1] State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences
[2] University of Chinese Academy of Sciences
[3] School of Computer Science and Technology, University of Science and Technology of China
[4] Chinese Academy of Sciences Center for Excellence in Brain Science and Intelligence Technology
Keywords
multicore neural network accelerator; liveness analysis; static memory management; memory reuse; genetic algorithm
DOI
Not available
Abstract
Uniform memory multicore neural network accelerators (UNNAs) provide enormous computing power for emerging neural network applications. Meanwhile, as neural network architectures grow deeper and wider, limited memory capacity has become a constraint on deploying models on UNNA platforms. Efficiently managing memory space and reducing workload footprints are therefore pressing concerns. In this paper, we propose Tetris: a heuristic static memory management framework for UNNA platforms. Tetris reconstructs execution flows and synchronization relationships among cores to analyze each tensor's liveness interval. The memory management problem is then converted into a sequence permutation problem, and Tetris uses a genetic algorithm to explore the permutation space, optimizing the memory management strategy and reducing memory footprints. We evaluate several typical neural networks, and the experimental results demonstrate that Tetris outperforms state-of-the-art memory allocation methods, achieving average memory reduction ratios of 91.9% and 87.9% on a quad-core and a 16-core Cambricon-X platform, respectively.
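The abstract compresses the whole pipeline into a few sentences. The following minimal Python sketch is illustrative only, not the paper's implementation: it assumes a first-fit offset assignment over liveness-conflicting tensors as the fitness function and a simple cut-point-crossover genetic algorithm; all names and parameters (population size, mutation rate, interval representation) are assumptions made for the example.

```python
# Illustrative sketch of the idea in the abstract (NOT the authors' code):
# each tensor has a size and a liveness interval; an allocation order (a
# permutation) is scored by the peak address of a first-fit placement, and
# a simple genetic algorithm searches the permutation space.
import random
from dataclasses import dataclass

@dataclass
class Tensor:
    size: int   # bytes
    start: int  # first time step the tensor is live
    end: int    # last time step the tensor is live (inclusive)

def overlaps(a: Tensor, b: Tensor) -> bool:
    """Two tensors conflict iff their liveness intervals intersect."""
    return a.start <= b.end and b.start <= a.end

def footprint(order: list[int], tensors: list[Tensor]) -> int:
    """First-fit offset assignment in the given order; returns peak memory."""
    placed: list[tuple[int, int, Tensor]] = []  # (offset, offset+size, tensor)
    peak = 0
    for idx in order:
        t = tensors[idx]
        # Address ranges already occupied by tensors whose live ranges conflict.
        busy = sorted((lo, hi) for lo, hi, u in placed if overlaps(t, u))
        offset = 0
        for lo, hi in busy:
            if offset + t.size <= lo:
                break                    # gap is large enough: reuse freed space
            offset = max(offset, hi)     # otherwise skip past this range
        placed.append((offset, offset + t.size, t))
        peak = max(peak, offset + t.size)
    return peak

def ga_search(tensors, pop=30, gens=100, mut=0.2, seed=0):
    """GA over permutations; returns (best_order, best_peak)."""
    rng = random.Random(seed)
    n = len(tensors)
    popl = [rng.sample(range(n), n) for _ in range(pop)]
    best = min(popl, key=lambda o: footprint(o, tensors))
    for _ in range(gens):
        popl.sort(key=lambda o: footprint(o, tensors))
        best = min(best, popl[0], key=lambda o: footprint(o, tensors))
        elite = popl[: pop // 2]
        children = []
        while len(children) < pop - len(elite):
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n)
            # Cut-point crossover that keeps the child a valid permutation.
            child = a[:cut] + [g for g in b if g not in a[:cut]]
            if rng.random() < mut:       # mutation: swap two positions
                i, j = rng.randrange(n), rng.randrange(n)
                child[i], child[j] = child[j], child[i]
            children.append(child)
        popl = elite + children
    return best, footprint(best, tensors)

if __name__ == "__main__":
    ts = [Tensor(64, 0, 2), Tensor(32, 1, 3), Tensor(64, 3, 5), Tensor(16, 0, 5)]
    order, peak = ga_search(ts)
    print("best order:", order, "peak bytes:", peak)
```

The fitness function here is the peak address reached by the first-fit placement: permutations that let short-lived tensors slot into gaps freed by dead tensors score lower, which is the memory-reuse effect the paper's permutation search exploits.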
Pages: 1255-1270
Number of pages: 15
Related Papers
50 items in total
  • [1] Tetris: A Heuristic Static Memory Management Framework for Uniform Memory Multicore Neural Network Accelerators
    Chen, Xiao-Bing
    Qi, Hao
    Peng, Shao-Hui
    Zhuang, Yi-Min
    Zhi, Tian
    Chen, Yun-Ji
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2022, 37 (06) : 1255 - 1270
  • [2] Memory Trojan Attack on Neural Network Accelerators
    Zhao, Yang
    Hu, Xing
    Li, Shuangchen
    Ye, Jing
    Deng, Lei
    Ji, Yu
    Xu, Jianyu
    Wu, Dong
    Xie, Yuan
    2019 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2019, : 1415 - 1420
  • [3] Polyhedral-Based Compilation Framework for In-Memory Neural Network Accelerators
    Han, Jianhui
    Fei, Xiang
    Li, Zhaolin
    Zhang, Youhui
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2022, 18 (01)
  • [4] Small Memory Footprint Neural Network Accelerators
    Seto, Kenshu
    Nejatollahi, Hamid
    An, Jiyoung
    Kang, Sujin
    Dutt, Nikil
    PROCEEDINGS OF THE 2019 20TH INTERNATIONAL SYMPOSIUM ON QUALITY ELECTRONIC DESIGN (ISQED), 2019, : 253 - 258
  • [5] Space-address decoupled scratchpad memory management for neural network accelerators
    Zhang, Zhenxing
    Sun, Shiyan
    Chen, Xunyu
    Zhi, Tian
    Guo, Qi
    Chen, Yunji
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (06)
  • [6] Memory Requirements for Convolutional Neural Network Hardware Accelerators
    Siu, Kevin
    Stuart, Dylan Malone
    Mahmoud, Mostafa
    Moshovos, Andreas
    2018 IEEE INTERNATIONAL SYMPOSIUM ON WORKLOAD CHARACTERIZATION (IISWC), 2018, : 111 - 121
  • [7] Improving Memory Utilization in Convolutional Neural Network Accelerators
    Jokic, Petar
    Emery, Stephane
    Benini, Luca
    IEEE EMBEDDED SYSTEMS LETTERS, 2021, 13 (03) : 77 - 80
  • [8] A Survey on Memory Subsystems for Deep Neural Network Accelerators
    Asad, Arghavan
    Kaur, Rupinder
    Mohammadi, Farah
FUTURE INTERNET, 2022, 14 (05)
  • [9] TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory
    Gao, Mingyu
    Pu, Jing
    Yang, Xuan
    Horowitz, Mark
    Kozyrakis, Christos
    TWENTY-SECOND INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS (ASPLOS XXII), 2017, : 751 - 764
  • [10] TETRIS: Scalable and Efficient Neural Network Acceleration with 3D Memory
    Gao, Mingyu
    Pu, Jing
    Yang, Xuan
    Horowitz, Mark
    Kozyrakis, Christos
    OPERATING SYSTEMS REVIEW, 2017, 51 (02) : 751 - 764