MiniMalloc: A Lightweight Memory Allocator for Hardware-Accelerated Machine Learning

Cited by: 1
Authors
Moffitt, Michael D. [1]
Affiliations
[1] Google, Mountain View, CA 94043 USA
Keywords
memory allocation; hardware acceleration; machine learning; ARCHITECTURE; PACKING;
DOI
10.1145/3623278.3624752
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812
Abstract
We present a new approach to static memory allocation, a key problem that arises in the compilation of machine learning models onto the resources of a specialized hardware accelerator. Our methodology involves a recursive depth-first search that limits exploration to a special class of canonical solutions, dramatically reducing the size of the search space. We also develop a spatial inference technique that exploits this special structure by pruning unpromising partial assignments and backtracking more effectively than otherwise possible. Finally, we introduce a new mechanism capable of detecting and eliminating dominated solutions from consideration. Empirical results demonstrate orders of magnitude improvement in performance as compared to the previous state-of-the-art on many benchmarks, as well as a substantial reduction in library size.
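To make the problem statement concrete, the following is a minimal sketch of static memory allocation solved by a recursive depth-first search. It is not the paper's algorithm: the `Buffer` type, the `allocate` function, and the candidate-offset rule (offset 0, or directly on top of an already-placed buffer whose lifetime overlaps) are assumptions chosen for illustration, and the spatial inference and dominance detection described in the abstract are omitted. The search is exponential and only suitable for small instances.

```python
from typing import Dict, List, NamedTuple, Optional


class Buffer(NamedTuple):
    """Hypothetical buffer record; field names are illustrative only."""
    name: str
    start: int  # first time step at which the buffer is live (inclusive)
    end: int    # time step at which the buffer is freed (exclusive)
    size: int   # number of address units required


def overlaps_in_time(a: Buffer, b: Buffer) -> bool:
    """True if the two lifetimes share at least one time step."""
    return a.start < b.end and b.start < a.end


def allocate(buffers: List[Buffer], capacity: int) -> Optional[Dict[str, int]]:
    """Return a name -> offset map that fits within `capacity`, or None."""
    by_name = {b.name: b for b in buffers}
    placed: Dict[str, int] = {}

    def fits(buf: Buffer, offset: int) -> bool:
        # Reject offsets that exceed capacity or collide with a placed
        # buffer that is live at the same time.
        if offset + buf.size > capacity:
            return False
        for name, off in placed.items():
            other = by_name[name]
            if (overlaps_in_time(buf, other)
                    and offset < off + other.size
                    and off < offset + buf.size):
                return False
        return True

    def search(remaining: List[Buffer]) -> bool:
        if not remaining:
            return True
        for i, buf in enumerate(remaining):
            # Assumed restriction: only try offset 0 or the top edge of a
            # placed buffer whose lifetime overlaps this one.
            candidates = {0} | {
                placed[n] + by_name[n].size
                for n in placed
                if overlaps_in_time(buf, by_name[n])
            }
            for offset in sorted(candidates):
                if fits(buf, offset):
                    placed[buf.name] = offset
                    if search(remaining[:i] + remaining[i + 1:]):
                        return True
                    del placed[buf.name]  # backtrack
        return False

    return dict(placed) if search(list(buffers)) else None
```

Calling `allocate` with a list of `Buffer` values and a capacity returns an offset for each buffer when a packing exists, and `None` otherwise. Restricting candidates to a small set of offsets, rather than trying every address, is what keeps the branching factor manageable in this sketch.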
Pages: 238-252
Page count: 15