MiniMalloc: A Lightweight Memory Allocator for Hardware-Accelerated Machine Learning

Cited by: 1
Authors
Moffitt, Michael D. [1]
Affiliations
[1] Google, Mountain View, CA 94043 USA
Keywords
memory allocation; hardware acceleration; machine learning; ARCHITECTURE; PACKING
DOI
10.1145/3623278.3624752
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present a new approach to static memory allocation, a key problem that arises in the compilation of machine learning models onto the resources of a specialized hardware accelerator. Our methodology involves a recursive depth-first search that limits exploration to a special class of canonical solutions, dramatically reducing the size of the search space. We also develop a spatial inference technique that exploits this special structure by pruning unpromising partial assignments and backtracking more effectively than otherwise possible. Finally, we introduce a new mechanism capable of detecting and eliminating dominated solutions from consideration. Empirical results demonstrate orders of magnitude improvement in performance as compared to the previous state-of-the-art on many benchmarks, as well as a substantial reduction in library size.
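To make the problem statement concrete, the following is a minimal sketch of static memory allocation as a search problem. It is not the paper's algorithm: it is a plain brute-force depth-first search with none of the pruning, spatial inference, or dominance detection the abstract describes. The buffer representation `(start, end, size)` and the names `lowest_offset` and `allocate` are hypothetical, chosen only for illustration. Each buffer is live over the half-open interval `[start, end)`, and two buffers whose lifetimes overlap must not share addresses.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical buffer representation: (start, end, size), live over [start, end).
Buffer = Tuple[int, int, int]


def lowest_offset(buf: Buffer, placed: List[Tuple[Buffer, int]]) -> int:
    """Lowest offset at which buf avoids every placed buffer it overlaps in time."""
    start, end, size = buf
    # Address ranges already occupied by buffers whose lifetimes overlap buf's.
    spans = sorted(
        (off, off + other[2])
        for other, off in placed
        if other[0] < end and start < other[1]
    )
    candidate = 0
    for lo, hi in spans:
        if candidate + size <= lo:
            break  # buf fits in the gap below this span
        candidate = max(candidate, hi)
    return candidate


def allocate(buffers: List[Buffer], capacity: int) -> Optional[List[int]]:
    """Return one offset per buffer, or None if the search finds no packing."""
    offsets: Dict[int, int] = {}

    def search() -> bool:
        if len(offsets) == len(buffers):
            return True
        placed = [(buffers[j], off) for j, off in offsets.items()]
        # Try each unplaced buffer next, always at its lowest feasible offset.
        for i in range(len(buffers)):
            if i in offsets:
                continue
            off = lowest_offset(buffers[i], placed)
            if off + buffers[i][2] <= capacity:
                offsets[i] = off
                if search():
                    return True
                del offsets[i]  # backtrack
        return False

    if search():
        return [offsets[i] for i in range(len(buffers))]
    return None


if __name__ == "__main__":
    demo = [(0, 2, 4), (1, 3, 4), (2, 4, 4)]
    print(allocate(demo, 8))  # one offset per buffer
    print(allocate(demo, 7))  # None: the first two buffers alone need 8 units
```

The search explores placement orders exhaustively, so its running time grows factorially with the number of buffers; it is meant only to show the shape of the problem, not a practical allocator.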
Pages: 238-252
Number of pages: 15