TSCompiler: efficient compilation framework for dynamic-shape models

Cited by: 0
Authors
Luo, Xiang [1]
Zhang, Chen [2]
Geng, Chenbo [3]
Yi, Yanzhi [4]
Hu, Jiahui [4]
Zhang, Renwei [4]
Zhang, Zhen [5]
Consolaro, Gianpietro [5]
Yang, Fan [6]
Lu, Tun [1]
Gu, Ning [1]
Shang, Li [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
[2] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai 200240, Peoples R China
[3] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
[4] Huawei Technol Co Ltd, Beijing 100095, Peoples R China
[5] Huawei Paris Res Ctr, F-92100 Paris, France
[6] Fudan Univ, Sch Microelect, Shanghai 201203, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
machine learning; tensor compilers; dynamic shape; operator fusion; code generation; autotuning;
DOI
10.1007/s11432-024-4071-6
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Today's deep learning models increasingly need to handle dynamic-shape tensors and computations, whose shape information is unknown at compile time and varies over a nearly unbounded range at runtime. This shape dynamism poses significant challenges for existing compilation pipelines, which are designed for static models and rely on exact shape values to optimize tensor programs. This paper presents TSCompiler, an end-to-end compilation framework for dynamic-shape models. TSCompiler first applies a symbolic shape propagation algorithm to recover symbolic shape information at compile time, which enables subsequent optimizations. It then partitions the shape-annotated computation graph into multiple subgraphs and fine-tunes the backbone operators of each subgraph within a hardware-aligned search space to find a collection of high-performance schedules. Using dependence analysis, TSCompiler propagates the explored backbone schedule to the other fusion groups within the same subgraph, generating a set of parameterized tensor programs for the fused cases. At runtime, TSCompiler uses an occupancy-targeted cost model to select among the pre-compiled tensor programs for varied tensor shapes. Extensive evaluations show that TSCompiler achieves state-of-the-art speedups for dynamic-shape models. For example, it improves kernel efficiency by up to 3.97x on NVIDIA RTX3090 and up to 10.30x on NVIDIA A100, and achieves up to five orders of magnitude speedup in end-to-end latency.
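As a rough illustration of what symbolic shape propagation means in general, the sketch below pushes shapes with named (symbolic) dimensions through a tiny operator graph and binds them to concrete values only at runtime. It is not TSCompiler's algorithm: the names matmul_shape, elementwise_shape, propagate, and bind, the tuple-based graph encoding, and the two shape rules are all assumptions made for this example.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical encoding: an int is a known extent, a str names a symbolic one.
Dim = Union[int, str]
Shape = Tuple[Dim, ...]


def matmul_shape(a: Shape, b: Shape) -> Shape:
    """(..., m, k) x (k, n) -> (..., m, n); the contracted dims must agree."""
    if a[-1] != b[0]:
        raise ValueError(f"contracted dims differ: {a[-1]} vs {b[0]}")
    return a[:-1] + (b[1],)


def elementwise_shape(a: Shape, b: Shape) -> Shape:
    """Same-shape elementwise op (broadcasting is left out of this sketch)."""
    if a != b:
        raise ValueError(f"shape mismatch: {a} vs {b}")
    return a


RULES = {"matmul": matmul_shape, "add": elementwise_shape}


def propagate(inputs: Dict[str, Shape],
              ops: List[Tuple[str, str, str, str]]) -> Dict[str, Shape]:
    """Visit ops in topological order; each op is (kind, lhs, rhs, out)."""
    shapes = dict(inputs)
    for kind, lhs, rhs, out in ops:
        shapes[out] = RULES[kind](shapes[lhs], shapes[rhs])
    return shapes


def bind(shape: Shape, env: Dict[str, int]) -> Tuple[int, ...]:
    """Substitute runtime values for the symbolic dims of a shape."""
    return tuple(env[d] if isinstance(d, str) else d for d in shape)


# Compile time: "batch" and "seq" are unknown, yet every shape is recovered.
shapes = propagate(
    {"x": ("batch", "seq", 768), "w": (768, 3072), "bias": ("batch", "seq", 3072)},
    [("matmul", "x", "w", "h"), ("add", "h", "bias", "y")],
)
print(shapes["y"])                                   # ('batch', 'seq', 3072)
# Runtime: concrete values arrive and the symbolic shape is instantiated.
print(bind(shapes["y"], {"batch": 4, "seq": 128}))   # (4, 128, 3072)
```

Keeping dimensions symbolic lets a compiler check shape consistency and reason about operator relationships before any concrete tensor is seen, which is the kind of information the abstract says later optimization stages depend on.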
Pages: 18
Related papers
  • [1] TSCompiler: efficient compilation framework for dynamic-shape models
    Xiang LUO
    Chen ZHANG
    Chenbo GENG
    Yanzhi YI
    Jiahui HU
    Renwei ZHANG
    Zhen ZHANG
    Gianpietro CONSOLARO
    Fan YANG
    Tun LU
    Ning GU
    Li SHANG
    [J]. Science China (Information Sciences), 2024, 67 (10) - 84
  • [2] A dynamic compilation framework for controlling microprocessor energy and performance
    Wu, Q
    Reddi, VJ
    Wu, YF
    Lee, J
    Connors, D
    Brooks, D
    Martonosi, M
    Clark, DW
    [J]. MICRO-38: PROCEEDINGS OF THE 38TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2005, : 271 - 282
  • [3] Compilation of unified physical models for efficient sound synthesis
    Karjalainen, M
    Erkut, C
    Savioja, L
    [J]. 2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL V, PROCEEDINGS: SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO AND ELECTROACOUSTICS MULTIMEDIA SIGNAL PROCESSING, 2003, : 433 - 436
  • [4] Efficient Compilation Techniques for Large Scale Feature Models
    Mendonca, Marcilio
    Wasowski, Andrzej
    Czarnecki, Krzysztof
    Cowan, Donald
    [J]. GPCE'08: PROCEEDINGS OF THE ACM SIGPLAN SEVENTH INTERNATIONAL CONFERENCE ON GENERATIVE PROGRAMMING AND COMPONENT ENGINEERING, 2008, : 13 - 21
  • [5] Establishing Operational Models for Dynamic Compilation in a Simulation Platform
    Nghi Quang Huynh
    Tram Huynh Vo
    Hiep Xuan Huynh
    Alexis Drogoul
    [J]. NATURE OF COMPUTATION AND COMMUNICATION, 2015, 144 : 117 - 131
  • [6] Design, implementation, and evaluation of a dynamic compilation framework for the YAP system
    da Silva, Anderson Faustino
    Costa, Vitor Santos
    [J]. LOGIC PROGRAMMING, PROCEEDINGS, 2007, 4670 : 410 - +
  • [7] Dynamic shape and appearance models
    Doretto, Gianfranco
    Soatto, Stefano
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2006, 28 (12) : 2006 - 2019
  • [8] A Polyhedral Compilation Framework for Loops with Dynamic Data-Dependent Bounds
    Zhao, Jie
    Kruse, Michael
    Cohen, Albert
    [J]. CC'18: PROCEEDINGS OF THE 27TH INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION, 2018, : 14 - 24
  • [9] A Common Framework for Compilation Techniques Applied to Diagnosis of Linear Dynamic Systems
    Bregon, Anibal
    Biswas, Gautam
    Pulido, Belarmino
    Alonso-Gonzalez, Carlos
    Khorasgani, Hamed
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2014, 44 (07) : 863 - 876
  • [10] Dynamic compilation framework with DVS for reducing energy consumption in embedded processors
    Shi, Qingsong
    Chen, Tianzhou
    Liang, Xiao
    Huang, Jiangwei
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS, 2008, : 464 - 470