Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters

Cited by: 2
Authors
Borowiec, Damian [1]
Yeung, Gingfung [1]
Friday, Adrian [1]
Harper, Richard H. R. [1]
Garraghan, Peter [1]
Affiliations
[1] Univ Lancaster, Lancaster, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
Deep Learning; Cloud datacenter; MLaaS; Machine Learning systems; Energy; Sustainable AI
DOI
10.1109/CLOUD55607.2022.00061
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Cloud datacenters can provision high-performance Machine Learning-as-a-Service (MLaaS) at reduced resource cost via auto-tuning: automated tensor program optimization of Deep Learning (DL) models to minimize inference latency on a hardware device. However, given the extensive heterogeneity of DL models, libraries, and hardware devices, performing auto-tuning within Cloud datacenters incurs significant time, compute resource, and energy costs that state-of-the-art auto-tuning is not designed to mitigate. In this paper we propose Trimmer, a high-performance and cost-efficient Deep Learning auto-tuning framework for Cloud datacenters. Trimmer maximizes DL model performance and tensor program cost-efficiency by (i) preempting tensor program implementations that exhibit poor optimization improvement, and (ii) applying an ML-based filtering method that replaces expensive, low-performing tensor programs, increasing the likelihood of selecting low-latency tensor programs. Through an empirical study of the cost of DL model optimization techniques, our analysis indicates that 26-43% of total energy is expended on measuring tensor program implementations that do not contribute positively to auto-tuning. Experimental results show that Trimmer achieves high auto-tuning cost-efficiency across different DL models and reduces auto-tuning energy use by 21.8-40.9% for Cloud clusters, whilst achieving DL model latency equivalent to state-of-the-art techniques.
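The abstract does not state the criterion Trimmer uses to preempt poorly improving tensor programs. The following is only a minimal sketch of one generic way such preemption could work: stop measuring candidates once the best observed latency has improved by less than a relative threshold over a recent window. The function names, the `window` and `min_gain` parameters, and the stopping rule are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple


def should_preempt(best_history: List[float], window: int, min_gain: float) -> bool:
    """Hypothetical rule: preempt when the best latency improved by less than
    `min_gain` (relative) over the last `window` measurements.
    Assumes all latencies are strictly positive."""
    if len(best_history) <= window:
        return False  # not enough measurements to judge progress yet
    old_best = best_history[-window - 1]
    new_best = best_history[-1]
    return (old_best - new_best) / old_best < min_gain


def tune_with_preemption(
    candidates: Iterable[str],
    measure: Callable[[str], float],
    window: int = 3,
    min_gain: float = 0.01,
) -> Tuple[str, float, int]:
    """Measure candidates in order and stop early once improvement stalls.
    Returns (best candidate, best latency, number of measurements made)."""
    best_name, best_latency = "", float("inf")
    best_history: List[float] = []
    trials = 0
    for name in candidates:
        latency = measure(name)  # stands in for an on-device measurement
        trials += 1
        if latency < best_latency:
            best_name, best_latency = name, latency
        best_history.append(best_latency)  # running best after each trial
        if should_preempt(best_history, window, min_gain):
            break  # skip the remaining candidates to save measurement cost
    return best_name, best_latency, trials


if __name__ == "__main__":
    # Made-up latencies (ms) for illustration only.
    latencies = {"a": 10.0, "b": 8.0, "c": 8.0, "d": 8.0, "e": 8.0, "f": 1.0}
    print(tune_with_preemption(list(latencies), latencies.__getitem__))
```

In this toy run the loop stops after five measurements and never measures candidate `f`, which shows the trade-off of any such rule: measurement cost is saved, but a later, better candidate can be missed. This is presumably why the abstract pairs preemption with a filtering step that favors candidates likely to have low latency.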
Pages: 374-384
Page count: 11