Performance Evaluation of Deep Learning Compilers for Edge Inference

Cited by: 21
Authors
Verma, Gaurav [1]
Gupta, Yashi [1]
Malik, Abid M. [2]
Chapman, Barbara [1,2]
Affiliations
[1] SUNY Stony Brook, Stony Brook, NY 11794 USA
[2] Brookhaven Natl Lab, Upton, NY 11973 USA
Keywords
TensorFlow-TensorRT; TensorFlow Lite; Compilers for DL; Inference at Edge; NEURAL-NETWORK INFERENCE;
DOI
10.1109/IPDPSW52791.2021.00128
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Recently, edge computing has received considerable attention as a promising means to provide Deep Learning (DL) based services. However, the limited computation capability of the data processing units (CPUs, GPUs, and specialized accelerators) in edge devices makes efficient use of their scarce resources a key challenge for DL-based analysis services. This has led to the development of several inference compilers, such as TensorRT, TensorFlow Lite, Relay, and TVM, which optimize DL inference models specifically for edge devices. These compilers take the standard DL models available for inference in various frameworks, e.g., PyTorch, TensorFlow, Caffe, and PaddlePaddle, and transform them into corresponding lightweight models. TensorFlow Lite and TensorRT are considered state-of-the-art inference compilers and encompass most of the compiler optimization techniques proposed for edge computing. This paper presents a detailed performance study of TensorFlow Lite (TFLite) and TensorFlow-TensorRT (TF-TRT) using commonly employed DL models for edge devices on varying hardware platforms, comparing throughput, latency, and power consumption. We find that the integrated TF-TRT consistently performs better at high-precision floating point across different DL architectures, especially on GPUs with tensor cores, but it loses its edge in model compression to TFLite at low precision. TFLite, which is primarily designed for mobile applications, performs better with lightweight DL models than with deep neural network-based models. To the best of our knowledge, this is the first detailed performance comparison of the TF-TRT and TFLite inference compilers.
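The workflow the abstract describes centers on two conversion paths: compressing a framework model with the TFLite converter (optionally with post-training quantization) and rebuilding a SavedModel with TF-TRT at reduced precision, then timing inference on the result. The Python sketch below illustrates both paths; the model (MobileNetV2), file paths, precision settings, and iteration count are illustrative assumptions rather than the exact configurations benchmarked in the paper, and the TF-TRT path requires a TensorRT-enabled TensorFlow build.

import time

import numpy as np
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Illustrative model; the paper benchmarks a range of edge-oriented DL models.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Path 1: TFLite conversion with post-training (dynamic-range) quantization.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # low-precision weight quantization
tflite_model = converter.convert()
with open("mobilenet_v2.tflite", "wb") as f:
    f.write(tflite_model)

# Path 2: TF-TRT conversion of a SavedModel at FP16.
# Assumed export path; TF-TRT rewrites supported subgraphs into TensorRT engines.
tf.saved_model.save(model, "saved_model")
params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
trt_converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_model", conversion_params=params
)
trt_converter.convert()
trt_converter.save("saved_model_trt")

# Minimal latency timing with the TFLite interpreter on random input.
interpreter = tf.lite.Interpreter(model_path="mobilenet_v2.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
dummy = np.random.random_sample(inp["shape"]).astype(np.float32)

runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()
    _ = interpreter.get_tensor(out["index"])
elapsed = time.perf_counter() - start
print(f"TFLite mean latency: {1000 * elapsed / runs:.2f} ms/inference")

A corresponding TF-TRT measurement would load "saved_model_trt" with tf.saved_model.load and time its serving function in the same loop; throughput follows from batched runs, and power consumption would come from the platform's own telemetry, all of which lie outside this sketch.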
Pages: 858-865
Number of pages: 8