Performance Evaluation of Deep Learning Compilers for Edge Inference

Cited by: 21
Authors
Verma, Gaurav [1]
Gupta, Yashi [1]
Malik, Abid M. [2]
Chapman, Barbara [1,2]
Affiliations
[1] SUNY Stony Brook, Stony Brook, NY 11794 USA
[2] Brookhaven Natl Lab, Upton, NY 11973 USA
Keywords
TensorFlow-TensorRT; TensorFlow Lite; Compilers for DL; Inference at Edge; NEURAL-NETWORK INFERENCE;
DOI
10.1109/IPDPSW52791.2021.00128
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Recently, edge computing has received considerable attention as a promising means to provide Deep Learning (DL) based services. However, the limited computation capability of the data processing units (CPUs, GPUs, and specialized accelerators) in edge devices makes efficient use of their scarce resources a key challenge for DL-based analysis services. This has led to the development of several inference compilers, such as TensorRT, TensorFlow Lite, Relay, and TVM, which optimize DL inference models specifically for edge devices. These compilers take the standard DL models available for inference in various frameworks, e.g., PyTorch, TensorFlow, Caffe, and PaddlePaddle, and transform them into corresponding lightweight models. TensorFlow Lite and TensorRT are considered state-of-the-art inference compilers and encompass most of the compiler optimization techniques proposed for edge computing. This paper presents a detailed performance study of TensorFlow Lite (TFLite) and TensorFlow-TensorRT (TF-TRT) using commonly employed DL models for edge devices on varying hardware platforms, comparing throughput, latency, and power consumption. We find that the integrated TF-TRT consistently performs better at high-precision floating point across different DL architectures, especially on GPUs with tensor cores, but it loses its edge in model compression to TFLite at low precision. TFLite, which is primarily designed for mobile applications, performs better with lightweight DL models than with deep neural network-based models. To the best of our knowledge, this is the first detailed performance comparison of the TF-TRT and TFLite inference compilers.
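The workflow the abstract describes centers on two conversion paths: compressing a framework model with the TFLite converter (optionally with post-training quantization) and rebuilding a SavedModel with TF-TRT at reduced precision, then timing inference on the result. The Python sketch below illustrates both paths; the model (MobileNetV2), file paths, precision settings, and iteration count are illustrative assumptions rather than the exact configurations benchmarked in the paper, and the TF-TRT path requires a TensorRT-enabled TensorFlow build.

import time

import numpy as np
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Illustrative model; the paper benchmarks a range of edge-oriented DL models.
model = tf.keras.applications.MobileNetV2(weights="imagenet")

# Path 1: TFLite conversion with post-training (dynamic-range) quantization.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # low-precision weight quantization
tflite_model = converter.convert()
with open("mobilenet_v2.tflite", "wb") as f:
    f.write(tflite_model)

# Path 2: TF-TRT conversion of a SavedModel at FP16.
# Assumed export path; TF-TRT rewrites supported subgraphs into TensorRT engines.
tf.saved_model.save(model, "saved_model")
params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)
trt_converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_model", conversion_params=params
)
trt_converter.convert()
trt_converter.save("saved_model_trt")

# Minimal latency timing with the TFLite interpreter on random input.
interpreter = tf.lite.Interpreter(model_path="mobilenet_v2.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]
dummy = np.random.random_sample(inp["shape"]).astype(np.float32)

runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], dummy)
    interpreter.invoke()
    _ = interpreter.get_tensor(out["index"])
elapsed = time.perf_counter() - start
print(f"TFLite mean latency: {1000 * elapsed / runs:.2f} ms/inference")

A corresponding TF-TRT measurement would load "saved_model_trt" with tf.saved_model.load and time its serving function in the same loop; throughput follows from batched runs, and power consumption would come from the platform's own telemetry, all of which lie outside this sketch.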
Pages: 858-865
Number of pages: 8