FLNA: An Energy-Efficient Point Cloud Feature Learning Accelerator with Dataflow Decoupling

Cited by: 1
Authors
Lyu, Dongxu [1 ]
Li, Zhenyu [1 ]
Chen, Yuzhou [1 ]
Xu, Ningyi [1 ,3 ]
He, Guanghui [1 ,2 ,3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, AI Inst, MoE, Key Lab Artificial Intelligence, Shanghai, Peoples R China
[3] Huixi Technol, Chongqing, Peoples R China
Funding
US National Science Foundation;
Keywords
Point Cloud; Feature Learning Accelerator; Algorithm-architecture Co-design; Sparsity Exploitation;
DOI
10.1109/DAC56929.2023.10247674
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Grid-based feature learning networks play a key role in recent point-cloud-based 3D perception. However, high point sparsity and specialized operators lead to a large memory footprint and long processing latency, posing great challenges to hardware acceleration. We propose FLNA, a novel feature learning accelerator built on algorithm-architecture co-design. At the algorithm level, a dataflow-decoupled graph reduces computation by 86% by exploiting inherent sparsity and concat redundancy. At the hardware level, we customize a pipelined architecture with block-wise processing and introduce a transposed SRAM strategy that saves 82.1% of access power. Implemented in a 40nm technology, FLNA achieves a 13.4-43.3x speedup over an RTX 2080Ti GPU, and surpasses the state-of-the-art accelerator with a 1.21x energy-efficiency improvement and a 50.8% latency reduction.
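The abstract's claim of reducing computation by exploiting inherent sparsity can be illustrated with a minimal sketch (this is an assumption-laden toy, not FLNA's actual dataflow or architecture): in a voxelized point cloud, most grid cells are empty, so applying a per-cell feature transform only to occupied cells yields the same result as a dense pass while performing far fewer multiply-accumulates.

```python
import numpy as np

# Toy illustration of sparsity exploitation in grid-based feature learning.
# All names here (dense_feature_pass, sparse_feature_pass) are hypothetical,
# not taken from the FLNA paper.

def dense_feature_pass(grid, weight):
    """Apply a per-cell linear map to every cell, occupied or not."""
    h, w, _ = grid.shape
    out = np.zeros((h, w, weight.shape[1]))
    for i in range(h):
        for j in range(w):
            out[i, j] = grid[i, j] @ weight
    return out

def sparse_feature_pass(grid, weight):
    """Apply the same map only to occupied (nonzero-feature) cells."""
    out = np.zeros((grid.shape[0], grid.shape[1], weight.shape[1]))
    occupied = np.argwhere(grid.any(axis=-1))  # indices of nonempty cells
    for i, j in occupied:
        out[i, j] = grid[i, j] @ weight
    return out, len(occupied)

# A 10x10 grid (100 cells) with only 3 occupied cells, mimicking the
# high point sparsity of real point-cloud feature grids.
rng = np.random.default_rng(0)
grid = np.zeros((10, 10, 4))
for i, j in [(1, 2), (3, 3), (7, 8)]:
    grid[i, j] = rng.standard_normal(4)
weight = rng.standard_normal((4, 8))

dense = dense_feature_pass(grid, weight)
sparse, n_occ = sparse_feature_pass(grid, weight)
assert np.allclose(dense, sparse)  # identical output, far fewer MACs
print(f"occupied cells processed: {n_occ} / 100")
```

Here only 3 of 100 per-cell matrix products are computed, which is the basic intuition behind sparsity-driven compute reduction; the paper's reported 86% reduction additionally exploits concat redundancy via its dataflow-decoupled graph.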
Pages: 6
Related papers
50 records
  • [41] Designing an Energy-Efficient Cloud Network [Invited]
    Kantarci, Burak
    Mouftah, Hussein T.
    JOURNAL OF OPTICAL COMMUNICATIONS AND NETWORKING, 2012, 4 (11) : B101 - B113
  • [42] An Energy-Efficient VM Placement in Cloud Datacenter
    Teng, Fei
    Deng, Danting
    Yu, Lei
    Magoules, Frederic
    2014 IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, 2014 IEEE 6TH INTL SYMP ON CYBERSPACE SAFETY AND SECURITY, 2014 IEEE 11TH INTL CONF ON EMBEDDED SOFTWARE AND SYST (HPCC,CSS,ICESS), 2014, : 173 - 180
  • [43] Energy-Efficient Cloud Computing for Smart Phones
    Arya, Nancy
    Chaudhary, Sunita
    Taruna, S.
    EMERGING TRENDS IN EXPERT APPLICATIONS AND SECURITY, 2019, 841 : 111 - 115
  • [44] Energy-Efficient Scheduling for Cloud Mobile Gaming
    Care, Riccardo
    Hassan, Hussein Al Haj
    Suarez, Luis
    Nuaymi, Loutfi
    2014 GLOBECOM WORKSHOPS (GC WKSHPS), 2014, : 1198 - 1204
  • [45] A 1.15 TOPS/W Energy-Efficient Capsule Network Accelerator for Real-Time 3D Point Cloud Segmentation in Mobile Environment
    Park, Gwangtae
    Im, Dongseok
    Han, Donghyeon
    Yoo, Hoi-Jun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2020, 67 (09) : 1594 - 1598
  • [46] E2HRL: An Energy-efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning
    Shiri, Aidin
    Kallakuri, Uttej
    Rashid, Hasib-Al
    Prakash, Bharat
    Waytowich, Nicholas R.
    Oates, Tim
    Mohsenin, Tinoosh
    ACM TRANSACTIONS ON DESIGN AUTOMATION OF ELECTRONIC SYSTEMS, 2022, 27 (05)
  • [47] SONIC: A Sparse Neural Network Inference Accelerator with Silicon Photonics for Energy-Efficient Deep Learning
    Sunny, Febin
    Nikdast, Mandi
    Pasricha, Sudeep
    27TH ASIA AND SOUTH PACIFIC DESIGN AUTOMATION CONFERENCE, ASP-DAC 2022, 2022, : 214 - 219
  • [48] Energy-Efficient Dataflow Scheduling of CNN Applications for Vector-SIMD DSP
    Kim, Wontae
    Lee, Sangheon
    Yun, Ilwi
    Lee, Chulhee
    Lee, Kyujoong
    Lee, Hyuk-Jae
    IEEE ACCESS, 2022, 10 : 86234 - 86247
  • [49] Reinforcement learning based methodology for energy-efficient resource allocation in cloud data centers
    Thein, Thandar
    Myo, Myint Myat
    Parvin, Sazia
    Gawanmeh, Amjad
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2020, 32 (10) : 1127 - 1139
  • [50] Energy-Efficient Virtual Machines Consolidation in Cloud Data Centers using Reinforcement Learning
    Farahnakian, Fahimeh
    Liljeberg, Pasi
    Plosila, Juha
    2014 22ND EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2014), 2014, : 500 - 507