FLNA: An Energy-Efficient Point Cloud Feature Learning Accelerator with Dataflow Decoupling

Cited by: 1

Authors:

Lyu, Dongxu [1]

Li, Zhenyu [1]

Chen, Yuzhou [1]

Xu, Ningyi [1,3]

He, Guanghui [1,2,3]

Affiliations:

[1] Shanghai Jiao Tong Univ, Sch Elect Informat & Elect Engn, Shanghai, Peoples R China

[2] Shanghai Jiao Tong Univ, AI Inst, MoE, Key Lab Artificial Intelligence, Shanghai, Peoples R China

[3] Huixi Technol, Chongqing, Peoples R China

Source:

2023 60TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, DAC | 2023

Funding:

U.S. National Science Foundation;

Keywords:

Point Cloud; Feature Learning Accelerator; Algorithm-architecture Co-design; Sparsity Exploitation;

DOI:

10.1109/DAC56929.2023.10247674

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Grid-based feature learning networks play a key role in recent point-cloud-based 3D perception. However, high point sparsity and special operators lead to a large memory footprint and long processing latency, posing great challenges to hardware acceleration. We propose FLNA, a novel feature learning accelerator with algorithm-architecture co-design. At the algorithm level, a dataflow-decoupled graph is adopted to cut computation by 86% through exploiting inherent sparsity and concat redundancy. At the hardware design level, we customize a pipelined architecture with block-wise processing and introduce a transposed SRAM strategy to save 82.1% of access power. Implemented in a 40nm technology, FLNA achieves a 13.4-43.3x speedup over an RTX 2080Ti GPU. It outperforms the state-of-the-art accelerator with a 1.21x energy-efficiency improvement and a 50.8% latency reduction.


Pages: 6

50 items in total

[1] FLNA: Flexibly Accelerating Feature Learning Networks for Large-Scale Point Clouds With Efficient Dataflow Decoupling
Lyu, Dongxu
Li, Zhenyu
Chen, Yuzhou
Wang, Gang
He, Weifeng
Xu, Ningyi
He, Guanghui
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2024, 32 (04) : 739 - 751
[2] EWS: An Energy-Efficient CNN Accelerator With Enhanced Weight Stationary Dataflow
Wang, Chengxuan
Wang, Zongsheng
Li, Shuaiting
Zhang, Yuanming
Shen, Haibin
Huang, Kejie
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2024, 71 (07) : 3478 - 3482
[3] An Energy-Efficient Accelerator Architecture with Serial Accumulation Dataflow for Deep CNNs
Ahmadi, Mehdi
Vakili, Shervin
Langlois, J. M. Pierre
2020 18TH IEEE INTERNATIONAL NEW CIRCUITS AND SYSTEMS CONFERENCE (NEWCAS'20), 2020, : 214 - 217
[4] Point-to-Spike Residual Learning for Energy-Efficient 3D Point Cloud Classification
Wu, Qiaoyun
Zhang, Quanxiao
Tan, Chunyu
Zhou, Yun
Sun, Changyin
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6092 - 6099
[5] An Energy-Efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning
Shiri, Aidin
Prakash, Bharat
Mazumder, Arnab Neelim
Waytowich, Nicholas R.
Oates, Tim
Mohsenin, Tinoosh
2021 IEEE 3RD INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE CIRCUITS AND SYSTEMS (AICAS), 2021,
[6] Energy and Bandwidth Efficient Sparse Programmable Dataflow Accelerator
Schneider, Felix
Karagounis, Michael
Choubey, Bhaskar
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2024, 71 (09) : 4092 - 4105
[7] An Efficient FPGA Accelerator for Point Cloud
Wang, Zilun
Mao, Wendong
Yang, Peixiang
Wang, Zhongfeng
Lin, Jun
2022 IEEE 35TH INTERNATIONAL SYSTEM-ON-CHIP CONFERENCE (IEEE SOCC 2022), 2022, : 310 - 315
[8] PointAcc: Efficient Point Cloud Accelerator
Lin, Yujun
Zhang, Zhekai
Tang, Haotian
Wang, Hanrui
Han, Song
PROCEEDINGS OF 54TH ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, MICRO 2021, 2021, : 449 - 461
[9] Exploration of Energy-Efficient Architecture for Graph-Based Point-Cloud Deep Learning
Zhang, Jie-Fang
Zhang, Zhengya
2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021, : 260 - 264
[10] An Energy-Efficient Programmable Mixed Signal Accelerator for Machine Learning Algorithms
Kang, Mingu
Srivastava, Prakalp
Adve, Vikram
Kim, Nam Sung
Shanbhag, Naresh R.
IEEE MICRO, 2019, 39 (05) : 64 - 72
