A High-Performance FPGA Accelerator for CUR Decomposition

Cited by: 1

Authors:

Abdelgawad, M. A. A. [1]

Cheung, Ray C. C. [1]

Yan, Hong [1]

Affiliations:

[1] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China

Source:

2022 32ND INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS, FPL | 2022

Keywords:

CUR decomposition; low-rank decomposition; high level synthesis; SVD and QR decomposition

DOI:

10.1109/FPL57034.2022.00052

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Matrix factorization decomposes a matrix into a product of smaller matrices and is widely used in machine learning algorithms. Many matrix decomposition algorithms exist, each with its own applications. CUR matrix decomposition is a widely used factorization tool that has been employed for dimension reduction and pattern recognition in many scientific and engineering applications, such as image processing, text mining, and wireless communications. In this paper, we propose an efficient FPGA-based floating-point accelerator for the CUR decomposition algorithm, designed using high-level synthesis (HLS). Our experimental results demonstrate that our hardware design is more efficient than optimized CPU-based software solutions: the speedup of our FPGA-based architecture over the optimized software implementation ranges from 2.37x to 16.82x, depending on the dimensions of the input data matrix. We evaluated our design on large matrices of dimensions 1024 x 1024 and 2048 x 2048, and the experimental results demonstrate its efficiency in terms of resource utilization and latency. Finally, we compared our design with other matrix decomposition algorithms such as SVD and QR decomposition; the experimental results show that CUR is more efficient than SVD and QR decomposition in terms of latency and required resources.
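For readers unfamiliar with the factorization named in the abstract, the following is a minimal sketch of the general CUR idea: C holds selected columns of A, R holds selected rows, and U is a small linking matrix chosen so that C·U·R approximates A. It is not the paper's algorithm or its HLS design; the abstract does not say how rows and columns are selected or how U is computed. The sketch assumes the simplest textbook case, an exactly rank-2 matrix with hand-picked indices and U taken as the inverse of the 2x2 intersection block. The helper names `matmul`, `inv2`, and `cur_rank2` are hypothetical and exist only for this illustration.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product for matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def inv2(W):
    """Inverse of a 2x2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = W
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def cur_rank2(A, rows, cols):
    """Illustrative CUR for a rank-2 matrix (assumption: indices are hand-picked).

    C = selected columns of A, R = selected rows of A,
    U = inverse of the intersection block A[rows, cols].
    """
    C = [[r[j] for j in cols] for r in A]
    R = [list(A[i]) for i in rows]
    W = [[A[i][j] for j in cols] for i in rows]
    return C, inv2(W), R


# Rank-2 example: the third row is the sum of the first two.
A = [[Fraction(v) for v in row] for row in [[1, 2, 3], [4, 5, 6], [5, 7, 9]]]
C, U, R = cur_rank2(A, rows=[0, 1], cols=[0, 1])

# For an exactly rank-2 matrix with an invertible intersection block,
# the product C * U * R reproduces A without error.
A_rebuilt = matmul(matmul(C, U), R)
```

Exact rational arithmetic (`Fraction`) is used here only so that the reconstruction can be compared for equality; a practical implementation would work in floating point, use a pseudo-inverse for U, and choose rows and columns with a sampling or pivoting strategy.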


Pages: 294-299

Page count: 6
