A High-Performance FPGA Accelerator for CUR Decomposition

Cited by: 1
Authors
Abdelgawad, M. A. A. [1]
Cheung, Ray C. C. [1]
Yan, Hong [1]
Affiliations
[1] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
Keywords
CUR decomposition; low-rank decomposition; high level synthesis; SVD and QR decomposition
DOI
10.1109/FPL57034.2022.00052
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Matrix factorization decomposes a matrix into a product of smaller matrices and is widely used in machine learning algorithms. Many matrix decomposition algorithms exist, each with its own applications. CUR matrix decomposition is a widely used factorization tool that has been employed for dimension reduction and pattern recognition in many scientific and engineering applications, such as image processing, text mining, and wireless communications. In this paper, we propose an efficient FPGA-based floating-point accelerator for the CUR decomposition algorithm, designed using high-level synthesis (HLS). Our experimental results show that the hardware design is more efficient than optimized CPU-based software solutions: the speedup of the FPGA-based architecture over the optimized software implementation ranges from 2.37 to 16.82 times, depending on the dimensions of the input data matrix. We evaluated the design on large matrices of dimensions 1024 x 1024 and 2048 x 2048, and the results demonstrate its efficiency in terms of resource utilization and latency. Finally, we compared our design with other matrix decomposition algorithms such as SVD and QR decomposition; the experimental results show that CUR is more efficient than SVD and QR decomposition in terms of latency and required resources.
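For readers unfamiliar with the factorization named in the abstract, the following is a minimal software sketch of a CUR decomposition in NumPy. It is not the authors' HLS design: the abstract does not state how columns and rows are selected, so the norm-based sampling rule, the function name `cur_decompose`, and the parameter `k` are assumptions made for illustration only. The sketch picks `k` actual columns (C) and `k` actual rows (R) of the input and computes the small linking matrix U from their pseudoinverses.

```python
import numpy as np


def cur_decompose(A, k, rng):
    """Return (C, U, R) with C @ U @ R approximating A, using k columns and k rows.

    Illustrative sketch only; the selection rule below is an assumption,
    not the method of the paper described in this record.
    """
    # Sampling probabilities proportional to squared column / row norms.
    col_p = np.sum(A**2, axis=0)
    col_p = col_p / col_p.sum()
    row_p = np.sum(A**2, axis=1)
    row_p = row_p / row_p.sum()

    # Draw k distinct column and row indices.
    cols = np.sort(rng.choice(A.shape[1], size=k, replace=False, p=col_p))
    rows = np.sort(rng.choice(A.shape[0], size=k, replace=False, p=row_p))

    C = A[:, cols]   # m x k: actual columns of A
    R = A[rows, :]   # k x n: actual rows of A
    # k x k linking matrix that minimizes ||A - C U R|| in the Frobenius norm.
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R


# Usage on a synthetic matrix of exact rank 5 (hypothetical test data).
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 5)) @ rng.standard_normal((5, 48))
C, U, R = cur_decompose(A, k=5, rng=rng)
rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
print(C.shape, U.shape, R.shape, rel_err)
```

Because C and R are taken directly from the input, the factors keep the meaning of the original data, which is one reason CUR is used for dimension reduction. When the selected columns and rows span the column and row spaces of the input, as is almost surely the case for the rank-5 example here, the product C U R reproduces the matrix up to rounding error.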
Pages: 294-299
Number of pages: 6
Related papers
50 in total
  • [1] High-Performance FPGA Accelerator for SIKE
    El Khatib, Rami
    Azarderakhsh, Reza
    Mozaffari-Kermani, Mehran
    IEEE TRANSACTIONS ON COMPUTERS, 2022, 71 (06) : 1237 - 1248
  • [2] A High-performance FPGA-based Accelerator for Gradient Compression
    Ren, Qingqing
    Zhu, Shuyong
    Meng, Xuying
    Zhang, Yujun
    DCC 2022: 2022 DATA COMPRESSION CONFERENCE (DCC), 2022, : 429 - 438
  • [3] High-Performance of Eigenvalue Decomposition on FPGA for the DOA Estimation
    Zhang, Xiao-Wei
    Yan, Di
    Zuo, Lei
    Li, Ming
    Guo, Jian-Xin
    IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2023, 72 (05) : 5782 - 5797
  • [4] High-Performance FPGA-based Accelerator for Bayesian Neural Networks
    Fan, Hongxiang
    Ferianc, Martin
    Rodrigues, Miguel
    Zhou, Hongyu
    Niu, Xinyu
    Luk, Wayne
    2021 58TH ACM/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2021, : 1063 - 1068
  • [5] A High-Performance FPGA-Based Depthwise Separable Convolution Accelerator
    Huang, Jiye
    Liu, Xin
    Guo, Tongdong
    Zhao, Zhijin
    ELECTRONICS, 2023, 12 (07)
  • [6] ADD: Accelerator Design and Deploy - A tool for FPGA high-performance dataflow computing
    Penha, Jeronimo C.
    Silva, Lucas B.
    Silva, Jansen M.
    Coelho, Kristtopher K.
    Baranda, Hector P.
    Nacif, Jose Augusto M.
    Ferreira, Ricardo S.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2019, 31 (18)
  • [7] High-Performance Mixed-Low-Precision CNN Inference Accelerator on FPGA
    Wang, Junbin
    Fang, Shaoxia
    Wang, Xi
    Ma, Jiangsha
    Wang, Taobo
    Shan, Yi
    IEEE MICRO, 2021, 41 (04) : 31 - 38
  • [8] Work-in-Progress: A High-performance FPGA Accelerator for Sparse Neural Networks
    Lu, Yuntao
    Gong, Lei
    Xu, Chongchong
    Sun, Fan
    Zhang, Yiwei
    Wang, Chao
    Zhou, Xuehai
    2017 INTERNATIONAL CONFERENCE ON COMPILERS, ARCHITECTURES AND SYNTHESIS FOR EMBEDDED SYSTEMS (CASES), 2017
  • [9] FPGA-Based High-Performance Data Compression Deep Neural Network Accelerator
    Wang, Hanze
    Fu, Yingxun
    Ma, Li
    2022 INTERNATIONAL CONFERENCE ON BIG DATA, INFORMATION AND COMPUTER NETWORK (BDICN 2022), 2022, : 563 - 569
  • [10] FPGA-based hardware accelerator for high-performance data-stream processing
    Lysakov K.F.
    Shadrin M.Y.
    Pattern Recognition and Image Analysis, 2013, 23 (1) : 26 - 34