Scalable hardware architecture for fast gradient boosted tree training

Cited by: 0
Authors
Sadasue T. [1 ,2 ]
Tanaka T. [1 ]
Kasahara R. [1 ]
Darmawan A. [2 ]
Isshiki T. [2 ]
Affiliations
[1] Innovation R&D Division, RICOH Company, Ebina, Kanagawa
[2] Information and Communications Engineering, Tokyo Institute of Technology, Ohta, Tokyo
Source
IPSJ Transactions on System LSI Design Methodology, 2021, Vol. 14
Keywords
Acceleration; FPGA; Gradient Boosted Tree; Hardware description language; Machine learning
DOI
10.2197/IPSJTSLDM.14.11
Abstract
Gradient Boosted Tree is a powerful machine learning method that supports both classification and regression and is widely used in fields requiring high-precision prediction, particularly on various kinds of tabular data sets. Owing to the recent growth in data sizes and attribute counts, and the demand for frequent model updates, fast and efficient training is required. FPGAs are suitable for power-efficient acceleration because they can realize domain-specific hardware architectures; however, such an architecture must flexibly support many hyper-parameters to adapt to various dataset sizes, dataset properties, and system limitations such as memory and logic capacity. We introduce a fully pipelined hardware implementation of Gradient Boosted Tree training and a design framework that enables a versatile hardware system description with high performance and the flexibility to realize highly parameterized machine learning models. Experimental results show that our FPGA implementation achieves 11- to 33-times faster performance and more than 300-times higher power efficiency than a state-of-the-art GPU-accelerated software implementation. © 2021 Information Processing Society of Japan.
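For orientation, below is a minimal software sketch of the gradient-boosting training loop that such accelerators target: for squared-error regression, each round fits a small tree to the current residuals (the negative gradient of the loss) and adds a scaled copy to the running prediction. The helper names fit_gbt and predict_gbt are hypothetical, and this plain Python/scikit-learn version illustrates the generic algorithm only, not the paper's pipelined FPGA design.

# Minimal sketch of gradient boosted tree training (squared-error
# regression). Illustrative only; not the paper's FPGA implementation.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbt(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    # For squared error, the negative gradient of the loss with respect
    # to the current prediction is (y - prediction), so each round fits
    # a small tree to the residuals and adds it with a learning rate.
    base = y.mean()                     # initial constant model
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction      # negative gradient of 0.5*(y - f)^2
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees

def predict_gbt(base, trees, X, learning_rate=0.1):
    # Sum the base value and the scaled contributions of all trees.
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
base, trees = fit_gbt(X, y)
print("train MSE:", np.mean((predict_gbt(base, trees, X) - y) ** 2))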
Pages: 11-20
Related papers
50 records in total
  • [31] Scalable and fast SVM regression using modern hardware
    Wen, Zeyi
    Zhang, Rui
    Ramamohanarao, Kotagiri
    Yang, Li
    WORLD WIDE WEB, 2018, 21: 261-287
  • [32] Hollow-tree super: A directional and scalable approach for feature importance in boosted tree models
    Doyen, Stephane
    Taylor, Hugh
    Nicholas, Peter
    Crawford, Lewis
    Young, Isabella
    Sughrue, Michael E.
    PLOS ONE, 2021, 16(10)
  • [33] Squirrel: A Scalable Secure Two-Party Computation Framework for Training Gradient Boosting Decision Tree
    Lu, Wen-Jie
    Huang, Zhicong
    Zhang, Qizhi
    Wang, Yuchen
    Hong, Cheng
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023: 6435-6451
  • [34] Scalable Hardware Accelerator for Mini-Batch Gradient Descent
    Rasoori, Sandeep
    Akella, Venkatesh
    PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018: 159-164
  • [35] Novel hardware architecture for fast address lookups
    Mehrotra, P
    Franzon, PD
    IEEE COMMUNICATIONS MAGAZINE, 2002, 40(11): 66-71
  • [36] Novel hardware architecture for fast address lookups
    Mehrotra, P
    Franzon, PD
    HPSR 2002: WORKSHOP ON HIGH PERFORMANCE SWITCHING AND ROUTING, PROCEEDINGS: MERGING OPTICAL AND IP TECHNOLOGIES, 2002: 105-110
  • [37] A scalable hardware architecture to support applications of the HAIPE 3.1 standard
    Boorman, Brian C.
    Mackey, Christopher D.
    Kurdziel, Michael T.
    2007 IEEE MILITARY COMMUNICATIONS CONFERENCE, VOLS 1-8, 2007: 711-718
  • [38] A Scalable GPT-2 Inference Hardware Architecture on FPGA
    Yemme, Anil
    Garani, Shayan Srinivasa
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2023
  • [39] Scalable Hardware Architecture for Invertible Logic with Sparse Hamiltonian Matrices
    Onizawa, Naoya
    Tamakoshi, Akira
    Hanyu, Takahiro
    2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021: 223-228
  • [40] A Multi-Application, Scalable and Adaptable Hardware SOM Architecture
    Abadi, Mehdi
    Jovanovic, Slavisa
    Ben Khalifa, Khaled
    Weber, Serge
    Bedoui, Mohamed Hedi
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019