Scalable hardware architecture for fast gradient boosted tree training

Cited by: 0
Authors
Sadasue T. [1 ,2 ]
Tanaka T. [1 ]
Kasahara R. [1 ]
Darmawan A. [2 ]
Isshiki T. [2 ]
Affiliations
[1] Innovation R&D Division, RICOH Company, Ebina, Kanagawa
[2] Information and Communications Engineering, Tokyo Institute of Technology, Ohta, Tokyo
Source
IPSJ Transactions on System LSI Design Methodology, 2021, Vol. 14
Keywords
Acceleration; FPGA; Gradient Boosted Tree; Hardware description language; Machine learning
DOI
10.2197/IPSJTSLDM.14.11
Abstract
Gradient Boosted Tree is a powerful machine learning method that supports both classification and regression and is widely used in fields requiring high-precision prediction, particularly on various kinds of tabular data sets. Owing to the recent growth in data sizes and attribute counts, and the demand for frequent model updates, fast and efficient training is required. FPGAs are suitable for power-efficient acceleration because they can realize domain-specific hardware architectures; however, such an architecture must flexibly support many hyper-parameters to adapt to various dataset sizes, dataset properties, and system limitations such as memory and logic capacity. We introduce a fully pipelined hardware implementation of Gradient Boosted Tree training and a design framework that enables a versatile hardware system description with high performance and the flexibility to realize highly parameterized machine learning models. Experimental results show that our FPGA implementation achieves 11- to 33-times faster performance and more than 300-times higher power efficiency than a state-of-the-art GPU-accelerated software implementation. © 2021 Information Processing Society of Japan.
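For orientation, below is a minimal software sketch of the gradient-boosting training loop that such accelerators target: for squared-error regression, each round fits a small tree to the current residuals (the negative gradient of the loss) and adds a scaled copy to the running prediction. The helper names fit_gbt and predict_gbt are hypothetical, and this plain Python/scikit-learn version illustrates the generic algorithm only, not the paper's pipelined FPGA design.

# Minimal sketch of gradient boosted tree training (squared-error
# regression). Illustrative only; not the paper's FPGA implementation.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbt(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    # For squared error, the negative gradient of the loss with respect
    # to the current prediction is (y - prediction), so each round fits
    # a small tree to the residuals and adds it with a learning rate.
    base = y.mean()                     # initial constant model
    prediction = np.full(len(y), base)
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction      # negative gradient of 0.5*(y - f)^2
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)
        prediction += learning_rate * tree.predict(X)
        trees.append(tree)
    return base, trees

def predict_gbt(base, trees, X, learning_rate=0.1):
    # Sum the base value and the scaled contributions of all trees.
    pred = np.full(X.shape[0], base)
    for tree in trees:
        pred += learning_rate * tree.predict(X)
    return pred

# Tiny usage example on synthetic data.
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(500, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
base, trees = fit_gbt(X, y)
print("train MSE:", np.mean((predict_gbt(base, trees, X) - y) ** 2))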
Pages: 11-20
Related papers
50 records in total
  • [31] Scalable and fast SVM regression using modern hardware
    Wen, Zeyi
    Zhang, Rui
    Ramamohanarao, Kotagiri
    Yang, Li
    WORLD WIDE WEB, 2018, 21: 261-287
  • [32] Hollow-tree super: A directional and scalable approach for feature importance in boosted tree models
    Doyen, Stephane
    Taylor, Hugh
    Nicholas, Peter
    Crawford, Lewis
    Young, Isabella
    Sughrue, Michael E.
    PLOS ONE, 2021, 16(10)
  • [33] Squirrel: A Scalable Secure Two-Party Computation Framework for Training Gradient Boosting Decision Tree
    Lu, Wen-Jie
    Huang, Zhicong
    Zhang, Qizhi
    Wang, Yuchen
    Hong, Cheng
    PROCEEDINGS OF THE 32ND USENIX SECURITY SYMPOSIUM, 2023: 6435-6451
  • [34] Scalable Hardware Accelerator for Mini-Batch Gradient Descent
    Rasoori, Sandeep
    Akella, Venkatesh
    PROCEEDINGS OF THE 2018 GREAT LAKES SYMPOSIUM ON VLSI (GLSVLSI'18), 2018: 159-164
  • [35] Novel hardware architecture for fast address lookups
    Mehrotra, P
    Franzon, PD
    IEEE COMMUNICATIONS MAGAZINE, 2002, 40(11): 66-71
  • [36] Novel hardware architecture for fast address lookups
    Mehrotra, P
    Franzon, PD
    HPSR 2002: WORKSHOP ON HIGH PERFORMANCE SWITCHING AND ROUTING, PROCEEDINGS: MERGING OPTICAL AND IP TECHNOLOGIES, 2002: 105-110
  • [37] A scalable hardware architecture to support applications of the HAIPE 3.1 standard
    Boorman, Brian C.
    Mackey, Christopher D.
    Kurdziel, Michael T.
    2007 IEEE MILITARY COMMUNICATIONS CONFERENCE, VOLS 1-8, 2007: 711-718
  • [38] A Scalable GPT-2 Inference Hardware Architecture on FPGA
    Yemme, Anil
    Garani, Shayan Srinivasa
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2023
  • [39] Scalable Hardware Architecture for Invertible Logic with Sparse Hamiltonian Matrices
    Onizawa, Naoya
    Tamakoshi, Akira
    Hanyu, Takahiro
    2021 IEEE WORKSHOP ON SIGNAL PROCESSING SYSTEMS (SIPS 2021), 2021: 223-228
  • [40] A Multi-Application, Scalable and Adaptable Hardware SOM Architecture
    Abadi, Mehdi
    Jovanovic, Slavisa
    Ben Khalifa, Khaled
    Weber, Serge
    Bedoui, Mohamed Hedi
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019