Scalable hardware architecture for fast gradient boosted tree training

Cited by: 0
Authors
Sadasue T. [1,2]
Tanaka T. [1]
Kasahara R. [1]
Darmawan A. [2]
Isshiki T. [2]
Affiliations
[1] Innovation R&D Division, RICOH Company, Ebina, Kanagawa
[2] Information and Communications Engineering, Tokyo Institute of Technology, Ohta, Tokyo
Keywords
Acceleration; FPGA; Gradient Boosted Tree; Hardware description language; Machine learning
DOI
10.2197/IPSJTSLDM.14.11
Abstract
Gradient Boosted Tree is a powerful machine learning method that supports both classification and regression and is widely used in fields requiring high-precision prediction, particularly on various kinds of tabular data sets. Owing to recent growth in data sizes and attribute counts, together with the demand for frequent model updates, fast and efficient training is required. FPGAs are well suited to power-efficient acceleration because they can realize domain-specific hardware architectures; however, the design must flexibly support many hyper-parameters to adapt to varying dataset sizes, dataset properties, and system limitations such as memory capacity and logic capacity. We introduce a fully pipelined hardware implementation of Gradient Boosted Tree training, together with a design framework that enables a versatile hardware system description with the performance and flexibility needed to realize highly parameterized machine learning models. Experimental results show that our FPGA implementation achieves 11 to 33 times faster training and more than 300 times higher power efficiency than a state-of-the-art GPU-accelerated software implementation. © 2021 Information Processing Society of Japan.
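The training procedure the abstract refers to is the standard gradient-boosting loop: each round fits a new tree to the negative gradient of the loss at the current ensemble prediction, then adds that tree's scaled output to the running prediction. As a rough algorithmic outline only (a software sketch of the textbook squared-error case, not the paper's pipelined FPGA design), using scikit-learn's DecisionTreeRegressor as an assumed base learner and hypothetical helper names train_gbt/predict_gbt:

```python
# Illustrative sketch of gradient-boosted tree training (squared-error loss).
# NOT the paper's FPGA implementation; a plain software outline for reference.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def train_gbt(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    """Fit an ensemble of regression trees, each trained on the residual
    (the negative gradient of 0.5 * (y - pred)^2) of the current prediction."""
    f0 = float(np.mean(y))           # initial constant prediction
    pred = np.full(len(y), f0)
    trees = []
    for _ in range(n_trees):
        residual = y - pred          # negative gradient at current prediction
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        pred += learning_rate * tree.predict(X)
        trees.append(tree)
    return f0, learning_rate, trees

def predict_gbt(model, X):
    """Sum the base prediction and the scaled outputs of all trees."""
    f0, lr, trees = model
    pred = np.full(X.shape[0], f0)
    for tree in trees:
        pred += lr * tree.predict(X)
    return pred

if __name__ == "__main__":
    # Synthetic regression data for a quick sanity check.
    rng = np.random.default_rng(0)
    X = rng.uniform(-3.0, 3.0, size=(500, 4))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
    model = train_gbt(X, y)
    print("train MSE:", np.mean((predict_gbt(model, X) - y) ** 2))
```

The per-round split search and tree construction inside this loop are the parts the paper accelerates in hardware; the loop above only shows the algorithm's overall structure.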
Pages: 11-20
Related papers
50 items in total
  • [1] Scalable Full Hardware Logic Architecture for Gradient Boosted Tree Training
    Sadasue, Tamon
    Isshiki, Tsuyoshi
    28TH IEEE INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2020, : 234 - 234
  • [2] Efficient Gradient Boosted Decision Tree Training on GPUs
    Wen, Zeyi
    He, Bingsheng
    Ramamohanarao, Kotagiri
    Lu, Shengliang
    Shi, Jiashuai
    2018 32ND IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2018, : 234 - 243
  • [3] SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems
    Iosipoi, Leonid
    Vakhrushev, Anton
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [4] Scalable training on scalable infrastructures for programmable hardware
    Lorusso, Marco
    Bonacorsi, Daniele
    Travaglini, Riccardo
    Salomoni, Davide
    Veronesi, Paolo
    Michelotto, Diego
    Mariotti, Mirko
    Bianchini, Giulio
    Costantini, Alessandro
    Duma, Doina Cristina
    26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY AND NUCLEAR PHYSICS, CHEP 2023, 2024, 295
  • [5] A 1096fps Hardware Architecture for Fast Training in Object Tracking
    Lv, Yun
    Mo, Huiyu
    Liu, Leibo
    Yin, Shouyi
    Wei, Shaojun
    Zhu, Wenping
    Li, Qiang
    2019 IEEE 11TH INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS (ICCSN 2019), 2019, : 258 - 262
  • [6] Scalable Feature Selection for (Multitask) Gradient Boosted Trees
    Han, Cuize
    Rao, Nikhil
    Sorokina, Daria
    Subbian, Karthik
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 885 - 893
  • [7] A highly regular and scalable AES hardware architecture
    Mangard, S
    Aigner, M
    Dominikus, S
    IEEE TRANSACTIONS ON COMPUTERS, 2003, 52 (04) : 483 - 491
  • [8] A new scalable hardware architecture for RSA algorithm
    Güdü, Tamer
    2007 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS, VOLS 1 AND 2, 2007, : 670 - 674
  • [9] A scalable hardware architecture for prime number validation
    Cheung, RCC
    Brown, A
    Luk, W
    Cheung, PYK
    2004 IEEE INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY, PROCEEDINGS, 2004, : 177 - 184
  • [10] Gradient hardware considerations for fast MRI
    Schmitt, F
    ULTRAFAST MAGNETIC RESONANCE IMAGING IN MEDICINE, 1999, 1192 : 3 - 10