Optimizing FPGA-based Accelerator Design for Large-Scale Molecular Similarity Search (Special Session Paper)

Cited by: 4

Authors:

Peng, Hongwu [1]

Chen, Shiyang [2]

Wang, Zhepeng [3]

Yang, Junhuan [4]

Weitze, Scott A. [2]

Geng, Tong [5]

Li, Ang [5]

Bi, Jinbo [1]

Song, Minghu [1]

Jiang, Weiwen [3]

Liu, Hang [2]

Ding, Caiwen [1]

Affiliations:

[1] Univ Connecticut, Storrs, CT 06269 USA

[2] Stevens Inst Technol, Hoboken, NJ 07030 USA

[3] George Mason Univ, Fairfax, VA 22030 USA

[4] Univ New Mexico, Albuquerque, NM 87131 USA

[5] Pacific Northwest Natl Lab, Richland, WA 99352 USA

Source:

2021 IEEE/ACM INTERNATIONAL CONFERENCE ON COMPUTER AIDED DESIGN (ICCAD) | 2021

Keywords:

ALGORITHMS;

DOI:

10.1109/ICCAD51958.2021.9643528

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Molecular similarity search is widely used in drug discovery to rapidly identify structurally similar compounds in large molecular databases. As chemical libraries grow, there is increasing interest in efficiently accelerating large-scale similarity search. Existing work mainly uses CPUs and GPUs to accelerate computation of the Tanimoto coefficient, which measures the pairwise similarity between molecular fingerprints. In this paper, we propose and optimize an FPGA-based accelerator design for both exhaustive and approximate search algorithms. For exhaustive search using BitBound and folding, we analyze how the similarity cutoff and folding level relate to search speedup and accuracy, and we propose a scalable on-the-fly query engine on FPGAs that reduces resource utilization and the pipeline interval. We achieve a processing throughput of 450 million compounds per second for a single query engine. For approximate search using hierarchical navigable small world (HNSW), a popular algorithm with high recall and query speed, we propose an FPGA-based graph traversal engine that uses a high-throughput, register-array-based priority queue and a fine-grained distance calculation engine to increase processing capability. Experimental results show that the proposed FPGA-based HNSW implementation reaches 103,385 queries per second (QPS) on the ChEMBL database with 0.92 recall, an average 35x speedup over the existing CPU implementation. To the best of our knowledge, our FPGA-based implementation is the first attempt to accelerate molecular similarity search algorithms on FPGAs, and it has the highest performance among existing approaches.
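As an illustration of the exhaustive-search ideas named in the abstract, the following is a minimal software sketch, not the paper's FPGA design. It assumes fingerprints are stored as Python integers used as bit vectors, and the function names (`tanimoto`, `fold`, `bitbound_search`) are hypothetical. The pruning rule relies on the standard bound that the Tanimoto coefficient cannot exceed the ratio of the smaller to the larger bit count, so a candidate can only reach a cutoff c if its bit count lies between c times and 1/c times the query's bit count. Folding is shown as an OR of equal-width sections, which shortens fingerprints at some cost in accuracy.

```python
from typing import List, Tuple


def popcount(x: int) -> int:
    """Number of set bits in a fingerprint stored as an int."""
    return bin(x).count("1")


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: |A AND B| / |A OR B|."""
    union = popcount(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints score 0
    return popcount(a & b) / union


def fold(fp: int, n_bits: int, level: int) -> int:
    """Fold an n_bits fingerprint by OR-ing `level` equal-width sections."""
    if n_bits % level != 0:
        raise ValueError("n_bits must be divisible by the folding level")
    width = n_bits // level
    mask = (1 << width) - 1
    folded = 0
    for i in range(level):
        folded |= (fp >> (i * width)) & mask
    return folded


def bitbound_search(query: int, database: List[int],
                    cutoff: float) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs with similarity >= cutoff (cutoff > 0).

    Candidates whose bit count falls outside [cutoff * |Q|, |Q| / cutoff]
    cannot reach the cutoff, so they are skipped without a full comparison.
    """
    q_cnt = popcount(query)
    lo, hi = q_cnt * cutoff, q_cnt / cutoff
    hits = []
    for idx, fp in enumerate(database):
        cnt = popcount(fp)
        if cnt < lo or cnt > hi:
            continue  # pruned by the bit-count bound
        sim = tanimoto(query, fp)
        if sim >= cutoff:
            hits.append((idx, sim))
    return hits


if __name__ == "__main__":
    db = [0b11110000, 0b11100000, 0b00000001, 0b11111111]
    print(bitbound_search(0b11110000, db, cutoff=0.7))
```

In this sketch the bit-count test removes the third and fourth database entries before any AND/OR popcount is computed, which is the source of the speedup as the cutoff rises.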


Pages: 7
