Hardware acceleration of BWA-MEM genomic short read mapping for longer read lengths

Cited by: 86
Authors
Houtgast, Ernst Joachim [1, 2]
Sima, Vlad-Mihai [2]
Bertels, Koen [1]
Al-Ars, Zaid [1]
Affiliations
[1] Delft Univ Technol, Comp Engn Lab, Mekelweg 4, NL-2628 CD Delft, Netherlands
[2] Bluebee, Laan Zuid Hoorn 57, NL-2289 DC Rijswijk, Netherlands
Keywords
Acceleration; BWA-MEM; FPGA; GPU; Short read mapping; Systolic array
DOI
10.1016/j.compbiolchem.2018.03.024
Chinese Library Classification
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
We present our work on hardware-accelerated genomics pipelines, using either FPGAs or GPUs to accelerate execution of BWA-MEM, a widely used algorithm for genomic short read mapping. The mapping stage can take up to 40% of overall processing time in genomics pipelines. Our implementation offloads the Seed Extension function, one of the main computational functions of BWA-MEM, onto an accelerator. Sequencers typically output reads with a length of 150 base pairs, but read length is expected to increase in the near future. Here, we investigate the influence of read length on BWA-MEM performance using data sets with read lengths of up to 400 base pairs, and introduce methods to mitigate the impact of longer reads. For the industry-standard 150 base pair read length, our implementation achieves up to a two-fold increase in overall application-level performance on systems with at most twenty-two logical CPU cores. Longer reads require commensurately larger data structures, which directly affects accelerator efficiency. The two-fold performance increase is sustained for read lengths of at most 250 base pairs. To improve performance, we classify the sources of inefficiency in the underlying systolic array architecture. By eliminating idle regions as much as possible, efficiency is improved by up to 95%. Moreover, adaptive load balancing distributes work between host and accelerator so that using an accelerator always results in a performance improvement, which in GPU-constrained scenarios provides up to 45% more performance. © 2018 Elsevier Ltd. All rights reserved.
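The link the abstract draws between read length and systolic array efficiency can be illustrated with a small model. The sketch below is not the authors' classification of inefficiency. It assumes a textbook linear systolic array in which each processing element (PE) handles one query column of the dynamic-programming matrix and the target sequence is streamed through. The function name `systolic_utilization` and its parameters are hypothetical.

```python
def systolic_utilization(num_pes: int, query_len: int, target_len: int) -> float:
    """Fraction of PE-cycles doing useful work in an assumed linear systolic array.

    Assumed model (not taken from the paper):
      * one PE per query column, so query_len must not exceed num_pes;
      * the target streams through one symbol per cycle, so a pass takes
        target_len + query_len - 1 cycles including pipeline fill and drain;
      * useful work is one cell update per (query, target) symbol pair.
    """
    if num_pes <= 0 or query_len <= 0 or target_len <= 0:
        raise ValueError("all sizes must be positive")
    if query_len > num_pes:
        raise ValueError("query does not fit in the array in a single pass")

    cycles = target_len + query_len - 1      # fill + steady state + drain
    useful_updates = query_len * target_len  # cells of the DP matrix
    total_pe_cycles = num_pes * cycles       # capacity offered by the array
    return useful_updates / total_pe_cycles


if __name__ == "__main__":
    # An array sized for long reads is mostly idle on short extensions.
    for query_len in (25, 100, 400):
        u = systolic_utilization(num_pes=400, query_len=query_len, target_len=query_len)
        print(f"query/target length {query_len:3d}: utilization {u:.3f}")
```

Under these assumptions, an array dimensioned for the longest supported read leaves most PEs idle when the extension is short, which is one plausible reading of the "idle regions" the abstract mentions.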
Pages: 54-64
Page count: 11