Large-Scale Pairwise Alignments on GPU Clusters: Exploring the Implementation Space

Authors
Huan Truong
Da Li
Kittisak Sajjapongse
Gavin Conant
Michela Becchi
Affiliations
[1] University of Missouri, MU Informatics Institute
[2] University of Missouri, Department of Electrical and Computer Engineering
[3] University of Missouri, Division of Animal Sciences
Keywords
Heterogeneous system; Sequence alignment; GPU
DOI
Not available
Abstract
Several problems in computational biology require the all-against-all pairwise comparisons of tens of thousands of individual biological sequences. Each such comparison can be performed with the well-known Needleman-Wunsch alignment algorithm. However, with the rapid growth of biological databases, performing all possible comparisons with this algorithm in serial becomes extremely time-consuming. The massive computational power of graphics processing units (GPUs) makes them an appealing choice for accelerating these computations. As such, CPU-GPU clusters can enable all-against-all comparisons on large datasets. In this work, we present four GPU implementations for large-scale pairwise sequence alignment: TiledDScan-mNW, DScan-mNW, RScan-mNW and LazyRScan-mNW. The proposed GPU kernels exhibit different parallelization patterns: we discuss how each parallelization strategy affects the memory accesses and the utilization of the underlying GPU hardware. We evaluate our implementations on a variety of low- and high-end GPUs with different compute capabilities. Our results show that all the proposed solutions outperform the existing open-source implementation from the Rodinia Benchmark Suite, and LazyRScan-mNW is the preferred solution for applications that require performing the trace-back operation only on a subset of the considered sequence pairs (for example, the pairs whose alignment score exceeds a predefined threshold). Finally, we discuss the integration of the proposed GPU kernels into a hybrid MPI-CUDA framework for deployment on CPU-GPU clusters. In particular, our proposed distributed design targets both homogeneous and heterogeneous clusters with nodes that differ amongst themselves in their hardware configuration.
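To make the workload described in the abstract concrete, the following is a minimal serial sketch of all-against-all Needleman-Wunsch scoring followed by a threshold filter. It is not the paper's GPU code: the function names `nw_score` and `all_against_all`, the linear gap penalty, and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    Scoring values are illustrative assumptions, not taken from the paper.
    Only two rows of the score matrix are kept, so no trace-back is possible.
    """
    prev = [j * gap for j in range(len(b) + 1)]  # row 0: all gaps
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)  # column 0: all gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[-1]


def all_against_all(seqs, threshold):
    """Score every unordered pair and keep those whose score exceeds threshold."""
    hits = []
    for i, j in combinations(range(len(seqs)), 2):
        score = nw_score(seqs[i], seqs[j])
        if score > threshold:
            hits.append((i, j, score))
    return hits
```

For n sequences this loop performs n(n-1)/2 independent alignments, which is the part the abstract proposes to offload to GPUs. Only the pairs returned by `all_against_all` would then need the more expensive trace-back step, which is the usage scenario the abstract associates with LazyRScan-mNW.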
Pages: 131-149
Page count: 18