GPU-acceleration of the distributed-memory database peptide search of mass spectrometry data

Authors
Haseeb, Muhammad [1]
Saeed, Fahad [1, 2, 3]
Affiliations
[1] Florida Int Univ FIU, Knight Fdn Sch Comp & Informat Sci, Miami, FL 33199 USA
[2] Biomol Sci Inst BSI, Miami, FL 33199 USA
[3] Florida Int Univ, Herbert Wertheim Sch Med, Dept Human & Mol Genet, Miami, FL 33199 USA
Funding
U.S. National Science Foundation; U.S. National Institutes of Health
Keywords
TANDEM; IDENTIFICATION; SEQUENCES; ULTRAFAST;
DOI
10.1038/s41598-023-43033-w
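Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]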
Discipline classification codes
07; 0710; 09
Abstract
Database peptide search is the primary computational technique for identifying peptides from mass spectrometry (MS) data. Graphics processing unit (GPU) computing is now ubiquitous in current-generation high-performance computing (HPC) systems, yet its application in the database peptide search domain remains limited. Part of the reason is that existing GPU-accelerated methods use sub-optimal algorithms, which results in inefficient hardware utilization. In this paper, we design and implement a new-age CPU-GPU HPC framework, called GiCOPS, for efficient and complete GPU acceleration of modern database peptide search algorithms on supercomputers. Our experiments show that GiCOPS is between 1.2x and 5x faster than its CPU-only predecessor, HiCOPS, and over 10x faster than several existing GPU-based database search algorithms for sufficiently large experiment sizes. We further assess and optimize the performance of our framework using the Roofline model and report near-optimal results for several metrics, including computations per second, occupancy rate, memory workload, branch efficiency, and shared memory performance. Finally, the CPU-GPU methods and optimizations proposed in our work for complex integer- and memory-bound algorithmic pipelines can also be extended to accelerate existing and future peptide identification algorithms. GiCOPS is now integrated with our umbrella HPC framework HiCOPS and is available at: https://github.com/pcdslab/gicops.
Pages: 14