Scalable in-memory processing of omics workflows

Authors
Elisseev, Vadim [1 ,2 ]
Gardiner, Laura-Jayne [1 ]
Krishna, Ritesh [1 ]
Affiliations
[1] Hartree Ctr, Daresbury Lab, IBM Res Europe, Keckwick Lane, Warrington WA4 4AD, England
[2] Wrexham Glyndwr Univ, Mold Rd, Wrexham LL11 2AW, Wales
Keywords
Bioinformatics; HPC; Key-value store; Machine learning; Cloud; Metagenomics
DOI
10.1016/j.csbj.2022.04.014
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
We present a proof of concept implementation of the in-memory computing paradigm that we use to facilitate the analysis of metagenomic sequencing reads. In doing so we compare the performance of POSIX™ file systems and key-value storage for omics data, and we show the potential for integrating high-performance computing (HPC) and cloud native technologies. We show that in-memory key-value storage offers possibilities for improved handling of omics data through more flexible and faster data processing. We envision fully containerized workflows and their deployment in portable micro pipelines with multiple instances working concurrently with the same distributed in-memory storage. To highlight the potential usage of this technology for event driven and real-time data processing, we use a biological case study focused on the growing threat of antimicrobial resistance (AMR). We develop a workflow encompassing bioinformatics and explainable machine learning (ML) to predict life expectancy of a population based on the microbiome of its sewage while providing a description of AMR contribution to the prediction. We propose that in future, performing such analyses in 'real-time' would allow us to assess the potential risk to the population based on changes in the AMR profile of the community. © 2022 The Author(s). Published by Elsevier B.V. on behalf of Research Network of Computational and Structural Biotechnology.
Pages: 1914-1924
Page count: 11
Related papers
50 in total
  • [1] Scalable in-memory processing of omics workflows
    Elisseev, Vadim
    Gardiner, Laura-Jayne
    Krishna, Ritesh
    [J]. Computational and Structural Biotechnology Journal, 2022, 20 : 1914 - 1924
  • [2] Scalable In-Memory Transaction Processing with HTM
    Wu, Yingjun
    Tan, Kian-Lee
    [J]. PROCEEDINGS OF USENIX ATC '16: 2016 USENIX ANNUAL TECHNICAL CONFERENCE, 2016, : 365 - 377
  • [3] CoREC: Scalable and Resilient In-memory Data Staging for In-situ Workflows
    Duan, Shaohua
    Subedi, Pradeep
    Davis, Philip
    Teranishi, Keita
    Kolla, Hemanth
    Gamell, Marc
    Parashar, Manish
    [J]. ACM TRANSACTIONS ON PARALLEL COMPUTING, 2020, 7 (02)
  • [4] SparkBLAST: scalable BLAST processing using in-memory operations
    Marcelo Rodrigo de Castro
    Catherine dos Santos Tostes
    Alberto M. R. Dávila
    Hermes Senger
    Fabricio A. B. da Silva
    [J]. BMC Bioinformatics, 18
  • [5] SparkBLAST: scalable BLAST processing using in-memory operations
    de Castro, Marcelo Rodrigo
    Tostes, Catherine dos Santos
    Davila, Alberto M. R.
    Senger, Hermes
    da Silva, Fabricio A. B.
    [J]. BMC BIOINFORMATICS, 2017, 18
  • [6] eAP: A Scalable and Efficient In-Memory Accelerator for Automata Processing
    Sadredini, Elaheh
    Rahimi, Reza
    Verma, Vaibhav
    Stan, Mircea
    Skadron, Kevin
    [J]. MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, : 87 - 99
  • [7] A Scalable and Efficient In-Memory Interconnect Architecture for Automata Processing
    Sadredini, Elaheh
    Rahimi, Reza
    Verma, Vaibhav
    Stan, Mircea
    Skadron, Kevin
    [J]. IEEE COMPUTER ARCHITECTURE LETTERS, 2019, 18 (02) : 87 - 90
  • [8] Scalable In-Memory Computing
    Uta, Alexandru
    Sandu, Andreea
    Costache, Stefania
    Kielmann, Thilo
    [J]. 2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, 2015, : 805 - 810
  • [9] Memory Processing Unit for In-Memory Processing
    Ben Hur, Rotem
    Kvatinsky, Shahar
    [J]. PROCEEDINGS OF THE 2016 IEEE/ACM INTERNATIONAL SYMPOSIUM ON NANOSCALE ARCHITECTURES (NANOARCH), 2016, : 171 - 172
  • [10] NewSQL Databases and Scalable In-Memory Analytics
    Duggirala, Siddhartha
    [J]. DEEP DIVE INTO NOSQL DATABASES: THE USE CASES AND APPLICATIONS, 2018, 109 : 49 - 76