Alignment of High-Throughput Sequencing Data Inside In-Memory Databases

Cited by: 3
Authors
Firnkorn, Daniel [1]
Knaup-Gregori, Petra [1]
Bermejo, Justo Lorenzo [1]
Ganzinger, Matthias [1]
Affiliation
[1] Inst Med Biometry & Informat, Heidelberg, Germany
Keywords
In-Memory-Technology; DNA-Alignment; HANA; high-throughput sequencing; stored procedures;
DOI
10.3233/978-1-61499-432-9-476
Chinese Library Classification
TP39 [Computer Applications];
Discipline classification code
081203; 0835;
Abstract
In an era of high-throughput DNA sequencing, high-performance analysis of DNA sequences is of great importance, yet computer-supported DNA analysis remains a time-consuming task. In this paper we explore the potential of a new in-memory database technology, SAP's High Performance Analytic Appliance (HANA). We focus on read alignment, one of the first steps in DNA sequence analysis. In particular, we examined the widely used Burrows-Wheeler Aligner (BWA) and implemented stored procedures in both HANA and the free database system MySQL to compare execution time and memory management. To keep the results comparable, MySQL was also run in memory, using its integrated memory engine for database table creation. The stored procedures implement exact and inexact searching of DNA reads within the reference genome GRCh37. Because of technical restrictions in SAP HANA concerning recursion, inexact matching could not be implemented on this platform. The performance comparison between HANA and MySQL was therefore based on the execution time of the exact search procedures. Here, HANA was approximately 27 times faster than MySQL, which indicates a high potential for the new in-memory concepts and may lead to further development of DNA analysis procedures in the future.
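For readers unfamiliar with the exact search step, the following is a minimal Python sketch of backward search over a Burrows-Wheeler-transformed reference, the general technique that BWA-style aligners build on. It is not the authors' stored procedure: the function names `build_fm_index` and `exact_search` are hypothetical, the index construction is deliberately naive, and the toy reference string stands in for GRCh37. Note that exact matching needs only a simple loop, whereas inexact matching is commonly formulated recursively, which is consistent with the restriction reported above.

```python
from collections import Counter


def build_fm_index(reference):
    """Build a naive FM-index (suffix array, C table, Occ table).

    Illustrative only: sorting full suffixes is far too slow for a genome.
    """
    text = reference + "$"  # sentinel, sorts before A, C, G and T
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT is the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in suffix_array)

    # c_table[ch]: number of characters in text strictly smaller than ch.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]

    # occ[ch][i]: number of occurrences of ch in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, b in enumerate(bwt):
        for ch in occ:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == b else 0)
    return suffix_array, c_table, occ


def exact_search(read, index):
    """Return sorted 0-based start positions of read in the reference."""
    suffix_array, c_table, occ = index
    if not read:
        return []
    lo, hi = 0, len(suffix_array)  # half-open range of suffix-array rows
    # Process the read from its last character to its first.
    for ch in reversed(read):
        if ch not in c_table:
            return []  # character never occurs in the reference
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []  # no suffix starts with the current pattern
    return sorted(suffix_array[lo:hi])


index = build_fm_index("ACGTACGTTAGC")  # toy reference, not GRCh37
print(exact_search("ACG", index))
```

In a database setting, the C and Occ tables would presumably be stored as tables and the loop expressed in the procedure language; that mapping is an assumption here and is not described in the abstract.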
Pages: 476-480
Page count: 5