Next Generation Indexing for Genomic Intervals

Cited by: 6
Authors
Jalili, Vahid [1]
Matteucci, Matteo [1]
Goecks, Jeremy [2]
Deldjoo, Yashar [1]
Ceri, Stefano [1]
Affiliations
[1] Politecn Milan, DEIB, I-20133 Milan, Italy
[2] Oregon Hlth & Sci Univ, Dept Biomed Engn, Portland, OR 97239 USA
Keywords
Index structures; efficient query processing; genomic data management; variable-length queries
DOI
10.1109/TKDE.2018.2871031
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One-dimensional intervals incremental inverted index (Di4) is a multi-resolution, single-dimension indexing framework for efficient, scalable, and extensible computation of genomic interval expressions. The framework has a tri-layer architecture: the semantic layer provides orthogonal and generic means (including support for user-defined functions) of sense-making and higher-level reasoning from region-based datasets; the logical layer provides building blocks for region calculus and topological relations between intervals; the physical layer abstracts from the persistence technology and makes the model adaptable to a variety of persistence technologies, ranging from small-scale (e.g., B+tree) to large-scale (e.g., LevelDB). The extensibility of Di4 to application scenarios is shown with an example of comparative evaluation of ChIP-seq and DNase-seq replicates. The performance of Di4 is benchmarked at small and large scales under common bioinformatics application scenarios. Di4 is freely available from https://genometric.github.io/Di4.
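The abstract does not describe Di4's data structures, so the following is only a minimal sketch of the general idea of an incrementally built, one-dimensional inverted index over intervals; it is not the Di4 implementation or its API. The class name `IntervalInvertedIndex`, its methods, and the use of in-memory Python lists (in place of a B+tree or LevelDB back end) are all assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right


class IntervalInvertedIndex:
    """Toy 1-D inverted index (hypothetical, not Di4's API).

    Each key is a coordinate; the set stored with it holds the ids of all
    intervals covering the half-open segment [key, next_key).
    """

    def __init__(self):
        self.keys = []    # sorted breakpoints
        self.active = []  # active[i]: ids covering [keys[i], keys[i + 1])

    def _ensure_key(self, pos):
        """Return the index of breakpoint `pos`, creating it if needed."""
        i = bisect_left(self.keys, pos)
        if i < len(self.keys) and self.keys[i] == pos:
            return i
        # A new breakpoint splits an existing segment and inherits its ids.
        inherited = set(self.active[i - 1]) if i > 0 else set()
        self.keys.insert(i, pos)
        self.active.insert(i, inherited)
        return i

    def add(self, interval_id, start, end):
        """Incrementally index the half-open interval [start, end)."""
        if start >= end:
            raise ValueError("start must be smaller than end")
        lo = self._ensure_key(start)
        hi = self._ensure_key(end)
        for i in range(lo, hi):
            self.active[i].add(interval_id)

    def stab(self, pos):
        """Return the ids of all intervals that contain position `pos`."""
        i = bisect_right(self.keys, pos) - 1
        return set(self.active[i]) if i >= 0 else set()

    def regions_with_coverage(self, min_count):
        """Return maximal regions covered by at least `min_count` intervals."""
        regions = []
        for i in range(len(self.keys) - 1):
            if len(self.active[i]) >= min_count:
                start, end = self.keys[i], self.keys[i + 1]
                if regions and regions[-1][1] == start:
                    regions[-1] = (regions[-1][0], end)  # extend previous
                else:
                    regions.append((start, end))
        return regions


index = IntervalInvertedIndex()
index.add("a", 10, 20)
index.add("b", 15, 30)
index.add("c", 18, 25)
print(sorted(index.stab(19)))          # ['a', 'b', 'c']
print(index.regions_with_coverage(2))  # [(15, 25)]
```

Storing the active set at every breakpoint trades memory for fast point and coverage queries; a persistent key-value store could hold the same `keys`/`active` mapping, which is the kind of substitution the physical layer in the abstract is said to allow.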
Pages: 2008-2021
Number of pages: 14