The cost of maintaining the exabytes of data produced by sequencing experiments every year has become a major issue in today’s genomic research. Despite the increasing popularity of third-generation sequencing, existing algorithms for compressing long reads offer only a minor advantage over general-purpose gzip. We present CoLoRd, an algorithm able to reduce the size of third-generation sequencing data by an order of magnitude without affecting the accuracy of downstream analyses.