G-FQZip: Lossless Reference-Based Compression of FASTQ files Using GPUs

Authors
Peng, Cong [1]
Deng, Qingjin [1]
Huang, Zhi-An [1]
Sun, Yiwen [2]
Zhu, Zexuan [1]
Affiliations
[1] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
[2] Shenzhen Univ, Sch Med, Shenzhen 518060, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
GPU acceleration; Reference-based DNA sequence compression; High-throughput sequencing; Lossless compression; Read alignment; Sequences
DOI
10.1109/CIS.2017.00128
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The exponentially growing volume of high-throughput sequencing data calls for efficient, specialized compression methods to address the challenges of storing and transmitting such data. In this work, we develop G-FQZip, a GPU version of a lossless reference-based compression method, by introducing GPU-based arithmetic coding, a template matching approach, and a parallel light-weight mapping model. Comparison experiments demonstrate that G-FQZip improves (de)compression speed while maintaining comparable compression ratios. In addition, a follow-up evaluation demonstrates the efficiency of the GPU-based arithmetic coding and the template matching approach.
Pages: 553-556
Number of pages: 4