SeedsGraph: an efficient assembler for next-generation sequencing data

Cited by: 2
Authors
Wang, Chunyu [1]
Guo, Maozu [1]
Liu, Xiaoyan [1]
Liu, Yang [1]
Zou, Quan [2]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, 92 West Dazhi St, Harbin 150001, Peoples R China
[2] Xiamen Univ, Dept Comp Sci, Xiamen 361005, Peoples R China
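Source
BMC MEDICAL GENOMICS | 2015 / Vol. 8
Funding
Specialized Research Fund for the Doctoral Program of Higher Education; National Natural Science Foundation of China;
Keywords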
ALGORITHMS; GENOMES;
DOI
10.1186/1755-8794-8-S2-S13
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
DNA sequencing technology has been evolving rapidly and produces short reads in fast-growing volumes. This has led to a resurgence of research in whole-genome shotgun assembly algorithms. Our assembly algorithm begins by clustering the short reads in a cloud computing framework; the clustering groups fragments according to the similarity of their original consensus long sequences. We condense each group of reads into a chain of seeds, a kind of substring with reads aligned to it, and build a graph from these chains. Finally, we analyze the graph to find Euler paths, assemble the reads along each path into contigs, and lay out the contigs into scaffolds using mate-pair information. The results show that our algorithm is efficient and feasible for large read sets such as those produced by next-generation sequencing technology.
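The Euler-path step mentioned in the abstract can be illustrated with a minimal sketch. This is not the SeedsGraph implementation: the abstract builds its graph from seed chains, whereas the sketch below substitutes a plain de Bruijn graph over k-mers and Hierholzer's algorithm, and assumes error-free reads whose graph has a single Euler path. The function names `build_debruijn`, `euler_path`, and `assemble_contig` are hypothetical.

```python
from collections import defaultdict


def build_debruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer.

    Assumption: this stands in for the seed-chain graph described in the abstract.
    """
    kmers = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmers.add(read[i:i + k])
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # sorted only to make the result deterministic
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def euler_path(graph):
    """Hierholzer's algorithm; assumes the graph actually has an Euler path."""
    out_deg = {node: len(succs) for node, succs in graph.items()}
    in_deg = defaultdict(int)
    for succs in graph.values():
        for succ in succs:
            in_deg[succ] += 1
    nodes = set(out_deg) | set(in_deg)
    # An Euler path starts at the node with one more outgoing than incoming edge;
    # if no such node exists (Euler cycle), any node with outgoing edges works.
    start = next(
        (n for n in nodes if out_deg.get(n, 0) - in_deg.get(n, 0) == 1),
        None,
    )
    if start is None:
        start = next(iter(graph))
    remaining = {node: list(succs) for node, succs in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is final in the path
    return path[::-1]


def assemble_contig(reads, k):
    """Spell the contig along the Euler path: first node plus last base of each next node."""
    path = euler_path(build_debruijn(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    print(assemble_contig(["ATGGCG", "GCGTGC", "GTGCA"], 4))
```

With the three toy reads above and k = 4, every (k-1)-mer occurs once, so the graph is a simple chain and the path spells a single contig. Real data would need error correction, repeat handling, and the mate-pair scaffolding step, none of which this sketch attempts.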
Pages: 9