A De Novo Genome Assembly Algorithm for Repeats and Nonrepeats

Cited by: 5
Authors
Lian, Shuaibin [1]
Li, Qingyan [1]
Dai, Zhiming [1,2]
Xiang, Qian [1]
Dai, Xianhua [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[2] SYSU CMU Shunde Int Joint Res Inst, Shunde 528300, Peoples R China
Keywords
SEQUENCING TECHNOLOGIES; STRUCTURAL VARIATION; AMPLIFICATION; DNA; IDENTIFICATION;
DOI
10.1155/2014/736473
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background. Next-generation sequencing (NGS) platforms generate shorter reads, deeper coverage, and higher throughput than Sanger sequencing. These short reads may need to be assembled de novo before specific genome analyses can be carried out. To date, current assemblers perform very poorly when assembling repeats. Results. To address this problem, we propose a new genome assembly algorithm, named SWA, which has four properties: (1) it assembles both repeats and nonrepeats; (2) it adopts a new overlapping extension strategy to extend each seed; (3) it adopts a sliding window to filter out sequencing bias; and (4) it provides a compensation mechanism for low-coverage datasets. SWA was evaluated and validated on both simulated and real sequencing datasets. The accuracy of assembling repeats and of estimating copy numbers reaches up to 99% and 100%, respectively. Finally, extensive comparisons with eight other leading assemblers show that SWA outperforms them in the completeness and correctness of assembling repeats and nonrepeats. Conclusions. This paper proposes a new de novo genome assembly method for resolving complex repeats. SWA can not only detect where repeats and nonrepeats lie but also assemble them completely from NGS data, particularly repeats. This is its advantage over other assemblers.
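The abstract mentions a sliding window over sequencing data and the estimation of repeat copy numbers, but this record does not give the details of SWA itself. The sketch below is therefore only a generic illustration of the idea, not the authors' method: it smooths a per-position read-depth profile with a sliding-window mean and estimates a copy number as the ratio of the smoothed depth to a baseline. The function names, the choice of the median as the single-copy baseline, and the toy depth values are all assumptions made for illustration.

```python
from statistics import median


def sliding_window_mean(depth, window):
    """Return the mean depth of every full window of `window` positions."""
    if window <= 0 or window > len(depth):
        raise ValueError("window must be between 1 and len(depth)")
    total = sum(depth[:window])
    means = [total / window]
    # Slide the window one position at a time, updating the running sum.
    for i in range(window, len(depth)):
        total += depth[i] - depth[i - window]
        means.append(total / window)
    return means


def estimate_copy_numbers(depth, window):
    """Estimate a copy number per window as smoothed depth / baseline depth.

    Assumption (not from the paper): most of the sequence is single-copy,
    so the median of the smoothed depth approximates the single-copy depth.
    """
    smoothed = sliding_window_mean(depth, window)
    baseline = median(smoothed)
    if baseline == 0:
        raise ValueError("baseline depth is zero; cannot normalise")
    return [round(m / baseline) for m in smoothed]


# Toy profile: a region sequenced at about 3x the surrounding depth.
depth = [10] * 12 + [30] * 6 + [10] * 12
copies = estimate_copy_numbers(depth, window=3)
```

In this toy profile the windows lying fully inside the high-depth region are assigned a copy number of 3, while the flanking windows are assigned 1. Windows that straddle a boundary get intermediate values, which is one reason a real assembler needs more than a depth ratio to locate repeat boundaries.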
Pages: 16
Related papers
50 in total
  • [1] A De Novo Genome Assembly Algorithm for Repeats and Nonrepeats (vol 2014, 736473, 2014)
    Lian, Shuaibin
    Li, Qingyan
    Dai, Zhiming
    Xiang, Qian
    Dai, Xianhua
    BIOMED RESEARCH INTERNATIONAL, 2014, 2014
  • [2] Towards Accurate De Novo Assembly for Genomes with Repeats
    Bucur, Doina
    2017 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB), 2017, : 106 - +
  • [4] Exploiting sparseness in de novo genome assembly
    Ye, Chengxi
    Ma, Zhanshan Sam
    Cannon, Charles H.
    Pop, Mihai
    Yu, Douglas W.
    BMC BIOINFORMATICS, 2012, 13
  • [5] De novo assembly of a Chinese soybean genome
    Shen, Yanting
    Liu, Jing
    Geng, Haiying
    Zhang, Jixiang
    Liu, Yucheng
    Zhang, Haikuan
    Xing, Shilai
    Du, Jianchang
    Ma, Shisong
    Tian, Zhixi
    SCIENCE CHINA-LIFE SCIENCES, 2018, 61 (08) : 871 - 884
  • [6] De novo assembly of a Chinese soybean genome
    Yanting Shen
    Jing Liu
    Haiying Geng
    Jixiang Zhang
    Yucheng Liu
    Haikuan Zhang
    Shilai Xing
    Jianchang Du
    Shisong Ma
    Zhixi Tian
    Science China(Life Sciences), 2018, 61 (08) : 871 - 884
  • [7] Exploiting sparseness in de novo genome assembly
    Chengxi Ye
    Zhanshan Sam Ma
    Charles H Cannon
    Mihai Pop
    Douglas W Yu
    BMC Bioinformatics, 13
  • [8] De novo assembly of a Chinese soybean genome
    Yanting Shen
    Jing Liu
    Haiying Geng
    Jixiang Zhang
    Yucheng Liu
    Haikuan Zhang
    Shilai Xing
    Jianchang Du
    Shisong Ma
    Zhixi Tian
    Science China Life Sciences, 2018, 61 : 871 - 884
  • [9] AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references
    Bao, Ergude
    Jiang, Tao
    Girke, Thomas
    BIOINFORMATICS, 2014, 30 (12) : 319 - 328
  • [10] De novo assembly and phasing of a Korean human genome
    Jeong-Sun Seo
    Arang Rhie
    Junsoo Kim
    Sangjin Lee
    Min-Hwan Sohn
    Chang-Uk Kim
    Alex Hastie
    Han Cao
    Ji-Young Yun
    Jihye Kim
    Junho Kuk
    Gun Hwa Park
    Juhyeok Kim
    Hanna Ryu
    Jongbum Kim
    Mira Roh
    Jeonghun Baek
    Michael W. Hunkapiller
    Jonas Korlach
    Jong-Yeon Shin
    Changhoon Kim
    Nature, 2016, 538 : 243 - 247