A De Novo Genome Assembly Algorithm for Repeats and Nonrepeats

Cited by: 5
Authors
Lian, Shuaibin [1]
Li, Qingyan [1]
Dai, Zhiming [1, 2]
Xiang, Qian [1]
Dai, Xianhua [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[2] SYSU CMU Shunde Int Joint Res Inst, Shunde 528300, Peoples R China
Keywords
SEQUENCING TECHNOLOGIES; STRUCTURAL VARIATION; AMPLIFICATION; DNA; IDENTIFICATION;
DOI
10.1155/2014/736473
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background. Next-generation sequencing (NGS) platforms generate shorter reads, deeper coverage, and higher throughput than Sanger sequencing. These short reads may need to be assembled de novo before certain genome analyses. To date, current assemblers perform very poorly at assembling repeats. Results. To address this problem, we propose a new genome assembly algorithm, named SWA, which has four properties: (1) it assembles both repeats and nonrepeats; (2) it adopts a new overlapping extension strategy to extend each seed; (3) it adopts a sliding window to filter out sequencing bias; and (4) it provides a compensational mechanism for low-coverage datasets. SWA was evaluated and validated on both simulated and real sequencing datasets. The accuracy of assembling repeats and of estimating copy numbers reached up to 99% and 100%, respectively. Finally, extensive comparisons with eight other leading assemblers show that SWA outperformed them in terms of completeness and correctness of assembling repeats and nonrepeats. Conclusions. This paper proposes a new de novo genome assembly method for resolving complex repeats. SWA can not only detect where repeats and nonrepeats are located but also assemble them completely from NGS data, and it is especially effective for repeats. This is its advantage over other assemblers.
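The abstract mentions a sliding window and copy-number estimation but does not give the details. The sketch below is therefore not SWA's implementation. It is a generic illustration, under stated assumptions, of how windowed read depth can smooth per-base noise and suggest a copy number. The function names `windowed_means` and `estimate_copy_numbers`, the use of the median as the single-copy baseline, and the toy depth values are all assumptions made for this example.

```python
from statistics import median


def windowed_means(depth, window):
    """Mean read depth over every full window of `window` consecutive positions."""
    if window <= 0 or window > len(depth):
        raise ValueError("window must be between 1 and len(depth)")
    total = sum(depth[:window])
    means = [total / window]
    # Slide the window one position at a time, updating the running sum.
    for i in range(window, len(depth)):
        total += depth[i] - depth[i - window]
        means.append(total / window)
    return means


def estimate_copy_numbers(depth, window):
    """Express each windowed mean as a rounded multiple of the baseline depth.

    Assumption: most of the sequence is single-copy, so the median windowed
    depth is taken as the depth of one copy.
    """
    means = windowed_means(depth, window)
    baseline = median(means)
    if baseline == 0:
        raise ValueError("baseline depth is zero; cannot estimate copy numbers")
    return [round(m / baseline) for m in means]


# Toy per-base depth: a 10-base region covered three times as deeply as its flanks.
depth = [10] * 20 + [30] * 10 + [10] * 20
copy_numbers = estimate_copy_numbers(depth, window=5)
print(copy_numbers)
```

Windows that lie entirely in the flanks come out as one copy and windows entirely inside the deeper region as three. Windows straddling a boundary give intermediate values, which is one reason a real assembler needs more than depth alone to place repeat boundaries.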
Pages: 16