A De Novo Genome Assembly Algorithm for Repeats and Nonrepeats

Cited by: 5
Authors
Lian, Shuaibin [1]
Li, Qingyan [1]
Dai, Zhiming [1, 2]
Xiang, Qian [1]
Dai, Xianhua [1]
Affiliations
[1] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510006, Guangdong, Peoples R China
[2] SYSU CMU Shunde Int Joint Res Inst, Shunde 528300, Peoples R China
Keywords
SEQUENCING TECHNOLOGIES; STRUCTURAL VARIATION; AMPLIFICATION; DNA; IDENTIFICATION;
DOI
10.1155/2014/736473
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background. Next-generation sequencing (NGS) platforms generate shorter reads, deeper coverage, and higher throughput than Sanger sequencing. These short reads may need to be assembled de novo before certain genome analyses. To date, current assemblers perform poorly when assembling repeats. Results. To address this problem, we propose a new genome assembly algorithm, named SWA, which has four properties: (1) it assembles both repeats and nonrepeats; (2) it adopts a new overlapping extension strategy to extend each seed; (3) it adopts a sliding window to filter out sequencing bias; and (4) it provides a compensational mechanism for low-coverage datasets. SWA was evaluated and validated on both simulated and real sequencing datasets. The accuracy of assembling repeats and of estimating copy numbers is up to 99% and 100%, respectively. Finally, extensive comparisons with eight other leading assemblers show that SWA outperforms them in terms of completeness and correctness of assembling repeats and nonrepeats. Conclusions. This paper proposes a new de novo genome assembly method for resolving complex repeats. SWA can not only detect where repeats and nonrepeats are located but also assemble them completely from NGS data, and it is particularly effective for repeats. This is its advantage over other assemblers.
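The abstract names an overlapping extension strategy for extending each seed but does not describe it. As a rough illustration of the general idea only, the following is a minimal sketch of generic greedy seed extension by read overlap. It is not SWA's actual procedure: the function name `extend_seed`, the `min_overlap` and `max_steps` parameters, and the majority-vote rule are all assumptions made for this example, and the sliding window and low-coverage compensation steps are not modeled.

```python
from collections import Counter


def extend_seed(seed, reads, min_overlap=5, max_steps=1000):
    """Greedily extend `seed` to the right, one base at a time.

    Hypothetical illustration, not the SWA algorithm: at each step, every
    read containing the last `min_overlap` bases of the contig votes for
    the base that follows that match. Extension stops when no read
    supports a next base, when the top two candidates tie, or after
    `max_steps` bases (a guard against looping forever inside a repeat).
    """
    contig = seed
    for _ in range(max_steps):
        suffix = contig[-min_overlap:]
        votes = Counter()
        for read in reads:
            start = read.find(suffix)
            while start != -1:
                nxt = start + min_overlap
                if nxt < len(read):
                    votes[read[nxt]] += 1
                start = read.find(suffix, start + 1)
        if not votes:
            break  # no read extends the current contig
        (base, count), *rest = votes.most_common(2)
        if rest and rest[0][1] == count:
            break  # ambiguous extension, e.g. at a repeat boundary
        contig += base
    return contig


# Toy data: error-free reads of length 8 sampled every 2 bases.
genome = "ACGTTGCAAGGCTTACCGAT"
reads = [genome[i:i + 8] for i in range(0, len(genome) - 7, 2)]
contig = extend_seed(genome[:6], reads)
```

On this toy input every 5-base substring of the genome is unique, so each vote is unanimous and the seed grows until the reads run out. On real data, ties and conflicting votes are exactly where repeats and sequencing errors show up, which is the situation the abstract says SWA is designed to handle.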
Pages: 16