HAlign 3: Fast Multiple Alignment of Ultra-Large Numbers of Similar DNA/RNA Sequences

Times cited: 16
Authors
Tang, Furong [1, 2]
Chao, Jiannan [1, 3]
Wei, Yanming [4]
Yang, Fenglong [5]
Zhai, Yixiao [3]
Xu, Lei [2]
Zou, Quan [1, 3]
Affiliations
[1] Univ Elect Sci & Technol China, Yangtze Delta Reg Inst Quzhou, Quzhou, Peoples R China
[2] Shenzhen Polytech, Sch Elect & Commun Engn, Shenzhen, Peoples R China
[3] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
[4] Xidian Univ, Sch Comp Sci & Technol, Xian, Peoples R China
[5] Fujian Med Univ, Sch Med Technol & Engn, Fuzhou, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
Keywords
multiple sequence alignment; suffix tree; center star strategy; common substring; substring selection; ALGORITHM;
DOI
10.1093/molbev/msac166
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
HAlign is a cross-platform program that performs multiple sequence alignments based on the center star strategy. Here we present two major updates in HAlign 3, which improve both time efficiency and alignment quality and make HAlign 3 a specialized program for processing ultra-large numbers of similar DNA/RNA sequences, such as closely related viral or prokaryotic genomes. HAlign 3 can be easily installed via the Anaconda and Java release packages on macOS, Linux, Windows Subsystem for Linux, and Windows systems, and the source code is available on GitHub (https://github.com/malabz/HAlign-3).
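For readers unfamiliar with the center star strategy named in the abstract, the sketch below illustrates its first step in the textbook formulation: choosing as the center the sequence with the smallest total distance to all others. This is a generic illustration and not HAlign 3's implementation, which the keywords suggest relies on suffix trees and common substrings. The function names `edit_distance` and `choose_center` are hypothetical, and plain Levenshtein distance is an assumed stand-in for whatever scoring the program actually uses.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def choose_center(seqs: List[str]) -> int:
    """Return the index of the sequence with the smallest summed distance to all others."""
    totals = [sum(edit_distance(s, t) for t in seqs) for s in seqs]
    return totals.index(min(totals))


if __name__ == "__main__":
    sequences = ["ACGT", "ACGA", "TCGA"]  # toy data, assumed for illustration
    print("center index:", choose_center(sequences))
```

In the classic method, every other sequence is then aligned pairwise to the chosen center and the gaps are merged into one multiple alignment. The quadratic cost of this all-pairs step is the reason fast tools for very large inputs avoid computing it naively.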
Pages: 5