A Novel Fast and Memory Efficient Parallel MLCS Algorithm for Long and Large-Scale Sequences Alignments

Cited by: 0
Authors
Li, Yanni [1]
Wang, Yuping [2]
Zhang, Zhensong [3]
Wang, Yaxin [4]
Ma, Ding [5]
Huang, Jianbin [6]
Affiliations
[1] Xidian Univ, Sch Software, Xian, Shaanxi, Peoples R China
[2] Xidian Univ, Sch Comp Sci & Technol, Xian, Shaanxi, Peoples R China
[3] Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China
[4] Univ Calif Los Angeles, Henry Samueli Sch Engn & Appl Sci, Los Angeles, CA USA
[5] Univ Southern Calif, Viterbi Sch Engn, Dept Comp Sci, Los Angeles, CA 90089 USA
[6] Xidian Univ, Sch Software, Xian, Shaanxi, Peoples R China
Keywords
Multiple Longest Common Subsequences (MLCS); Irredundant Common Subsequence Graph (ICSG); Parallel Collection Chain (PCC); ICSG-PCC Model; Parallel Algorithm; Common Subsequence
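DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract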
Information can usually be abstracted as a character sequence over a finite alphabet. In the era of big data, sequences from many application fields (e.g., biological sequences) keep growing in length and size. As a result, the classical NP-hard problem of finding the Multiple Longest Common Subsequences of multiple sequences (the MLCS problem, which has many applications in bioinformatics, computational genomics, pattern recognition, and other areas) has become a research hotspot and faces severe challenges. In this paper, we first show that the leading dominant-point-based MLCS algorithms are very hard to apply to long and large-scale sequence alignments. To overcome their defects, we present a novel, efficient parallel MLCS algorithm based on the proposed problem-solving model and parallel topological sorting strategies. Comprehensive experiments on benchmark datasets of both random and biological sequences demonstrate that the time and space complexities of the proposed algorithm are only linearly related to the dominants from the aligned sequences, and that the proposed algorithm greatly outperforms the existing state-of-the-art dominant-point-based MLCS algorithms, which makes it well suited to long and large-scale sequence alignments.
Pages: 1170-1181
Number of pages: 12