A Novel Efficient Graph Model for the Multiple Longest Common Subsequences (MLCS) Problem

Cited by: 15
Authors
Peng, Zhan [1]
Wang, Yuping [1]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Shaanxi, Peoples R China
Source
FRONTIERS IN GENETICS | 2017 / Vol. 8
Funding
National Natural Science Foundation of China;
Keywords
multiple longest common subsequences; longest common subsequence; dominant point method; directed acyclic graph; biological sequence alignment; ALGORITHM; SEQUENCES;
DOI
10.3389/fgene.2017.00104
Chinese Library Classification number
Q3 [Genetics];
Subject classification codes
071007; 090102;
Abstract
Searching for the Multiple Longest Common Subsequences (MLCS) of multiple sequences is a classical NP-hard problem with many applications. One of the most effective exact approaches to the MLCS problem is based on the dominant point graph, a kind of directed acyclic graph (DAG). However, the time and space efficiency of the leading dominant point graph based approaches is still unsatisfactory: constructing the dominant point graph used by these approaches requires a huge amount of time and space, which hinders their application to large-scale and long sequences. To address this issue, we propose a new time- and space-efficient graph model for the MLCS problem, called the Leveled-DAG. During construction, the Leveled-DAG promptly eliminates all nodes in the graph that cannot contribute to an MLCS. At any moment, only the current level and some previously generated nodes need to be kept in memory, which greatly reduces memory consumption. Moreover, the final graph contains only one node, in which all of the wanted MLCS are saved, so no additional search for the MLCS is needed. Experiments are conducted on real biological sequences of varying numbers and lengths, and the proposed algorithm is compared with three state-of-the-art algorithms. The experimental results show that the Leveled-DAG approach needs less time and space than the compared algorithms, especially on large-scale and long sequences.
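The level-by-level idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' Leveled-DAG: it omits their node-elimination rules and any dominance pruning, and the names `build_next_table` and `all_mlcs` are hypothetical. It only shows how a match-point graph can be expanded one level at a time while keeping just the current level in memory, with each node carrying the common subsequences that end at it.

```python
from typing import Dict, List, Set, Tuple


def build_next_table(seq: str, alphabet: List[str]) -> Dict[str, List[int]]:
    """nxt[c][j] = smallest index >= j with seq[index] == c, or -1 if none."""
    n = len(seq)
    nxt = {c: [-1] * (n + 1) for c in alphabet}
    for j in range(n - 1, -1, -1):
        for c in alphabet:
            nxt[c][j] = j if seq[j] == c else nxt[c][j + 1]
    return nxt


def all_mlcs(seqs: List[str]) -> Set[str]:
    """Return every longest common subsequence of seqs (illustrative only)."""
    if not seqs:
        return {""}
    # Only characters present in every sequence can appear in a common subsequence.
    alphabet = sorted(set.intersection(*(set(s) for s in seqs)))
    tables = [build_next_table(s, alphabet) for s in seqs]

    # Level 0: a virtual source point placed before the start of every sequence.
    level: Dict[Tuple[int, ...], Set[str]] = {tuple(-1 for _ in seqs): {""}}
    while True:
        next_level: Dict[Tuple[int, ...], Set[str]] = {}
        for point, prefixes in level.items():
            for c in alphabet:
                # Immediate successor: next occurrence of c in every sequence.
                succ = tuple(t[c][p + 1] for t, p in zip(tables, point))
                if -1 in succ:
                    continue  # c does not occur later in at least one sequence
                next_level.setdefault(succ, set()).update(w + c for w in prefixes)
        if not next_level:
            # The last non-empty level holds exactly the longest common subsequences.
            return set().union(*level.values())
        level = next_level  # the previous level is discarded here


print(sorted(all_mlcs(["ABCBDAB", "BDCABA"])))  # ['BCAB', 'BCBA', 'BDAB']
```

Because no pruning is applied, this sketch can visit far more nodes than a dominant-point method would; it is meant only to make the "keep the current level" memory pattern concrete.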
Pages: 13