Predicting the accuracy of multiple sequence alignment algorithms by using computational intelligent techniques

Cited by: 10
Authors
Ortuno, Francisco M. [1 ]
Valenzuela, Olga [2 ]
Pomares, Hector [1 ]
Rojas, Fernando [1 ]
Florido, Javier P. [3 ]
Urquiza, Jose M. [4 ]
Rojas, Ignacio [1 ]
Affiliations
[1] Univ Granada UGR, Dept Comp Architecture & Comp Technol, Granada 18071, Spain
[2] Univ Granada UGR, Dept Appl Math, Granada 18071, Spain
[3] Andalusian Human Genome Sequencing Ctr CASEGH, Med Genome Project, Seville 41092, Spain
[4] Bellvitge Biomed Res Inst IDIBELL, Chromatin & Dis Grp, Barcelona 08907, Spain
Keywords
MUTUAL INFORMATION; GENE ONTOLOGY; SELECTION;
DOI
10.1093/nar/gks919
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Multiple sequence alignments (MSAs) have become one of the most studied approaches in bioinformatics, serving as the basis for other important tasks such as structure prediction, biological function analysis and next-generation sequencing. However, current MSA algorithms do not always provide consistent solutions, since alignment becomes increasingly difficult as sequence similarity decreases. It is widely known that the accuracy of these algorithms depends directly on specific features of the sequences being aligned. Many MSA tools have been designed recently, but it is not possible to know in advance which one is the most suitable for a particular set of sequences. In this work, we analyze some of the most widely used algorithms in the literature and their dependence on several features. A novel intelligent algorithm based on least squares support vector machines is then developed to predict how accurate each alignment could be, depending on the analyzed features. The algorithm is applied to a dataset of 2180 MSAs. The proposed system first estimates the accuracy of the possible alignments. The most promising methodologies are then selected to align each set of sequences. Since only one selected algorithm is run, the computational time is not excessively increased.
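The predict-then-select idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single feature (mean pairwise identity), the RBF kernel, the hyperparameters gamma and sigma, the tool names tool_a and tool_b, and the toy accuracy values are all assumptions made for illustration. One least squares SVM regressor is fitted per tool by solving the standard LS-SVM linear system, and the tool with the highest predicted accuracy is chosen.

```python
import math

def rbf(u, v, sigma):
    """Gaussian (RBF) kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_lssvm(X, y, gamma=100.0, sigma=0.3):
    """Fit an LS-SVM regressor and return a prediction function.

    Solves [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y].
    gamma and sigma are illustrative defaults, not values from the paper.
    """
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [rbf(X[i], X[j], sigma) + (1.0 / gamma if i == j else 0.0)
                          for j in range(n)])
    sol = solve(A, [0.0] + list(y))
    b, alpha = sol[0], sol[1:]
    return lambda x: b + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, X))

def pick_tool(models, features):
    """Return the tool with the highest predicted accuracy, plus all scores."""
    scores = {name: model(features) for name, model in models.items()}
    return max(scores, key=scores.get), scores

# Toy training data (hypothetical): one feature per sequence set,
# the mean pairwise identity, and a made-up accuracy score per tool.
X = [[0.2], [0.4], [0.6], [0.8]]
accuracy = {
    "tool_a": [0.30, 0.50, 0.80, 0.95],  # hypothetical tool, strong on similar sequences
    "tool_b": [0.60, 0.65, 0.70, 0.75],  # hypothetical tool, steadier on divergent ones
}
models = {name: fit_lssvm(X, y) for name, y in accuracy.items()}

best, scores = pick_tool(models, [0.2])
print(best, scores)
```

Because only the selected tool is then run on the sequences, the extra cost is limited to evaluating the fitted predictors, which matches the abstract's remark about computational time.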
Pages: 10