NcDNAlign: Plausible multiple alignments of non-protein-coding genomic sequences

Cited by: 8

Authors:

Rose, Dominic

Hertel, Jana

Reiche, Kristin [1]

Stadler, Peter F. [1,2,3]

Hackermueller, Joerg [1]

Affiliations:

[1] Fraunhofer Inst Cell Therapy & Immunol IZI, D-04103 Leipzig, Germany

[2] Univ Vienna, Dept Theoret Chem, A-1090 Vienna, Austria

[3] Santa Fe Inst, Santa Fe, NM 87501 USA

Source:

GENOMICS | 2008 / Vol. 92 / Issue 01

Keywords:

non-coding RNA; ncRNA; alignment; multiple sequence alignments; ultra-conserved elements; ultra-conserved regions; UCE; UCR; CNE; genome annotation; comparative genomics;

DOI:

10.1016/j.ygeno.2008.04.003

Chinese Library Classification (CLC) number:

Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];

Discipline classification code:

071005 ; 0836 ; 090102 ; 100705 ;

Abstract:

Genome-wide multiple sequence alignments (MSAs) are a necessary prerequisite for an increasingly diverse collection of comparative genomic approaches. Here we present a versatile method that generates high-quality MSAs for non-protein-coding sequences. The NcDNAlign pipeline combines pairwise BLAST alignments to create initial MSAs, which are then locally improved and trimmed. The program is optimized for speed and hence is particularly well-suited to pilot studies. We demonstrate the practical use of NcDNAlign in three case studies: the search for ncRNAs in gammaproteobacteria and the analysis of conserved noncoding DNA in nematodes and teleost fish, in the latter case focusing on the fate of duplicated ultra-conserved regions. Compared to the currently widely used genome-wide alignment program TBA, our program results in a 20- to 30-fold reduction of CPU time necessary to generate gammaproteobacterial alignments. A showcase application of bacterial ncRNA prediction based on alignments of both algorithms results in similar sensitivity, false discovery rates, and up to 100 putatively novel ncRNA structures. Similar findings hold for our application of NcDNAlign to the identification of ultra-conserved regions in nematodes and teleosts. Both approaches yield conserved sequences of unknown function, result in novel evolutionary insights into conservation patterns among these genomes, and manifest the benefits of an efficient and reliable genome-wide alignment package. The software is available under the GNU Public License at http://www.bioinf.uni-leipzig.de/Software/NcDNAlign/. (C) 2008 Elsevier Inc. All rights reserved.

Pages: 65-74

Number of pages: 10
