Novel methodologies for spectral classification of exon and intron sequences

Cited by: 14
Authors
Kwan, Hon Keung [1]
Kwan, Benjamin Y. M. [2]
Kwan, Jennifer Y. Y. [3]
Affiliations
[1] Univ Windsor, Dept Elect & Comp Engn, Windsor, ON N9B 3P4, Canada
[2] Univ Ottawa, Fac Med, Ottawa, ON K1H 8M5, Canada
[3] Queens Univ, Sch Med, Kingston, ON K7L 3N6, Canada
Source
EURASIP Journal on Advances in Signal Processing | 2012
Keywords
DNA sequence; numerical representation; nucleotide to numeric mapping; exon and intron sequences; coding and non-coding sequences; threshold value; thresholding; exon and intron classification; period-3; spectral analysis; discrete Fourier transform; gene detection; genome annotation; DNA; REPRESENTATION; PREDICTION; GALAXY;
DOI
10.1186/1687-6180-2012-50
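Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract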
Digital processing of a nucleotide sequence requires it to be mapped to a numerical sequence in which the choice of nucleotide to numeric mapping affects how well its biological properties can be preserved and reflected from nucleotide domain to numerical domain. Digital spectral analysis of nucleotide sequences unfolds a period-3 power spectral value which is more prominent in an exon sequence as compared to that of an intron sequence. The success of a period-3 based exon and intron classification depends on the choice of a threshold value. The main purposes of this article are to introduce novel codes for 1-sequence numerical representations for spectral analysis and compare them to existing codes to determine appropriate representation, and to introduce novel thresholding methods for more accurate period-3 based exon and intron classification of an unknown sequence. The main findings of this study are summarized as follows: Among sixteen 1-sequence numerical representations, the K-Quaternary Code I offers an attractive performance. A windowed 1-sequence numerical representation (with window length of 9, 15, and 24 bases) offers a possible speed gain over non-windowed 4-sequence Voss representation which increases as sequence length increases. A winner threshold value (chosen from the best among two defined threshold values and one other threshold value) offers a top precision for classifying an unknown sequence of specified fixed lengths. An interpolated winner threshold value applicable to an unknown and arbitrary length sequence can be estimated from the winner threshold values of fixed length sequences with a comparable performance. In general, precision increases as sequence length increases. The study contributes an effective spectral analysis of nucleotide sequences to better reveal embedded properties, and has potential applications in improved genome annotation.
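The period-3 measurement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The Voss representation (four binary indicator sequences, one per nucleotide) is standard, but the 1-sequence mapping `HYPOTHETICAL_CODE` below is an arbitrary placeholder and is not the K-Quaternary Code I, whose values are not given in this record. The function names, the unnormalized power definition, and the threshold passed to `classify` are likewise assumptions made for illustration only.

```python
import cmath

# Placeholder 1-sequence mapping (assumption). NOT the paper's K-Quaternary Code I.
HYPOTHETICAL_CODE = {"A": 0.0, "C": 1.0, "G": 2.0, "T": 3.0}


def dft_power_at(x, k):
    """Return |X[k]|^2, the unnormalized DFT power of sequence x at index k."""
    n = len(x)
    total = sum(v * cmath.exp(-2j * cmath.pi * k * i / n) for i, v in enumerate(x))
    return abs(total) ** 2


def _period3_index(seq):
    """DFT index N/3 of the period-3 component; requires len(seq) divisible by 3."""
    n = len(seq)
    if n == 0 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    return n // 3


def period3_power_voss(seq):
    """Period-3 power summed over the four Voss binary indicator sequences."""
    seq = seq.upper()
    k = _period3_index(seq)
    return sum(
        dft_power_at([1.0 if c == base else 0.0 for c in seq], k)
        for base in "ACGT"
    )


def period3_power_single(seq, code=HYPOTHETICAL_CODE):
    """Period-3 power of a single numerical sequence built from `code`."""
    seq = seq.upper()
    k = _period3_index(seq)
    return dft_power_at([code[c] for c in seq], k)


def classify(power, threshold):
    """Label a sequence 'exon' if its period-3 power exceeds the threshold."""
    return "exon" if power > threshold else "intron"


if __name__ == "__main__":
    periodic = "ATG" * 10   # strongly period-3 toy sequence
    flat = "A" * 30         # no period-3 component
    print(period3_power_voss(periodic), period3_power_voss(flat))
    print(classify(period3_power_voss(periodic), threshold=50.0))
```

The 1-sequence variant needs one DFT evaluation per sequence where the Voss variant needs four, which is the kind of cost difference the abstract refers to. In practice the threshold would have to be chosen from labeled exon and intron data, as the abstract's thresholding methods do, rather than fixed by hand as in this toy example.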
Pages: 14