Novel methodologies for spectral classification of exon and intron sequences

Cited by: 14
Authors
Kwan, Hon Keung [1]
Kwan, Benjamin Y. M. [2]
Kwan, Jennifer Y. Y. [3]
Affiliations
[1] Univ Windsor, Dept Elect & Comp Engn, Windsor, ON N9B 3P4, Canada
[2] Univ Ottawa, Fac Med, Ottawa, ON K1H 8M5, Canada
[3] Queens Univ, Sch Med, Kingston, ON K7L 3N6, Canada
Source
EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING | 2012
Keywords
DNA sequence; numerical representation; nucleotide to numeric mapping; exon and intron sequences; coding and non-coding sequences; threshold value; thresholding; exon and intron classification; period-3; spectral analysis; discrete Fourier transform; gene detection; genome annotation; DNA; REPRESENTATION; PREDICTION; GALAXY;
DOI
10.1186/1687-6180-2012-50
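Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract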
Digital processing of a nucleotide sequence requires it to be mapped to a numerical sequence in which the choice of nucleotide to numeric mapping affects how well its biological properties can be preserved and reflected from nucleotide domain to numerical domain. Digital spectral analysis of nucleotide sequences unfolds a period-3 power spectral value which is more prominent in an exon sequence as compared to that of an intron sequence. The success of a period-3 based exon and intron classification depends on the choice of a threshold value. The main purposes of this article are to introduce novel codes for 1-sequence numerical representations for spectral analysis and compare them to existing codes to determine appropriate representation, and to introduce novel thresholding methods for more accurate period-3 based exon and intron classification of an unknown sequence. The main findings of this study are summarized as follows: Among sixteen 1-sequence numerical representations, the K-Quaternary Code I offers an attractive performance. A windowed 1-sequence numerical representation (with window length of 9, 15, and 24 bases) offers a possible speed gain over non-windowed 4-sequence Voss representation which increases as sequence length increases. A winner threshold value (chosen from the best among two defined threshold values and one other threshold value) offers a top precision for classifying an unknown sequence of specified fixed lengths. An interpolated winner threshold value applicable to an unknown and arbitrary length sequence can be estimated from the winner threshold values of fixed length sequences with a comparable performance. In general, precision increases as sequence length increases. The study contributes an effective spectral analysis of nucleotide sequences to better reveal embedded properties, and has potential applications in improved genome annotation.
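The period-3 test summarized in the abstract can be illustrated with a minimal sketch. It uses the 4-sequence Voss representation (one binary indicator sequence per nucleotide), because the abstract does not give the mapping values of the K-Quaternary Code I. The function names, the peak-to-mean ratio, and the threshold of 2.0 are illustrative assumptions; they are not the article's winner or interpolated winner threshold values.

```python
import numpy as np

def voss_indicators(seq):
    """Return a 4 x N array of binary indicator sequences for A, C, G, T."""
    seq = seq.upper()
    return np.array([[1.0 if base == b else 0.0 for base in seq] for b in "ACGT"])

def period3_ratio(seq):
    """Power at DFT bin k = N/3 divided by the mean power over bins 1..N-1.

    Assumes the sequence is not a homopolymer (otherwise the mean is zero).
    """
    n = len(seq) - len(seq) % 3            # trim so that N is a multiple of 3
    x = voss_indicators(seq[:n])
    # Total power spectrum: sum of the four indicator power spectra.
    spectrum = np.sum(np.abs(np.fft.fft(x, axis=1)) ** 2, axis=0)
    return spectrum[n // 3] / np.mean(spectrum[1:])

def classify(seq, threshold=2.0):
    """Label a sequence by comparing its period-3 ratio to a threshold.

    The default threshold is a hypothetical placeholder, not a value
    taken from the article.
    """
    return "exon" if period3_ratio(seq) >= threshold else "intron"

if __name__ == "__main__":
    print(classify("ATG" * 30))   # strongly period-3 toy sequence
    print(classify("AT" * 45))    # period-2 toy sequence, no period-3 peak
```

The toy inputs only exercise the mechanics; real exon and intron sequences give far less clear-cut ratios, which is why the choice of threshold matters for classification precision.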
Pages: 14