LPC-VQ based hidden Markov models for similarity searching in DNA sequences

Cited by: 1
Authors
Pham, Tuan D. [1]
Yan, Hong [2, 3]
Affiliations
[1] James Cook Univ N Queensland, Bioinformat Applicat Res Ctr, Townsville, Qld 4811, Australia
[2] City Univ Hong Kong, Dept Elect Engn, Hong Kong, Peoples R China
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW 2006, Australia
DOI
10.1109/ICSMC.2006.384956
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Given a newly found gene from a particular genome and a database of sequences whose functions are known, it is helpful to search the database and identify the sequences that are similar to the new one. The search results may help us understand the functional role, regulation, and expression of the new gene by inference from the similar database sequences. This is the task of any method developed for biological database searching. In this paper we present a new application of the theories of linear predictive coding, vector quantization, and hidden Markov models to the problem of DNA sequence similarity search, in which no sequence alignment is needed. The proposed approach has been tested on real DNA and genomic datasets and compared with some existing methods. The experimental results demonstrate its potential for this purpose.
Pages: 1654 / +
Page count: 2