DNA sequence similarity search through content-based retrieval technique

Authors
Yeh, CH [1]
Sung, PY [1]
Chang, HT [1]
Kuo, CJ [1]
Affiliations
[1] Univ So Calif, Dept Elect Engn, Los Angeles, CA 90007 USA
Keywords
DNA; Peano scan; Fourier transformation; Principal Component Analysis; indexing structure; clustering algorithm; similarity retrieval
DOI
10.1117/12.486714
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Deoxyribonucleic acid (DNA) sequences are difficult to analyze for similarity because of their length and complexity. The challenge lies in using digital signal processing (DSP) to solve highly relevant problems in DNA sequences. Here, we transform a one-dimensional (1D) DNA sequence into a two-dimensional (2D) pattern by using the Peano scan algorithm. Four complex values are assigned to the characters "A", "C", "T", and "G", respectively. Then, a Fourier transform is employed to obtain the far-field amplitude distribution of the 2D pattern. At this point, the 1D DNA sequence has become a 2D image pattern. Features are extracted from the 2D image pattern with the Principal Component Analysis (PCA) method, so that a DNA sequence database can be established. Unfortunately, comparing features may take a long time when the database is large, because the features are often multi-dimensional. This problem is solved by building an indexing structure that acts as a filter, screening out non-relevant items and selecting a subset of candidate DNA sequences. Clustering algorithms organize the multi-dimensional feature data into the indexing structure for effective retrieval. Accordingly, the query sequence needs to be compared only against the candidate sequences rather than against all sequences in the database. In effect, our algorithm provides a pre-processing method that accelerates the DNA sequence search process. Finally, experimental results demonstrate the efficiency of the proposed algorithm for DNA sequence similarity retrieval.
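The first stage described in the abstract (complex coding of the bases, Peano scan, Fourier transform) can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: the abstract gives neither the four complex values nor the exact scan variant, so the values in `BASE_CODES` are hypothetical, a Hilbert curve stands in as one common Peano-type space-filling scan, unused grid cells are padded with zero, and a naive 2D discrete Fourier transform replaces whatever transform the paper used. The function names are illustrative only. The PCA and clustering stages are not shown.

```python
import cmath

# Hypothetical complex codes for the four bases; the abstract does not
# state which values the authors assigned.
BASE_CODES = {"A": 1 + 0j, "C": 0 + 1j, "T": -1 + 0j, "G": 0 - 1j}


def hilbert_d2xy(n, d):
    """Map position d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two. The Hilbert curve is used here as an
    assumed stand-in for the paper's Peano scan.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            # Rotate or flip the sub-square so that consecutive
            # positions stay adjacent on the grid.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sequence_to_grid(seq, n):
    """Place a DNA string on an n x n complex grid along the scan path."""
    if len(seq) > n * n:
        raise ValueError("sequence does not fit on the grid")
    grid = [[0j] * n for _ in range(n)]  # zero padding is an assumption
    for d, base in enumerate(seq):
        x, y = hilbert_d2xy(n, d)
        grid[y][x] = BASE_CODES[base]
    return grid


def dft2_magnitude(grid):
    """Return the magnitude of the naive 2D DFT of a square complex grid."""
    n = len(grid)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for y in range(n):
                for x in range(n):
                    phase = -2j * cmath.pi * (u * y + v * x) / n
                    total += grid[y][x] * cmath.exp(phase)
            out[u][v] = abs(total)
    return out


if __name__ == "__main__":
    spectrum = dft2_magnitude(sequence_to_grid("ACGTACGTTTGACCAA", 4))
    for row in spectrum:
        print(" ".join(f"{value:6.2f}" for value in row))
```

The point of a space-filling scan is that bases close together in the sequence stay close together in the 2D pattern, so local sequence structure shows up as local image structure. The naive transform costs O(n^4) for an n x n grid and is suitable only for small illustrations; a fast Fourier transform would be the practical choice for real sequences.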
Pages: 635-645
Page count: 11