Towards index-based similarity search for protein structure databases

Cited by: 12
Authors
Çamoglu, O [1]
Kahveci, T [1]
Singh, AK [1]
Affiliations
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Keywords
protein structures; feature vectors; indexing; dataset join
DOI
10.1109/CSB.2003.1227314
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
We propose two methods for finding similarities in protein structure databases. Our techniques extract feature vectors from triplets of SSEs (Secondary Structure Elements) of proteins. These feature vectors are then indexed using a multidimensional index structure. Our first technique considers the problem of finding proteins similar to a given query protein in a protein dataset. This technique quickly finds promising proteins using the index structure. These proteins are then aligned to the query protein using a popular pairwise alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique considers the problem of joining two protein datasets to find all-to-all similarities. Experimental results show that our techniques improve the pruning time of VAST by a factor of 3 to 3.5 while keeping the sensitivity similar.
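The filter step described in the abstract (feature vectors on SSE triplets, an index lookup, then a shortlist for pairwise alignment) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: SSEs are modeled as 3D line segments, each triplet is described by three pairwise angles and three midpoint distances, a uniform grid hash stands in for the multidimensional index structure, and the names `triplet_features`, `build_index`, and `candidates` are hypothetical. The alignment step with a tool such as VAST and the statistical model are not reproduced.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math

# Assumption: an SSE is approximated by a line segment (start_xyz, end_xyz).

def _direction(sse):
    """Unit vector pointing from the start to the end of the segment."""
    (x1, y1, z1), (x2, y2, z2) = sse
    d = (x2 - x1, y2 - y1, z2 - z1)
    norm = math.sqrt(sum(c * c for c in d))
    return tuple(c / norm for c in d)

def _midpoint(sse):
    start, end = sse
    return tuple((p + q) / 2 for p, q in zip(start, end))

def _angle(u, v):
    """Angle in degrees between two unit vectors (dot product clamped)."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u, v))))
    return math.degrees(math.acos(dot))

def triplet_features(sses):
    """Yield one 6-d vector (3 angles, 3 midpoint distances) per SSE triplet."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    for i, j, k in combinations(range(len(sses)), 3):
        trio = [sses[i], sses[j], sses[k]]
        dirs = [_direction(s) for s in trio]
        mids = [_midpoint(s) for s in trio]
        angles = [_angle(dirs[a], dirs[b]) for a, b in pairs]
        dists = [math.dist(mids[a], mids[b]) for a, b in pairs]
        yield tuple(angles + dists)

def _cell(vec, angle_bin=30.0, dist_bin=5.0):
    """Quantize a feature vector to a grid cell (hypothetical bin widths)."""
    return (tuple(int(v // angle_bin) for v in vec[:3])
            + tuple(int(v // dist_bin) for v in vec[3:]))

def build_index(proteins):
    """Map each grid cell to the set of protein names with a triplet in it."""
    index = defaultdict(set)
    for name, sses in proteins.items():
        for vec in triplet_features(sses):
            index[_cell(vec)].add(name)
    return index

def candidates(index, query_sses, top_k=5):
    """Rank database proteins by how many query triplets hit their cells."""
    votes = Counter()
    for vec in triplet_features(query_sses):
        for name in index.get(_cell(vec), ()):
            votes[name] += 1
    return votes.most_common(top_k)

# Toy data: coordinates are made up purely for illustration.
proteins = {
    "A": [((0, 0, 0), (10, 0, 0)), ((0, 5, 0), (0, 15, 0)),
          ((0, 0, 5), (0, 0, 15)), ((5, 5, 5), (15, 15, 15))],
    "B": [((0, 0, 0), (10, 0, 0)), ((0, 50, 0), (10, 50, 0)),
          ((0, 100, 0), (10, 100, 0))],
}
index = build_index(proteins)
shortlist = candidates(index, proteins["A"])
```

In this sketch only the proteins in `shortlist` would be passed on to the expensive pairwise alignment. A fixed grid can miss near matches that fall just across a cell boundary, which is one reason a real system would use range queries on a proper multidimensional index instead.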
Pages: 148-158
Number of pages: 11