Towards index-based similarity search for protein structure databases

Cited by: 12
Authors
Çamoglu, O [1]
Kahveci, T [1]
Singh, AK [1]
Affiliations
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Keywords
protein structures; feature vectors; indexing; dataset join;
DOI
10.1109/CSB.2003.1227314
Chinese Library Classification
Q [Biological Sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
We propose two methods for finding similarities in protein structure databases. Our techniques extract feature vectors on triplets of SSEs (Secondary Structure Elements) of proteins. These feature vectors are then indexed using a multidimensional index structure. Our first technique considers the problem of finding proteins similar to a given query protein in a protein dataset. This technique quickly finds promising proteins using the index structure. These proteins are then aligned to the query protein using a popular pairwise alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique considers the problem of joining two protein datasets to find an all-to-all similarity. Experimental results show that our techniques improve the pruning time of VAST 3 to 3.5 times while keeping the sensitivity similar.
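The filtering step described in the abstract (feature vectors on SSE triplets, a multidimensional index, then a shortlist of promising proteins) can be illustrated with a minimal Python sketch. The abstract does not state which features or which index structure the authors use, so everything here is an assumption for illustration: each SSE is modelled as a 3D line segment, the hypothetical `triplet_features` function builds a 6-dimensional vector from midpoint distances and inter-axis angles, and a simple hash grid (`GridIndex`, also hypothetical) stands in for the multidimensional index. The final pairwise alignment with a tool such as VAST is not shown.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

# Assumption: an SSE is a 3D line segment given as (start_point, end_point).

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _norm(v):
    return math.sqrt(sum(x * x for x in v))

def _mid(sse):
    return tuple((a + b) / 2 for a, b in zip(*sse))

def _angle(u, v):
    """Angle in radians between two direction vectors."""
    cos = sum(x * y for x, y in zip(u, v)) / (_norm(u) * _norm(v))
    return math.acos(max(-1.0, min(1.0, cos)))

def triplet_features(s1, s2, s3):
    """Hypothetical 6-D feature: 3 midpoint distances + 3 inter-axis angles."""
    pairs = list(combinations((s1, s2, s3), 2))
    dists = [_norm(_sub(_mid(a), _mid(b))) for a, b in pairs]
    angles = [_angle(_sub(a[1], a[0]), _sub(b[1], b[0])) for a, b in pairs]
    return tuple(dists + angles)

class GridIndex:
    """Hash grid standing in for a multidimensional index (an assumption)."""

    def __init__(self, dist_cell=2.0, angle_cell=0.35):
        self.cells = defaultdict(set)
        self.dist_cell = dist_cell
        self.angle_cell = angle_cell

    def _key(self, feat):
        # Quantize distances and angles with separate cell sizes.
        return (tuple(int(f // self.dist_cell) for f in feat[:3])
                + tuple(int(f // self.angle_cell) for f in feat[3:]))

    def add_protein(self, name, sses):
        for trip in combinations(sses, 3):
            self.cells[self._key(triplet_features(*trip))].add(name)

    def candidates(self, query_sses, top_k=5):
        """Rank proteins by how many query triplets land in one of their cells."""
        votes = Counter()
        for trip in combinations(query_sses, 3):
            for name in self.cells.get(self._key(triplet_features(*trip)), ()):
                votes[name] += 1
        return votes.most_common(top_k)

# Toy data (made up): protein "A", an unrelated protein "B",
# and a query that is "A" translated rigidly in space.
protein_a = [((0, 0, 0), (4, 0, 0)), ((0, 3, 0), (0, 3, 5)),
             ((6, 6, 0), (6, 10, 0)), ((2, 8, 3), (6, 8, 3))]
protein_b = [((0, 0, 0), (0, 0, 4)), ((20, 0, 0), (20, 0, 4)),
             ((40, 0, 0), (40, 0, 4))]
shift = (10, -5, 2)
query = [tuple(tuple(c + d for c, d in zip(p, shift)) for p in sse)
         for sse in protein_a]

index = GridIndex()
index.add_protein("A", protein_a)
index.add_protein("B", protein_b)
hits = index.candidates(query)
```

Because the assumed features depend only on distances and angles, they are unchanged by translating or rotating a structure, which is why the shifted query still votes for "A". A same-cell lookup misses near matches that fall just across a cell boundary; a real multidimensional index would answer range queries instead.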
Pages: 148-158
Page count: 11
Related papers
50 in total
  • [21] Hybrid index-based image search from the web
    Gupta, Rahul
    Ghosh, S. K.
    Sural, Shamik
    Pramanik, Sakti
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2011, 3 (03) : 252 - 276
  • [22] Fast similarity search in three-dimensional structure databases
    Wang, X
    Wang, JTL
    JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES, 2000, 40 (02): : 442 - 451
  • [23] Similarity search in trajectory Databases
    Pelekis, Nikos
    Kopanakis, Ioannis
    Marketos, Gerasimos
    Ntoutsi, Irene
    Andrienko, Gennady
    Theodoridis, Yannis
    TIME 2007: 14TH INTERNATIONAL SYMPOSIUM ON TEMPORAL REPRESENTATION AND REASONING, PROCEEDINGS, 2007, : 129 - +
  • [24] Similarity search in multimedia databases
    Keim, DA
    Bustos, B
    20TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2004, : 873 - 873
  • [25] Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns
    Fernando Meyer
    Stefan Kurtz
    Michael Beckstette
    BMC Bioinformatics, 14
  • [26] Fast online and index-based algorithms for approximate search of RNA sequence-structure patterns
    Meyer, Fernando
    Kurtz, Stefan
    Beckstette, Michael
    BMC BIOINFORMATICS, 2013, 14
  • [27] Index-Based Densest Clique Percolation Community Search in Networks
    Yuan, Long
    Qin, Lu
    Zhang, Wenjie
    Chang, Lijun
    Yang, Jianye
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2018, 30 (05) : 922 - 935
  • [28] Index-based Densest Clique Percolation Community Search in Networks
    Yuan, Long
    Qin, Lu
    Zhang, Wenjie
    Chang, Lijun
    Yang, Jianye
    2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019, : 2161 - 2162
  • [29] Index-based fast search algorithm of image database on internet
    Yeh, CH
    Kuo, CJ
    2000 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, PROCEEDINGS VOLS I-III, 2000, : 1195 - 1198
  • [30] COIN: Correlation Index-Based Similarity Measure for Clustering Categorical Data
    Sowmiya, N.
    Gupta, N.Srinivasa
    Natarajan, Elango
    Valarmathi, B.
    Elamvazuthi, I.
    Parasuraman, S.
    Kit, Chun Ang
    Freitas, Lídio Inácio
    Abraham Gnanamuthu, Ezra Morris
    Mathematical Problems in Engineering, 2022, 2022