Towards index-based similarity search for protein structure databases

Cited by: 12
Authors
Çamoglu, O [1]
Kahveci, T [1]
Singh, AK [1]
Affiliation
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Keywords
protein structures; feature vectors; indexing; dataset join
DOI
10.1109/CSB.2003.1227314
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We propose two methods for finding similarities in protein structure databases. Our techniques extract feature vectors on triplets of SSEs (Secondary Structure Elements) of proteins. These feature vectors are then indexed using a multidimensional index structure. Our first technique considers the problem of finding proteins similar to a given query protein in a protein dataset. This technique quickly finds promising proteins using the index structure. These proteins are then aligned to the query protein using a popular pairwise alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique considers the problem of joining two protein datasets to find an all-to-all similarity. Experimental results show that our techniques improve the pruning time of VAST by a factor of 3 to 3.5 while keeping the sensitivity similar.
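The filtering step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: each SSE is modelled as a 3D line segment, the feature vector of a triplet is taken to be the three sorted distances between segment midpoints, and a plain grid hash stands in for the multidimensional index structure. The function names (`triplet_features`, `build_index`, `candidates`) are hypothetical, and the final alignment with a tool such as VAST is not shown.

```python
from collections import Counter, defaultdict
from itertools import combinations
import math


def _midpoint(sse):
    """Midpoint of an SSE given as a pair of 3D endpoints."""
    (x1, y1, z1), (x2, y2, z2) = sse
    return ((x1 + x2) / 2, (y1 + y2) / 2, (z1 + z2) / 2)


def triplet_features(sses):
    """Yield one feature vector per SSE triplet.

    Assumed feature: the three pairwise midpoint distances, sorted so the
    vector does not depend on the order of the SSEs within the triplet.
    """
    mids = [_midpoint(s) for s in sses]
    for a, b, c in combinations(mids, 3):
        yield tuple(sorted((math.dist(a, b), math.dist(a, c), math.dist(b, c))))


def _cell(vec, cell_size):
    """Map a feature vector to its grid cell (the stand-in index key)."""
    return tuple(int(v // cell_size) for v in vec)


def build_index(proteins, cell_size=2.0):
    """Index every triplet feature vector of every protein by grid cell."""
    index = defaultdict(set)
    for name, sses in proteins.items():
        for vec in triplet_features(sses):
            index[_cell(vec, cell_size)].add(name)
    return index


def candidates(index, query_sses, cell_size=2.0, top_k=5):
    """Rank proteins by how many query triplets fall in a cell they occupy.

    Only the exact cell is probed; a real index would also search
    neighbouring regions to tolerate small geometric differences.
    """
    votes = Counter()
    for vec in triplet_features(query_sses):
        for name in index.get(_cell(vec, cell_size), ()):
            votes[name] += 1
    return [name for name, _ in votes.most_common(top_k)]
```

In this sketch the proteins returned by `candidates` would be the "promising proteins" passed on to pairwise alignment. Because the assumed features depend only on distances, a query that is a rigid translation of an indexed protein lands in the same cells.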
Pages: 148-158
Page count: 11