Protein sequence similarity searches using patterns as seeds

Cited by: 236
Authors
Zhang, Z
Schaffer, AA
Miller, W
Madden, TL
Lipman, DJ
Koonin, EV
Altschul, SF [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
[2] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[3] Natl Human Genome Res Inst, Inherited Dis Res Branch, NIH, Baltimore, MD 21224 USA
DOI
10.1093/nar/26.17.3986
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein families are often characterized by conserved sequence patterns or motifs. A researcher frequently wishes to evaluate the significance of a specific pattern within a protein, or to exploit knowledge of known motifs to aid the recognition of greatly diverged but homologous family members. To assist in these efforts, the pattern-hit initiated BLAST (PHI-BLAST) program described here takes as input both a protein sequence and a pattern of interest that it contains. PHI-BLAST searches a protein database for other instances of the input pattern, and uses those found as seeds for the construction of local alignments to the query sequence. The random distribution of PHI-BLAST alignment scores is studied analytically and empirically. In many instances, the program is able to detect statistically significant similarity between homologous proteins that are not recognizably related using traditional single-pass database search methods. PHI-BLAST is applied to the analysis of CED4-like cell death regulators, HS90-type ATPase domains, archaeal tRNA nucleotidyltransferases and archaeal homologs of DnaG-type DNA primases.
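The seeding step described in the abstract (locating other instances of the input pattern in database sequences) can be illustrated with a minimal Python sketch. This is not the PHI-BLAST implementation: the function names `prosite_to_regex` and `find_pattern_seeds`, the toy sequences, and the simplified PROSITE-style pattern grammar are all assumptions made for illustration, and the later stages (extending each seed into a local alignment and assessing its statistical significance) are not covered.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simplified PROSITE-style pattern into a Python regex.

    Supported elements (an assumed subset of the syntax): a single residue
    letter, 'x' for any residue, [ABC] for allowed residues, {ABC} for
    forbidden residues, and an optional repeat suffix (n) or (n,m).
    """
    parts = []
    for element in pattern.strip(".").split("-"):
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"   # forbidden residues
        if lo is not None:
            repeat = lo if hi is None else lo + "," + hi
            core += "{" + repeat + "}"
        parts.append(core)
    return "".join(parts)


def find_pattern_seeds(pattern: str, database: dict) -> list:
    """Return (name, start, end) for every pattern occurrence in the database.

    Coordinates are 0-based and half-open. A lookahead is used so that
    occurrences starting at different positions are all reported, even
    when they overlap.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    seeds = []
    for name, sequence in database.items():
        for m in regex.finditer(sequence):
            seeds.append((name, m.start(1), m.end(1)))
    return seeds


if __name__ == "__main__":
    # Toy database; the sequences are invented for this example.
    toy_db = {"seq1": "MKGSPAAGKTLL", "seq2": "MKLLVVAAPP"}
    # P-loop style motif written in PROSITE-like notation.
    print(find_pattern_seeds("[AG]-x(4)-G-K-[ST]", toy_db))
```

In this sketch each reported tuple plays the role of a seed: a position in a database sequence where the pattern occurs, from which an alignment to the query could then be built around the matched region.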
Pages: 3986-3990
Page count: 5