A structural study for the optimisation of functional motifs encoded in protein sequences

Cited by: 6
Authors
Via, A [1]
Helmer-Citterich, M [1]
Affiliations
[1] Univ Roma Tor Vergata, Dept Biol, Ctr Mol Bioinformat, I-00173 Rome, Italy
DOI
10.1186/1471-2105-5-50
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: Many PROSITE patterns select false positives and/or miss known true positives. In at least some cases, the weak specificity and/or sensitivity of a pattern may arise because one or more key functional and/or structural residues are not represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues that are structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed by a standard pattern construction procedure.
Results: Here we present a new procedure aimed at improving the sensitivity and/or specificity of poorly performing patterns. The procedure can be summarised as follows:
1. Residues structurally conserved in different proteins that are true positives for a pattern are identified by means of a computational technique and by visual inspection.
2. The sequence positions of the structurally conserved residues falling outside the pattern are used to build extended sequence patterns.
3. The extended patterns are optimised on the SWISS-PROT database for their sensitivity and specificity.
The method was applied to eight PROSITE patterns. Whenever structurally conserved residues are found in the surface region close to the pattern (seven out of eight cases), the addition of information inferred from structural analysis is shown to improve pattern selectivity and, in some cases, both selectivity and sensitivity. In some of the cases considered, the procedure allowed the identification of functionally interesting residues, whose biological role is also discussed.
Conclusion: Our method can be applied to any type of functional motif or pattern (not only PROSITE ones) that is unable to select all and only the true positive hits and for which at least two true positive structures are available. The computational technique for the identification of structurally conserved residues is already available on request and will soon be accessible on our web server. The procedure is intended for pattern database curators and for scientists interested in a specific protein family for which no specific or selective patterns are yet available.
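As a rough illustration of step 3 in the abstract (scoring a pattern and an extended variant against labelled sequences), the sketch below converts a PROSITE-style pattern into a regular expression and computes sensitivity and specificity. It is not the authors' implementation. The function names, the toy patterns, and the four toy sequences are all hypothetical. The sketch uses the textbook definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), which may differ from the "selectivity" measure used in the paper. Only the basic pattern syntax is handled: `x`, `[...]`, `{...}`, `(n)` and `(n,m)` repeats, and the `<` / `>` anchors.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # C-terminal anchor
            suffix, element = "$", element[:-1]
        # Split "core(n)" or "core(n,m)" into the core and its repeat bounds.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.group(1), match.group(2), match.group(3)
        if core == "x":                      # any residue
            core_re = "."
        elif core.startswith("{"):           # forbidden residues
            core_re = "[^" + core[1:-1] + "]"
        else:                                # single residue or [allowed set]
            core_re = core
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core_re + repeat + suffix)
    return "".join(parts)


def evaluate(pattern: str, labelled):
    """Return (sensitivity, specificity) of a pattern on (sequence, is_member) pairs."""
    regex = re.compile(prosite_to_regex(pattern))
    tp = fp = fn = tn = 0
    for sequence, is_member in labelled:
        hit = regex.search(sequence) is not None
        if hit and is_member:
            tp += 1
        elif hit and not is_member:
            fp += 1
        elif not hit and is_member:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Hypothetical toy data: (sequence, is a true family member).
TOY_DATA = [
    ("AACGGSAAAHKK", True),
    ("MKCLLTPPPHGG", True),
    ("CAASGGGGGGGG", False),   # matches the base pattern only
    ("MKLVVAGGEEDD", False),
]

BASE_PATTERN = "C-x(2)-[ST]."               # hypothetical short pattern
EXTENDED_PATTERN = "C-x(2)-[ST]-x(3)-H."    # same pattern plus one extra conserved position

if __name__ == "__main__":
    for name, pattern in [("base", BASE_PATTERN), ("extended", EXTENDED_PATTERN)]:
        print(name, prosite_to_regex(pattern), evaluate(pattern, TOY_DATA))
```

On this toy set the extended pattern rejects the one false positive that the base pattern accepts, while still matching both true members. In a real setting the labelled set would come from a curated database such as SWISS-PROT, and the extra positions would come from the structural analysis rather than being chosen by hand.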
Pages: 12