A unified statistical model of protein multiple sequence alignment integrating direct coupling and insertions

Cited by: 5
Authors
Kinjo, Akira R. [1]
Affiliations
[1] Osaka Univ, Inst Prot Res, 3-2 Yamadaoka, Suita, Osaka 5650871, Japan
Keywords
long-range interactions; short-range interactions; molecular evolution; protein structure; sequence conservation
DOI
10.2142/biophysico.13.0_45
Chinese Library Classification (CLC) number
Q6 [Biophysics]
Discipline classification code
071011
Abstract
The multiple sequence alignment (MSA) of a protein family provides a wealth of information in terms of the conservation pattern of amino acid residues not only at each alignment site but also between distant sites. In order to statistically model the MSA incorporating both short-range and long-range correlations as well as insertions, I have derived a lattice gas model of the MSA based on the principle of maximum entropy. The partition function, obtained by the transfer matrix method with a mean-field approximation, accounts for all possible alignments with all possible sequences. The model parameters for short-range and long-range interactions were determined by a self-consistent condition and by a Gaussian approximation, respectively. Using this model with and without long-range interactions, I analyzed the globin and V-set domains by increasing the "temperature" and by "mutating" a site. The correlations between residue conservation and various measures of the system's stability indicate that the long-range interactions make the conservation pattern more specific to the structure, and increasingly stabilize better conserved residues.
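The transfer matrix method named in the abstract can be illustrated on a much simpler system. The sketch below is not the paper's lattice gas model: it assumes a toy one-dimensional chain of `L` sites with `q` states each, with site scores `h` and nearest-neighbor scores `J` only (no insertions, no long-range terms, no mean-field step). All names (`partition_function_transfer`, `partition_function_brute`, `h`, `J`, `beta`) are hypothetical and chosen for illustration. It shows how a sum over all `q**L` sequences collapses into a product of `q x q` matrices, and checks the result against brute-force enumeration.

```python
import itertools
import numpy as np

def partition_function_transfer(h, J, beta=1.0):
    """Z = sum over sequences s of exp(beta * score(s)), where
    score(s) = sum_i h[i, s_i] + sum_i J[i, s_i, s_{i+1}].
    h has shape (L, q); J has shape (L-1, q, q). Toy model, assumed for illustration."""
    v = np.exp(beta * h[0])                    # weights of the first site
    for i in range(1, h.shape[0]):
        # Transfer matrix from site i-1 to site i: coupling plus the field at site i.
        T = np.exp(beta * (J[i - 1] + h[i][None, :]))
        v = v @ T                              # sum out the state of site i-1
    return float(v.sum())

def partition_function_brute(h, J, beta=1.0):
    """Same quantity by explicit enumeration of all q**L sequences."""
    L, q = h.shape
    Z = 0.0
    for s in itertools.product(range(q), repeat=L):
        score = sum(h[i, s[i]] for i in range(L))
        score += sum(J[i, s[i], s[i + 1]] for i in range(L - 1))
        Z += np.exp(beta * score)
    return float(Z)

rng = np.random.default_rng(0)
L, q = 5, 4
h = rng.normal(size=(L, q))
J = rng.normal(size=(L - 1, q, q))
Z_tm = partition_function_transfer(h, J)
Z_bf = partition_function_brute(h, J)
```

In this toy setting, lowering `beta` plays the role of raising the "temperature": the weights flatten and `Z` approaches the plain count of sequences, `q**L`.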
Pages: 45-62
Number of pages: 18
Related papers
50 in total
  • [1] Integrating protein secondary structure prediction and multiple sequence alignment
    Simossis, VA
    Heringa, J
    [J]. CURRENT PROTEIN & PEPTIDE SCIENCE, 2004, 5 (04) : 249 - 266
  • [2] Multiple protein sequence alignment
    Pei, Jimin
    [J]. CURRENT OPINION IN STRUCTURAL BIOLOGY, 2008, 18 (03) : 382 - 386
  • [3] Evaluating Statistical Multiple Sequence Alignment in Comparison to Other Alignment Methods on Protein Data Sets
    Nute, Michael
    Saleh, Ehsan
    Warnow, Tandy
    [J]. SYSTEMATIC BIOLOGY, 2019, 68 (03) : 396 - 411
  • [4] Influence of multiple-sequence-alignment depth on Potts statistical models of protein covariation
    Haldane, Allan
    Levy, Ronald M.
    [J]. PHYSICAL REVIEW E, 2019, 99 (03)
  • [5] Scaling statistical multiple sequence alignment to large datasets
    Nute, Michael
    Warnow, Tandy
    [J]. BMC GENOMICS, 2016, 17
  • [6] Scaling statistical multiple sequence alignment to large datasets
    Michael Nute
    Tandy Warnow
    [J]. BMC Genomics, 17
  • [7] Small-coupling expansion for multiple sequence alignment
    Budzynski, Louise
    Pagnani, Andrea
    [J]. PHYSICAL REVIEW E, 2023, 107 (04)
  • [8] Using Multiple Sequence Alignment and Statistical Language Model to Integrate Multiple Chinese Address Recognition Outputs
    Chen, Shengchang
    Lu, Shujing
    Wen, Ying
    Lu, Yue
    [J]. 2015 13TH IAPR INTERNATIONAL CONFERENCE ON DOCUMENT ANALYSIS AND RECOGNITION (ICDAR), 2015, : 151 - 155
  • [9] IMPROVEMENTS TO A MULTIPLE PROTEIN SEQUENCE ALIGNMENT TOOL
    Almeida, Andre Atanasio M.
    Dias, Zanoni
    [J]. BIOINFORMATICS: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON BIOINFORMATICS MODELS, METHODS AND ALGORITHMS, 2012, : 226 - 233
  • [10] Liquid-theory analogy of direct-coupling analysis of multiple-sequence alignment and its implications for protein structure prediction
    Kinjo, Akira R.
    [J]. BIOPHYSICS AND PHYSICOBIOLOGY, 2015, 12 : 117 - 119