More than just pattern recognition: Prediction of uncommon protein structure features by AI methods

Cited by: 4
Authors
Herzberg, Osnat [1, 2]
Moult, John [1, 3]
Affiliations
[1] Univ Maryland, Inst Biosci & Biotechnol Res, Rockville, MD 20850 USA
[2] Univ Maryland, Chem & Biochem Dept, Chem Bldg, College Pk, MD 20742 USA
[3] Univ Maryland, Dept Cell Biol & Mol Genet, Microbiol Bldg, College Pk, MD 20742 USA
Keywords
CASP14; AlphaFold2; AI; structure analysis; SECONDARY STRUCTURE; HELIX; CONFORMATIONS; RESIDUES; BULGES
DOI
10.1073/pnas.2221745120
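Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09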
Abstract
The CASP14 experiment demonstrated the extraordinary structure modeling capabilities of artificial intelligence (AI) methods. That result has ignited a fierce debate about what these methods are actually doing. One of the criticisms has been that the AI does not have any sense of the underlying physics but is merely performing pattern recognition. Here, we address that issue by analyzing the extent to which the methods identify rare structural motifs. The rationale underlying the approach is that a pattern recognition machine tends to choose the more frequently occurring motifs, whereas some sense of subtle energetic factors is required to choose infrequently occurring ones. To reduce the possibility of bias from related experimental structures and to minimize the effect of experimental errors, we examined only CASP14 target protein crystal structures determined to a resolution limit better than 2 Å, which lacked significant amino acid sequence homology to proteins of known structure. In those experimental structures and in the corresponding models, we track cis peptides, π-helices, 3₁₀-helices, and other small 3D motifs that occur in the PDB database at a frequency of lower than 1% of total amino acid residues. The best-performing AI method, AlphaFold2, captured these uncommon structural elements exquisitely well. All discrepancies appeared to be a consequence of crystal environment effects. We propose that the neural network learned a protein structure potential of mean force, enabling it to correctly identify situations where unusual structural features represent the lowest local free energy because of subtle influences from the atomic environment.
Pages: 8