Latent generative landscapes as maps of functional diversity in protein sequence space

Authors
Cheyenne Ziegler
Jonathan Martin
Claude Sinner
Faruck Morcos
Affiliations
[1] University of Texas at Dallas,Department of Biological Sciences
[2] University of Texas at Dallas,Department of Bioengineering
[3] University of Texas at Dallas,Center for Systems Biology
Abstract
Variational autoencoders are unsupervised learning models with generative capabilities. When applied to protein data, they classify sequences by phylogeny and generate de novo sequences that preserve statistical properties of protein composition. While previous studies focus on clustering and generative features, here we evaluate the underlying latent manifold in which sequence information is embedded. To investigate properties of the latent manifold, we use direct coupling analysis and a Potts Hamiltonian model to construct a latent generative landscape. We show how this landscape captures phylogenetic groupings as well as functional and fitness properties of several systems, including globins, β-lactamases, ion channels, and transcription factors. We provide evidence that the landscape helps explain the effects of sequence variability observed in experimental data and offers insight into directed and natural protein evolution. We propose that combining the generative properties and functional predictive power of variational autoencoders with coevolutionary analysis could benefit applications in protein engineering and design.
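The landscape construction named in the abstract can be illustrated with a minimal sketch. It assumes the common Potts form H(s) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j), and it assumes the landscape is obtained by decoding each point of a 2D latent grid to its most probable sequence and scoring that sequence; the abstract does not spell out these details. The names `potts_hamiltonian`, `latent_landscape`, and `toy_decode` are hypothetical, and the fields, couplings, and decoder weights are random placeholders rather than parameters inferred by direct coupling analysis or a trained variational autoencoder.

```python
import numpy as np


def potts_hamiltonian(seq, h, J):
    """Potts energy H(s) = -sum_i h[i, s_i] - sum_{i<j} J[i, j, s_i, s_j].

    seq: integer-encoded sequence of length L (values in 0..q-1)
    h:   local fields, shape (L, q)
    J:   pairwise couplings, shape (L, L, q, q)
    """
    seq = np.asarray(seq)
    pos = np.arange(seq.shape[0])
    field_term = h[pos, seq].sum()
    # pair[i, j] = J[i, j, s_i, s_j] for every pair of positions
    pair = J[pos[:, None], pos[None, :], seq[:, None], seq[None, :]]
    coupling_term = np.triu(pair, k=1).sum()  # keep only i < j
    return -(field_term + coupling_term)


def latent_landscape(decode, h, J, xs, ys):
    """Score the most probable decoded sequence at each 2D latent grid point."""
    energies = np.empty((len(ys), len(xs)))
    for r, y in enumerate(ys):
        for c, x in enumerate(xs):
            probs = decode(np.array([x, y]))  # per-site probabilities, (L, q)
            energies[r, c] = potts_hamiltonian(probs.argmax(axis=1), h, J)
    return energies


# Toy setup: random placeholders standing in for inferred/trained parameters.
rng = np.random.default_rng(0)
L, q = 6, 21  # sequence length; 20 amino acids plus a gap symbol
h = rng.normal(size=(L, q))
J = rng.normal(scale=0.1, size=(L, L, q, q))
W = rng.normal(size=(2, L * q))  # hypothetical linear "decoder" weights


def toy_decode(z):
    """Hypothetical stand-in for a VAE decoder: linear map plus softmax."""
    logits = (z @ W).reshape(L, q)
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


xs = np.linspace(-2, 2, 5)
ys = np.linspace(-2, 2, 4)
energies = latent_landscape(toy_decode, h, J, xs, ys)
print(energies.shape)  # (4, 5)
```

Under this sign convention, lower energies mark latent regions whose decoded sequences are more compatible with the coupling model, so plotting `energies` over the grid gives a landscape with basins and barriers.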
Related papers
50 in total
  • [1] Latent generative landscapes as maps of functional diversity in protein sequence space
    Ziegler, Cheyenne
    Martin, Jonathan
    Sinner, Claude
    Morcos, Faruck
    NATURE COMMUNICATIONS, 2023, 14 (01)
  • [2] GENERALIST: A latent space based generative model for protein sequence families
    Akl, Hoda
    Emison, Brooke
    Zhao, Xiaochuan
    Mondal, Arup
    Perez, Alberto
    Dixit, Purushottam D.
    PLOS COMPUTATIONAL BIOLOGY, 2023, 19 (11)
  • [3] Deciphering protein evolution and fitness landscapes with latent space models
    Ding, Xinqiang
    Zou, Zhengting
    Brooks, Charles L., III
    NATURE COMMUNICATIONS, 2019, 10 (1)
  • [4] Deciphering protein evolution and fitness landscapes with latent space models
    Xinqiang Ding
    Zhengting Zou
    Charles L. Brooks III
    Nature Communications, 10
  • [5] Exploring the Protein Sequence Space with Global Generative Models
    Romero-Romero, Sergio
    Lindner, Sebastian
    Ferruz, Noelia
    COLD SPRING HARBOR PERSPECTIVES IN BIOLOGY, 2023, 15 (11)
  • [6] EXPERIMENTAL SKETCH OF LANDSCAPES IN PROTEIN-SEQUENCE SPACE
    TRAKULNALEAMSAI, S
    YOMO, T
    YOSHIKAWA, M
    AIHARA, S
    URABE, I
    JOURNAL OF FERMENTATION AND BIOENGINEERING, 1995, 79 (02): 107-118
  • [7] Compression of functional space in HLA-A sequence diversity
    Zhao, B
    Png, AEH
    Ren, EC
    Kolatkar, PR
    Mathura, VS
    Sakharkar, MK
    Kangueane, P
    HUMAN IMMUNOLOGY, 2003, 64 (07): 718-728
  • [8] Optimizing the Latent Space of Generative Networks
    Bojanowski, Piotr
    Joulin, Armand
    Paz, David Lopez
    Szlam, Arthur
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [9] Comparing the latent space of generative models
    Asperti, Andrea
    Tonelli, Valerio
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (04): 3155-3172
  • [10] Comparing the latent space of generative models
    Andrea Asperti
    Valerio Tonelli
    Neural Computing and Applications, 2023, 35: 3155-3172